Entries |
Document | Title | Date |
20080208482 | Display Producing System - A system for producing a display of the DNA identity of an individual comprises means for inputting a plurality of numbers and/or letters obtained through an individual's DNA analysis which are representative of a plurality of DNA loci; means for correlating inputted numbers and/or letters with predetermined colours; and means for producing a display with an array of coloured regions representative of each DNA locus and positioning said coloured regions in the same location of the display for each individual for which a display is produced. | 08-28-2008 |
20080215250 | Molecular toxicology modeling - The present invention is based on the elucidation of the global changes in gene expression and the identification of toxicity markers in tissues or cells exposed to a known toxin. The genes may be used as toxicity markers in drug screening and toxicity assays. The invention includes a database of genes characterized by toxin-induced differential expression that is designed for use with microarrays and other solid-phase probes. | 09-04-2008 |
20080215251 | Computational method for choosing nucleotide sequences to specifically silence genes - A method for identifying subsequences in a polynucleotide sequence for specifically silencing a target gene is provided. The method is described for identifying sequences effective in silencing a target gene or a series of genes, but not others. Subsequences can be identified and scored using comparisons based on percent sequence identity with respect to a target reference sequence and siRNA algorithm analysis. The resulting subsequences may be ranked based on score, percent sequence identity. The identification of subsequences may be performed using a sliding window to identify all subsequences of a set length within the sequence. A user interface may be provided for displaying the results to a user. | 09-04-2008 |
20080215252 | Method of Determining Base Sequence of Nucleic Acid and Apparatus Therefor - In a preferred embodiment, an exploring needle of a probe | 09-04-2008 |
20080228410 | Genetic attribute analysis - A bioinformatics method, software, database and system for genetic attribute analysis are presented in which non-identical sets of genetic attributes comprising nucleotide sequences are compared to determine whether proteins encoded by those nucleotide sequences are functionally equivalent and, therefore, whether genetic information contained in the sets of genetic attributes can be considered equivalent. Sets of genetic attributes are determined to be equivalent based on whether they are able to satisfy one or more predetermined equivalence rules for comparing non-identical protein-encoding nucleotide sequences. A determination of equivalence between sets of genetic attributes can enable the compression of thousands of individual DNA nucleotide attributes into a single categorical attribute, as well as enable determinations of co-association of attributes, predisposition prediction and predisposition modification of individuals. | 09-18-2008 |
20080243396 | Systems and methods for sub-genomic region specific comparative genome hybridization probe selection - Systems and methods for using the same to select one or more comparative genome hybridization (CGH) probes specific for a sub-genomic region of interest are provided. Also provided are computer program products for executing the subject methods. | 10-02-2008 |
20080243397 | SOFTWARE FOR DESIGN AND VERIFICATION OF SYNTHETIC GENETIC CONSTRUCTS - The present invention provides methods for designing and verifying nucleic acid molecules having one or more desired properties. The methods are typically encoded into software, and typically include use of databases and algorithms to determine if nucleic acid molecules designed to have various elements in functional relationships have the intended properties. The result is achieved by determining if the various elements of the designed nucleic acid are in the correct order and physical relationship to other elements, and that the proper elements are selected. Computer systems for implementing the method, as well as business methods for reaping monetary gain from use of the methods, are also disclosed. | 10-02-2008 |
20080243398 | System and method for cleaning noisy genetic data and determining chromosome copy number - Disclosed herein is a system and method for increasing the fidelity of measured genetic data, for making allele calls, and for determining the state of aneuploidy, in one or a small set of cells, or from fragmentary DNA, where a limited quantity of genetic data is available. Genetic material from the target individual is acquired, amplified and the genetic data is measured using known methods. Poorly or incorrectly measured base pairs, missing alleles and missing regions are reconstructed using expected similarities between the target genome and the genome of genetically related individuals. In accordance with one embodiment of the invention, incomplete genetic data from an embryonic cell are reconstructed at a plurality of loci using the more complete genetic data from a larger sample of diploid cells from one or both parents, with or without haploid genetic data from one or both parents. In another embodiment of the invention, the chromosome copy number can be determined from the measured genetic data of a single or small number of cells, with or without genetic information from one or both parents. In another embodiment of the invention, these determinations are made for the purpose of embryo selection in the context of in-vitro fertilization. In another embodiment of the invention, the genetic data can be reconstructed for the purposes of making phenotypic predictions. | 10-02-2008 |
20080255767 | Method and Device For Detection of Splice Form and Alternative Splice Forms in Dna or Rna Sequences - The invention relates to a method and a device for detection of splice sites in DNA or RNA sequences comprising three steps: a) examining a training set of sequences comprising DNA or RNA sequences with known splice sites by an automated, discriminative training device for detecting splicing patterns, especially in a predetermined window around the known splice sites; b) scanning a sequence comprising DNA or RNA sequences containing unknown splice sites for the occurrence of the splicing patterns detected in step a); and c) calculation of a cumulative splice score in dependence of a maximization of the margin between the true splice forms and all wrong splice forms in the sequence. The invention also relates to a method and a device for detection of splice forms and alternative splice forms in DNA or RNA sequences. | 10-16-2008 |
20080255768 | METHODS OF DETERMINING RELATIVE GENETIC LIKELIHOODS OF AN INDIVIDUAL MATCHING A POPULATION - Provided are methods of determining an individual's relative likelihood of having a genetic match with one or more local populations as compared to a generic index population. Also provided are systems, apparatuses, kits, and machine-readable media relating to such methods. The methods may be used, for example, to identify an individual's or individual's ancestor's most likely geographic origin, or to identify the breed, species, kingdom, etc. of an organism. | 10-16-2008 |
20080262747 | NUCLEIC ACID SEQUENCING SYSTEM AND METHOD - A technique for sequencing nucleic acids in an automated or semi-automated manner is disclosed. Sample arrays of a multitude of nucleic acid sites are processed in multiple cycles to add nucleotides to the material to be sequenced, detect the nucleotides added to sites, and to de-block the added nucleotides of blocking agents and tags used to identify the last added nucleotide. Multiple parameters of the system are monitored to enable diagnosis and correction of problems as they occur during sequencing of the samples. Quality control routines are run during sequencing to determine quality of samples, and quality of the data collected. | 10-23-2008 |
20080270041 | SYSTEM AND METHOD FOR BROAD-BASED MULTIPLE SCLEROSIS ASSOCIATION GENE TRANSCRIPT TEST - Broad-based gene association transcript test for multiple sclerosis and data structure. Multiple sclerosis considerations for this unique test include a custom set of genetic sequences associated in peer-reviewed literature with various known multiple sclerosis related to exposure to toxic substances. Such multiple sclerosis symptoms include specific genetic expressions linked to symptoms of the disease. The base dataset may be developed through clinical samples obtained by third-parties. Online access of real-time phenotype/genotype associative testing for physicians and patients may be promoted through an analysis of a customized microarray testing service. | 10-30-2008 |
20080275652 | GENE-BASED ALGORITHMIC CANCER PROGNOSIS - Gene-Based Algorithmic Cancer Prognosis relates to methods and systems for prognosis determination in tumor samples. The methods and systems measure gene expression in a tumor sample and apply a gene-expression grade index (GGI) or a relapse score (RS) to yield a numeric risk score. | 11-06-2008 |
20080281530 | GENOMIC DATA PROCESSING UTILIZING CORRELATION ANALYSIS OF NUCLEOTIDE LOCI - Processing of genomic data is provided utilizing correlation analysis of first and second nucleotide loci employing a selected comparison type and value. The comparison type is either intersection or proximity type, and the comparison value is either a number (n) of nucleotide positions, wherein n≧1, or a percent number (pn) of nucleotide positions, wherein pn≧0, to be employed in comparing the loci. When intersection type is selected, correlation is defined by the loci overlapping with at least the number (n) of nucleotide positions in common, or by the loci overlapping with at least the percent number (pn) of nucleotide positions in common relative to a smaller one of the first and second loci, or when proximity type is selected, correlation is defined by the first and second loci being within at least the number (n) of nucleotide positions. | 11-13-2008 |
20080281531 | Method for Diagnosing Depression - This invention relates to a method for diagnosing whether or not a subject suffers from depression in a simple manner with high accuracy using the peripheral whole blood sample of the subject. Specifically, the present invention relates to a method for diagnosing depression comprising the steps of: measuring expression levels of 18 genes selected from the group consisting of FASLG, CX3CR1, TBX21, ID2, SLAMF7, PRSS23, YWHAQ, TARDBP, ADRB2, PPP1R8, MMAA, SQLE, PDHA1, HAVCR2, RACGAP1, AHNAK, EDG8, and DUSP5, in peripheral blood isolated from a subject; and determining whether or not the subject suffers from depression based on the expression levels of the 18 genes. | 11-13-2008 |
20080288177 | PHARACOGENETIC METHOD FOR PREDICTION OF THE EFFICACY OF METHOTREXATE MONOTHERAPY IN RECENT-ONSET ARTHRITIS - Pharmacogenetic methods for determining a predicted responsiveness to antifolate therapy for subjects that present with recent-onset undifferentiated arthritis. The methods are based on the determination of a set of clinical parameter values and determining a predicted responsiveness to antifolate therapy by correlating the parameter values with predefined responsiveness values associated with ranges of parameter values. Parameter values that are decisive for responsiveness to antifolate therapy may include polymorphisms in the methylenetetrahydrofolate dehydrogenase (MTHFD1) gene as well as in three genes involved in the adenosine release pathway, the presence or absence of rheumatoid factors, gender, pre- or postmenopausal status and/or smoking status. | 11-20-2008 |
20080288178 | Sequencing system with memory - The present teachings provide a device including a memory. According to various embodiments, the memory is readable, writable, and rewritable. The present teachings further provide processing stations, e.g., for carrying out electrophoresis, PCR, genetic analysis, sample preparation, and/or sample cleanup, etc., that are capable of reading from and/or writing/rewriting to such memory. | 11-20-2008 |
20080288179 | NORMALIZATION OF DATA - Methods for normalizing output from an instrument employing a reference standard or non-fluorescing substance disposed within at least one of a plurality of reaction chambers. The method comprises collecting and analyzing a signal associated with the reference standard or non-fluorescing substance to determine a normalizing bias. The normalizing bias is then applied to the data signal collected from a remainder of the plurality of reaction chambers. | 11-20-2008 |
20080319679 | SYSTEMS AND METHODS FOR ANALYZING MICROARRAYS - The present invention discloses methods and systems for analyzing microarray data. The method includes the general steps of providing microarray data, normalizing the data using a least trimmed squares regression, and then analyzing the normalized microarray data to obtain a desired result such as an expression profile. There is also disclosed a method of subdividing an array into subarrays before normalization. This approach provides a method for improving measurement accuracy and salvaging array data from arrays containing minor defects. Also disclosed is a Probe-Treatment-Reference (PTR) model for streamlining normalization and summarization of microarray data by allowing multiple references. Other aspects of the present invention include computer systems and computer readable media encoding methods of the present invention. | 12-25-2008 |
20090006002 | COMPARATIVE SEQUENCE ANALYSIS PROCESSES AND SYSTEMS - Provided herein are processes for rapidly identifying or determining sequence information in a sample nucleic acid by comparing sample nucleic acid sequence information to reference nucleic acid sequence information or information obtained from reference samples. Also provided are automated systems for conducting comparative sequence analyses. | 01-01-2009 |
20090012719 | SYSTEM AND METHOD FOR IDENTIFICATION OF SYNERGISTIC INTERACTIONS FROM CONTINUOUS DATA - Systems and methods for selecting factors from a continuous data set of measurements are provided. The measurements include values of factors and/or outcomes. Two or more factors that are jointly associated with one or more outcomes from the data set are identified. Each of the two or more factors are analyzed to determine at least one cooperative interaction among the factors with respect to an outcome. The two or more factors can be a module of factors serving as a single factor participating in a cooperative interaction with another factor or module of factors. | 01-08-2009 |
20090012720 | System and Method for Identification of MicroRNA Target Sites and Corresponding Targeting MicroRNA Sequences - A method for determining whether a nucleotide sequence contains a microRNA binding site and which microRNA will bind thereto is provided. For example, in one aspect of the invention, a method for determining whether a nucleotide sequence contains a microRNA binding site and which microRNA sequence will bind thereto is comprised of the following steps. One or more patterns are generated by processing a collection of known mature microRNA sequences. The reverse complement of each generated pattern is then computed. One or more attributes are then assigned to the reverse complement of the one or more generated patterns. The one or more patterns that correspond to a reverse complement having one or more assigned attributes that satisfy at least one criterion are thereafter subselected. Each subselected pattern is then used to analyze the nucleotide sequence, such that a determination is made whether the nucleotide sequence contains a microRNA binding site and which microRNA sequence will bind thereto. | 01-08-2009 |
20090024333 | Methods and systems relating to mitochondrial DNA phenotypes - In one aspect, a system includes, but is not limited to, at least one computer program for use with at least one computer system and wherein the computer program includes a plurality of instructions, including but not limited to, one or more instructions for determining at least one correlation between at least one mitochondrial DNA-influencing event and at least one aspect of mitochondrial DNA phenotype information regarding at least one individual. | 01-22-2009 |
20090024334 | Database for analyzing gene function and method of analyzing gene function by DSPA - A method of constructing a gene function database comprising measuring the cell viabilities, against a plural number of drugs at various concentrations, of transformed eukaryotic cells overexpressing a plural number of function-known genes and parental cell line thereof, calculating the ratios of IC | 01-22-2009 |
20090037117 | Differential Dissociation and Melting Curve Peak Detection - Systems and methods are provided for processing a melting or dissociation curve of a DNA or other sample, for example, during PCR processing. In some embodiments, detection of the melting point and melting curve behavior can be enhanced by taking a derivative of the curve, and detecting peaks in the differential dissociation curve. In some embodiments, the derivative operation can comprise the use of edge-processing, or other detection algorithms. In some embodiments, the dissociation analysis can comprise removing low-frequency (or pedestal) components of the differential dissociation curve. In some embodiments, the differential dissociation curve can exhibit a smoothed or more regular appearance than the raw detected data. | 02-05-2009 |
20090037118 | METHODS FOR PREDICTING THREE-DIMENSIONAL STRUCTURES FOR ALPHA HELICAL MEMBRANE PROTEINS AND THEIR USE IN DESIGN OF SELECTIVE LIGANDS - A method for practical prediction of the three-dimensional structure of α-helical membrane proteins (HMPs) is described. The method allows one to predict the binding site and structure for strongly bound ligands. The method combines a protocol of computational methods enabling a complete ensemble of packings to be sampled and systematically reducing this ensemble to progressively more accurate structures until at the end there remain a few that might be functionally relevant and likely to play a role in all binding and activation processes. This method is well suited to automatic operation making it practical to obtain, for example, the ensemble of important structures for all human GPCRs. With this ensemble of all active GPCR structures in the human body, an infimum method is presented to maximize efficacy toward the selected target while minimizing binding to all other GPCRs to eliminate toxicity arising from cross-reacting with other GPCRs (a most common source of drug failure). This infimum method is broadly applicable to any set of proteins where a ligand is desired to be able to modulate the function of one protein while not affecting the function of other proteins. | 02-05-2009 |
20090048785 | Methods And Systems For Analyzing Biological Samples - Methods, computer readable storage media and systems which can be used for analyzing labeled biological samples, identifying chromosomal aberrations, identifying genetically abnormal cells and/or computationally scanning the samples using randomly or randomized scanning methods are provided. Specifically, the present invention can be used to analyze FISH-stained samples and automatically identify chromosomal aberrations associated with abnormal intensity ratio of stained occurrences in the sample. | 02-19-2009 |
20090076734 | Gene Signature for the Prediction of Radiation Therapy Response - Described are mathematical models and methods, e.g., computer-implemented methods, for predicting tumor sensitivity to radiation therapy, which can be used, e.g., for selecting a treatment for a subject who has a tumor. | 03-19-2009 |
20090076735 | Method, system and software arrangement for comparative analysis and phylogeny with whole-genome optical maps - The present invention provides a method for organizing genomic information from multiple organisms. In one embodiment of the invention, phylogenetic trees can be constructed for the organisms. The method of the present invention is termed CAPO, Comparative Analysis and Phylogeny with Optical-Maps. Optical maps of organisms are obtained and phylogeny between the organisms is determined by optical map comparison and bipartite graph matching between the organisms, as, for example, computed by a stable marriage algorithm. | 03-19-2009 |
20090082975 | METHOD OF SELECTING AN ACTIVE OLIGONUCLEOTIDE PREDICTIVE MODEL - The present invention provides a method of identifying a predictor of antisense oligonucleotide activity by identifying properties of oligonucleotides, evaluating oligonucleotide activity of the oligonucleotides, and correlating oligonucleotide activity with the properties. A high correlation between oligonucleotide activity and a property indicates that the property is a predictor of oligonucleotide activity. | 03-26-2009 |
20090099789 | Methods and Systems for Genomic Analysis Using Ancestral Data - The present disclosure provides methods and systems for assessing an individual's genotype correlations to a phenotype by analyzing the individual's genomic profile and using ancestral data to determine the correlations between genotypes and phenotypes. | 04-16-2009 |
20090105961 | METHODS OF NUCLEIC ACID IDENTIFICATION IN LARGE-SCALE SEQUENCING - The present invention provides methods for determining a base probability in a target nucleic acid within an experimental data set. The methods of the invention provide specific methods of improving accuracy of base calling for experimental sequencing data compared to conventional methods. The experimental base values used in the methods of the present invention provide relative base probabilities within an experimental data set that are robust and uniformly optimal regardless of the experimental conditions. | 04-23-2009 |
20090105962 | METHODS AND SYSTEMS FOR IDENTIFYING MOLECULAR PATHWAY ELEMENTS - The present invention provides methods and systems useful in determining molecular pathways and elements of molecular pathways. In particular, the present invention provides for the discovery of new molecular pathway elements using Bayesian networks and gene expression information. | 04-23-2009 |
20090119022 | Emericella Nidulans Genome Sequence On Computer Readable Medium and Uses Thereof - The present invention relates to nucleic acid sequences from the filamentous fungus, | 05-07-2009 |
20090125244 | BROAD-BASED NEUROTOXIN-RELATED GENE MUTATION ASSOCIATION FROM A GENE TRANSCRIPT TEST - Broad-based genetic mutation association gene transcript test and data structure. Genetic mutation considerations for this unique test include a custom set of genetic sequences associated in peer-reviewed literature with various known genetic mutation related to exposure to toxic substances. Such genetic mutations include specific gene sequence alterations based on exposure to diesel fuel, aviation fuel, jet fuel, and many other toxic substances often needed in the aviation and refining industries. The base dataset may be developed through clinical samples obtained by third-parties. Online access of real-time phenotype/genotype associative testing for physicians and patients may be promoted through an analysis of a customized microarray testing service. | 05-14-2009 |
20090125245 | Methods For Rapid Forensic Analysis Of Mitochondrial DNA - The present invention provides methods for rapid forensic analysis of mitochondrial DNA by amplification of a segment of mitochondrial DNA containing restriction sites, digesting the mitochondrial DNA segments with restriction enzymes, determining the molecular masses of the restriction fragments and comparing the molecular masses with the molecular masses of theoretical restriction digests of known mitochondrial DNA sequences stored in a database. | 05-14-2009 |
20090125246 | Method and Apparatus for the Determination of Genetic Associations - Procedure and tool to determine genetic associations. The method makes it possible to identify, without the need for a predictive hypothesis, genes that influence, either individually or preferably collectively, the appearance of any phenotypic trait shared by several groups of individuals, where the trait appears in a different context in each group: for example, different diseases, a different reaction to the same treatment, or different manifestations of the same disease. For each phenotypic context, a study of cases and controls is carried out, giving rise to associations of genes or combinations of genes with statistical significance. These associations are filtered, eliminating those that also appear when comparing controls versus controls. Of the remaining associations, those that have appeared in all the cases and controls are selected, preferably rationalized, and are validated by analysing their presence in larger groups. | 05-14-2009 |
20090125247 | GENE EXPRESSION MARKERS OF RECURRENCE RISK IN CANCER PATIENTS AFTER CHEMOTHERAPY - The present invention relates to genes, the expression levels of which are correlated with likelihood of breast cancer recurrence in patients after tumor resection and chemotherapy. | 05-14-2009 |
20090125248 | System, Method and computer program product for integrated analysis and visualization of genomic data - Described is a system for analysis and visualization of genomic data. The system allows a user to select at least one individual sample. The sample has chromosomal data representing a genome with a chromosome and also includes chromosomal measurements of at least one event at a particular location on the chromosome. A frequency of event is generated based on the selected sample. The frequency of event is a frequency of occurrence of the event in the selected sample. At least one annotation can be selected that includes chromosomal region specific information as related to the chromosome. Finally, the chromosomal data, the annotation, and the frequency of event on a display can all be simultaneously displayed, thereby allowing a user to view chromosomal region specific information with respect to a particular chromosomal event. | 05-14-2009 |
20090138209 | PROGNOSTIC APPARATUS, AND PROGNOSTIC METHOD - A computer-readable storage medium storing a program causing a computer to execute, (a) extracting prediction factors from gene expression data, (b) predicting based on gene expression data of a patient to be prognosticated, whether expression levels of the prediction factors of the patient are similar to the expression levels of a good prognosis group or the expression levels of a poor prognosis group, and (c) extracting prediction factors indicating a poor prognosis from the prediction factors of the patient as poor prognosis determining factors. Poor prognosis determining factors are extracted in which increase and decrease trends of the expression levels coincide with increase and decrease trends of expression levels supposed when abnormal phenomena related to predetermined diseases occur, and the poor prognosis determining factors extracted for the respective abnormal phenomena are outputted. | 05-28-2009 |
20090150084 | GENOME IDENTIFICATION SYSTEM - The present invention belongs to the field of genomics and nucleic acid sequencing. It involves a novel method of sequencing biological material and real-time probabilistic matching of short strings of sequencing information to identify all species present in said biological material. It is related to real-time probabilistic matching of sequence information, and more particularly to comparing short strings of a plurality of sequences of single molecule nucleic acids, whether amplified or unamplified, whether chemically synthesized or physically interrogated, as fast as the sequence information is generated and in parallel with continuous sequence information generation or collection. | 06-11-2009 |
20090222216 | System and Method to Improve Accuracy of a Polymer - The sequencing of individual monomers (e.g., a single nucleotide) of a polymer (e.g., DNA, RNA) is improved by reducing the motion of the polymer due to thermally-driven diffusion to reduce the spatial error in the position of the polymer within a measurement device. A major system parameter, such as average translocation velocity or measurement time, is selected based on the characteristics of the sensing system utilized, and an algorithm jointly optimizes the sequencing order error rate and the monomer identification error rate of the system. | 09-03-2009 |
20090240441 | SYSTEM AND METHOD FOR ANALYSIS AND PRESENTATION OF GENOMIC DATA - A method for analyzing genomic data that includes obtaining genomic sequence information from an anonymous individual, processing the information via a secure computerized algorithm, and presenting phenotypic information to the individual based upon the genomic sequence information. | 09-24-2009 |
20090287420 | Method and system to characterize transcriptionally active regions and quantify sequence abundance for large scale sequencing data - This invention provides a quantitative method to determine transcriptionally active regions and quantify sequence abundance from large scale sequencing data. The invention also provides a system based on reference sequences to design and implement the method. The system processes large scale sequence data from high throughput sequencing, generates transcriptionally active region sequences as necessary, and quantifies the sequence abundance of the gene or transcriptionally active region. The method and system are useful for many analyses based on RNA expression profiling. | 11-19-2009 |
20090292482 | Methods and Systems for Generating Cell Lineage Tree of Multiple Cell Samples - A method of generating a cell lineage tree of a plurality of cells of an individual is provided. The method comprising: (a) determining at least one genotypic marker for each cell of the plurality of cells; and (b) computationally clustering data representing the at least one genotypic marker to thereby generate the cell lineage tree of the plurality of cells of the individual. | 11-26-2009 |
20090299650 | SYSTEMS AND METHODS FOR FILTERING TARGET PROBE SETS - Systems and methods for using the same to filter target probes sets are provided. In certain embodiments, the system and methods are implemented on a web-based platform. Also provided are computer program products for executing the subject methods. | 12-03-2009 |
20090326832 | GRAPHICAL MODELS FOR THE ANALYSIS OF GENOME-WIDE ASSOCIATIONS - Systems and methods are provided for the identification of genotype-phenotype associations in genome-wide association (GWA) studies. In an illustrative implementation, a data correlation environment comprises a population structure engine and at least one instruction set to instruct the population structure engine to process pedigree or population genetic data to generate a population structure sub-model according to a selected graphical model-based data correlation paradigm. Illustratively, the parameter of the resulting generalized linear mixed model can be learned using a variational approximation. | 12-31-2009 |
20100010749 | SEQUENCING - The invention relates to improvements in sequencing of polymers. In particular, the invention relates to a method of sequencing a polymer, the method comprising providing a plurality of data sets, each set comprising data representing the concentration of synthesised polymers from a plurality of chain termination reactions, wherein the data sets include termination artefacts; aligning two or more of the data sets based on at least one termination artefact present in said two or more data sets; and determining the polymer sequence based on the aligned data. | 01-14-2010 |
20100049449 | Pattern Discovery Techniques for Determining Maximal Irredundant and Redundant Motifs - Basis motifs are determined from an input sequence through an iterative technique that begins by creating small solid motifs and continues to create larger motifs that include “don't care” characters and that can include flexible portions. The small solid motifs, including don't care characters and flexible portions, are concatenated to create larger motifs. During each iteration, motifs are trimmed to remove redundant motifs and other motifs that do not meet certain criteria. The process is continued until no new motifs are determined. At this point, the basis set of motifs has been determined. The basis motifs are used to construct redundant motifs. The redundant motifs are formed by determining a number of sets for selected basis motifs. From these sets, unique intersection sets are determined. The redundant motifs are determined from the unique intersection sets and the basis motifs. This process continues, by selecting additional basis motifs, until all basis motifs have been selected. | 02-25-2010 |
20100057372 | RAPID IDENTIFICATION OF PROTEINS AND THEIR CORRESPONDING SOURCE ORGANISMS BY GAS PHASE FRAGMENTATION AND IDENTIFICATION OF PROTEIN BIOMARKERS - Embodiments of the present invention relate to the identification of proteins using laser desorption ionization mass spectrometry, the identification of source organisms comprising the identified proteins and a computer readable storage medium storing instructions that, when executed by a computer cause the computer to perform a method for the identification of proteins using mass spectra generated through the application of laser desorption ionization mass spectrometry of the proteins. | 03-04-2010 |
20100057373 | Pattern Discovery Techniques for Determining Maximal Irredundant and Redundant Motifs - Basis motifs are determined from an input sequence through an iterative technique that begins by creating small solid motifs and continues to create larger motifs that include “don't care” characters and that can include flexible portions. The small solid motifs, including don't care characters and flexible portions, are concatenated to create larger motifs. During each iteration, motifs are trimmed to remove redundant motifs and other motifs that do not meet certain criteria. The process is continued until no new motifs are determined. At this point, the basis set of motifs has been determined. The basis motifs are used to construct redundant motifs. The redundant motifs are formed by determining a number of sets for selected basis motifs. From these sets, unique intersection sets are determined. The redundant motifs are determined from the unique intersection sets and the basis motifs. This process continues, by selecting additional basis motifs, until all basis motifs have been selected. | 03-04-2010 |
20100057374 | Genotype calling - Determining a genetic sequence for a particular site on an individual's genome is disclosed, including: receiving a measurement associated with a particular sequence for the particular site on the individual's genome, receiving contextual information associated with a context of the individual within a larger collection of genetic information, and using the measurement associated with the particular sequence and the contextual information to compute an improved determination of the genetic sequence at the particular site on the individual's genome. | 03-04-2010 |
20100094563 | System and Method for Consensus-Calling with Per-Base Quality Values for Sample Assemblies - The present teachings disclose a method for evaluation of a polynucleotide sequence using a consensus-based analysis approach. The sequence analysis method utilizes quality values for a plurality of aligned sequence fragments to identify consensus basecalls and calculate associated consensus quality values. The disclosed method is applicable to resolution of single nucleotide polymorphisms, mixed-based sequences, heterozygous allelic variants, and heterogeneous polynucleotide samples. | 04-15-2010 |
20100121582 | METHODS FOR ACCURATE SEQUENCE DATA AND MODIFIED BASE POSITION DETERMINATION - Disclosed herein are methods of determining the sequence and/or positions of modified bases in a nucleic acid sample present in a circular molecule with a nucleic acid insert of known sequence comprising obtaining sequence data of at least two insert-sample units. In some embodiments, the methods comprise obtaining sequence data using circular pair-locked molecules. In some embodiments, the methods comprise calculating scores of sequences of the nucleic acid inserts by comparing the sequences to the known sequence of the nucleic acid insert, and accepting or rejecting repeats of the sequence of the nucleic acid sample according to the scores of one or both of the sequences of the inserts immediately upstream or downstream of the repeats of the sequence of the nucleic acid sample. | 05-13-2010 |
20100125421 | System and method for determining a dosage for a treatment - A system and method for allowing the real-time diagnostics of various genotype-related treatments while allowing for the changing of demographic data such as a person's age, weight, etc. Various embodiments and methods of new processes include the assembly and association of genetic material samples, the preparation of microarrays with representative genetic material samples in a pattern best suited for analysis as well as manipulation, and delivery of assimilated and compiled data in the form of an electronic document for determining a dosage for a treatment. | 05-20-2010 |
20100138165 | Noninvasive Diagnosis of Fetal Aneuploidy by Sequencing - Disclosed is a method to achieve digital quantification of DNA (i.e., counting differences between identical sequences) using direct shotgun sequencing followed by mapping to the chromosome of origin and enumeration of fragments per chromosome. The preferred method uses massively parallel sequencing, which can produce tens of millions of short sequence tags in a single run, enabling a sampling that can be statistically evaluated. By counting the number of sequence tags mapped to a predefined window in each chromosome, the over- or under-representation of any chromosome in maternal plasma DNA contributed by an aneuploid fetus can be detected. This method does not require the differentiation of fetal versus maternal DNA. The median count of autosomal values is used as a normalization constant to account for differences in the total number of sequence tags and is used for comparison between samples and between chromosomes. | 06-03-2010 |
20100169026 | Algorithms for sequence determination - The present invention is generally directed to powerful and flexible methods and systems for consensus sequence determination from replicate biomolecule sequence data. It is an object of the present invention to improve the accuracy of consensus biomolecule sequence determination from replicate sequence data by providing methods for assimilating replicate sequence into a final consensus sequence more accurately than any one-pass sequence analysis system. | 07-01-2010 |
20100262379 | Sequencing System With Memory - The present teachings provide a device including a memory. According to various embodiments, the memory is readable, writable, and rewritable. The present teachings further provide processing stations, e.g., for carrying out electrophoresis, PCR, genetic analysis, sample preparation, and/or sample cleanup, etc., that are capable of reading from and/or writing/rewriting to such memory. | 10-14-2010 |
20100268478 | Sequencing Nucleic Acid Polymers with Electron Microscopy - This invention relates to using an electron microscope to sequence by direct inspection of labeled, stretched DNA. This method will have higher accuracy, lower cost, and longer read length than current DNA sequencing methods. | 10-21-2010 |
20100292933 | Methods, systems, and compositions for classification, prognosis, and diagnosis of cancers - The present invention provides methods, systems and compositions for predicting disease susceptibility in a patient. In some embodiments, methods for the classification, prognosis, and diagnosis of cancers are provided. In other embodiments, the present invention provides statistical methods for building a gene-expression-based classifier that may be employed for predicting disease susceptibility in a patient, for classifying carcinomas, and for the prognosis of clinical outcomes. | 11-18-2010 |
20110015870 | Information Processing System Using Nucleotide Sequence-Related Information - This invention constructs a highly safe system for processing information for providing semantic information and/or information associated with the semantic information useful for each individual organism through effective utilization of differences in nucleotide sequence-related information among individual organisms. This system comprises steps of: (a) obtaining positional information representing a position in a nucleotide sequence in accordance with a request for an object and/or service; and (b) evaluating adequacy of transmission of nucleotide sequence-related information corresponding to the positional information obtained in step (a), based on the flag information associated with the positional information for evaluating adequacy of transmission of nucleotide sequence-related information associated with the positional information representing a position in a nucleotide sequence. | 01-20-2011 |
20110071767 | Hepatotoxicity Molecular Models - The present invention includes methods of predicting hepatotoxicity of test agents and methods of generating hepatotoxicity prediction models using algorithms for analyzing quantitative gene expression information. The invention also includes microarrays, computer systems comprising the toxicity prediction models, as well as methods of using the computer systems by remote users for determining the toxicity of test agents. | 03-24-2011 |
20110106454 | METHOD AND SYSTEM FOR COMPARATIVE GENOMICS - A method and system for representing a similarity between at least two genomes that includes detecting gene clusters which are common to the at least two genomes and representing the common gene clusters in a PQ tree. The PQ tree includes a first internal node (P node), that allows permutation of the children thereof, and a second internal node (Q node), that maintains unidirectional order of the children thereof. | 05-05-2011 |
20110125411 | Uniquemer Algorithm for Identification of Conserved and Unique Subsequences - A first protein sequence associated with the organism is identified, wherein the first protein sequence comprises a plurality of ordered residues. A plurality of sub-sequences is generated based on the first protein sequence, wherein each sub-sequence comprises a plurality of contiguous residues and a starting residue number of each sub-sequence differs from a starting residue number of another sub-sequence by one position in the first protein sequence. A first unique sub-sequence comprising a first set of contiguous residues based on the plurality of sub-sequences is identified, wherein the first unique sub-sequence is specific to the organism and is identified based on a dataset of protein sequences and stored. | 05-26-2011 |
20110172929 | SYSTEM AND METHOD FOR PREDICTION OF PHENOTYPICALLY RELEVANT GENES AND PERTURBATION TARGETS - Disclosed herein is a systems biology approach to prediction of phenotypically relevant genes such as oncogenes and perturbation targets. Interactions from a comprehensive cellular network such as the B Cell Interactome (BCI) can be used to identify those that become affected, or dysregulated, by a phenotype (e.g., disease, tumor and cancer) or perturbation (e.g., drug treatment) based on correlation changes between expression profiles of gene pairs in the interactions upon removal or addition of samples showing the phenotype or perturbation. Genes can be ranked based on the affected interactions involving the genes to predict phenotypically relevant genes and/or perturbation targets. | 07-14-2011 |
20110172930 | DISCOVERY OF t-HOMOLOGY IN A SET OF SEQUENCES AND PRODUCTION OF LISTS OF t-HOMOLOGOUS SEQUENCES WITH PREDEFINED PROPERTIES - System(s) and method(s) for analysis and design of genome sequences are provided. A graph representation of a genome sequence facilitates generation of a thermodynamic based quantity, e.g., an entropy-based and enthalpy-based thermodynamic tolerance [τ], which in turn affords estimation of a gene sequence potential function that depends at least upon structural and functional properties of the gene sequence. The gene sequence potential (Φ) is determined, at least in part, via a generalized Schrödinger equation for the thermodynamic tolerance. Gene sequence potential and thermodynamic tolerance [τ], and derived quantities, like thermodynamic tolerance profile and generalized homology, provide an analytic instrument for characterization of natural and synthetic gene sequences, and in conjunction with graph-based algorithms embodies a tool for design of genome sequences with predetermined properties. | 07-14-2011 |
20110213563 | SYSTEM AND METHOD TO CORRECT OUT OF PHASE ERRORS IN DNA SEQUENCING DATA BY USE OF A RECURSIVE ALGORITHM - An embodiment of a method for correcting an error associated with phasic synchrony of sequence data generated from a population of template molecules is described that comprises the steps of detecting signals generated in response to nucleotide species introduced during a sequencing reaction; generating an observed value for the signal detected from each of the nucleotide species; defining positive incorporation values and negative incorporation values from the observed values using a carry forward value and an incomplete extension value; revising the carry forward value and the incomplete extension value using a noise value that is derived from observed values associated with the negative incorporation values; re-defining the positive incorporation values and the negative incorporation values using the revised carry forward value and the revised incomplete extension value; and repeating the steps of revising and re-defining until convergence of the positive incorporation values and the negative incorporation values. | 09-01-2011 |
20110224916 | Fitness determination for DNA codeword searching - An apparatus for a hybrid architecture that consists of a general purpose microprocessor and a hardware accelerator for accelerating the discovery of DNA reverse complement, edit distance codes. Two embodiments are implemented and evaluated, including a code generator that uses a genetic algorithm (GA) to produce nearly locally optimal codes in a few minutes, and a code extender that uses exhaustive search to produce locally optimum codes in about 1.5 hours for the case of length 16 codes. Experimental results demonstrate that the GA embodiment can find ~99% of the words in locally optimum libraries, and that the hybrid architecture embodiment provides more than 1000 times speed-up compared to a software only implementation. | 09-15-2011 |
20110246081 | Metabolomics-Based Identification of Disease-Causing Agents - A method, computer-readable medium, and system for identifying one or more metabolites associated with a disease, comprising: comparing gene expression data from diseased cells to gene expression data from control cells in order to deduce genes that are differentially-regulated in the diseased cells relative to the control cells; based on enzyme function and pathway data for all human metabolites that utilize the genes that are differentially-regulated in the diseased cells, identifying one or more metabolites whose intracellular levels are higher or lower in diseased cells than in control cells, and thereby associating the one or more metabolites with the disease. | 10-06-2011 |
20110246082 | METHOD FOR SPECTRAL DNA ANALYSIS - The present invention relates to a method for analyzing a DNA sequence by converting the DNA sequence into a plurality of binary indicator sequences (BIS) and applying a short-term Fourier transform (STFT) to the binary indicator sequences. A binning function (BF) is applied to the Fourier coefficients (Usk_X(k)), thereby modifying the corresponding Fourier coefficients (Usk_X(k)). Finally, substantially equal modified Fourier coefficients (Usk_X(k)) are found. The invention provides the user with a much improved ability to see unique strong patterns in vast amounts of DNA sequence data. | 10-06-2011 |
20110246083 | Noninvasive Diagnosis of Fetal Aneuploidy by Sequencing - Disclosed is a method to achieve digital quantification of DNA (i.e., counting differences between identical sequences) using direct shotgun sequencing followed by mapping to the chromosome of origin and enumeration of fragments per chromosome. The preferred method uses massively parallel sequencing, which can produce tens of millions of short sequence tags in a single run, enabling a sampling that can be statistically evaluated. By counting the number of sequence tags mapped to a predefined window in each chromosome, the over- or under-representation of any chromosome in maternal plasma DNA contributed by an aneuploid fetus can be detected. This method does not require the differentiation of fetal versus maternal DNA. The median count of autosomal values is used as a normalization constant to account for differences in the total number of sequence tags and is used for comparison between samples and between chromosomes. | 10-06-2011 |
20110246084 | METHODS AND SYSTEMS FOR ANALYSIS OF SEQUENCING DATA - The present technology relates to the methods and systems for analysis of sequencing data. In particular, methods and systems for characterizing a target nucleic acid while determining the nucleotide sequence of the target nucleic acid are described. Certain embodiments include methods and systems for identifying the source of a target nucleic acid by comparing the accumulating nucleotide sequence of a target nucleic acid to a population of reference nucleotide sequences. | 10-06-2011 |
20110246085 | System, method, and computer software for the presentation and storage of analysis results - A computer program product, and related systems and methods, are described that processes emission intensity data corresponding to probes of a biological probe array. The computer program includes a genotype and statistical analysis manager that determines absolute or relative expression values based, at least in part, on a statistical measure of the emission intensity data and at least one user-selectable statistical parameter. The analysis manager may also determine genotype calls for one or more probes based, at least in part, on the emission intensity data. The analysis manager may further display the absolute or relative expression values based, at least in part, on at least one user-selectable display parameter and/or a measure of normalized change between genotype calls. The measure of normalized change may be based, at least in part, on a comparison of genotype calls and a reference value. | 10-06-2011 |
20110251798 | Methods for high throughput genotyping - Methods for genotyping polymorphisms using allele specific probes are disclosed. A training set is used to generate a model for each polymorphism to be interrogated. The training set is used to obtain an estimate of the asymmetry between an intensity measurement for a first allele and an intensity measurement for a second allele of the same polymorphism. The intensity measurement obtained for a test sample is adjusted using the estimate of asymmetry prior to using the intensity measurements to make a genotyping call. In preferred embodiments the adjustment is applied to polymorphisms that have a likelihood of being heterozygous that is above a specified threshold. | 10-13-2011 |
20110257896 | Differential Filtering of Genetic Data - Computer software products, methods, and systems are described which provide functionality to a user conducting experiments designed to detect and/or identify genetic sequences and other characteristics of a genetic sample, such as, for instance, gene copy number and aberrations thereof. The presently described software allows the user to interact with a graphical user interface which depicts the genetic information obtained from the experiment. The presently disclosed methods and software are related to bioinformatics and biological data analysis. Specifically, provided are methods, computer software products and systems for analyzing and visually depicting genotyping data on a screen or other visual projection. The presently disclosed methods and software allow the user conducting the experiment to differentially filter complex genetic data and information by varying genetic parameters and removing or highlighting visually various regions of genetic data of interest (CytoRegions). These differential filters may be applied by the user to the entire set of genetic data and/or only to the specific CytoRegions of interest. | 10-20-2011 |
20110264379 | INVESTIGATIONS - A method of investigating a sample is provided, the sample being a mixture of DNA arising from more than one source. The method includes analysing the sample to obtain a genotype for the DNA present in the sample and assigning a prior probability distribution to the genotype. The likelihood function is considered and a posterior probability distribution for the genotype is established. In this way a probabilistic assessment of the genotype of the major or minor contributor to the sample can be obtained. This is beneficial over prior methods which use a deterministic method, and so involve the use of rule-based methods. | 10-27-2011 |
20110270532 | Systems And Methods For Identifying Exon Junctions From Single Reads - Systems and methods are used to identify an exon junction from a single read of a transcript. A transcript sample is interrogated and a read sequence is produced using a nucleic acid sequencer. A first exon sequence and a second exon sequence are obtained using the processor. The first exon sequence is mapped to a prefix of the read sequence using the processor. The second exon sequence is mapped to a suffix of the read sequence using the processor. A sum of a number of sequence elements of the first exon sequence that overlap the prefix of the read sequence, of a number of sequence elements of the second exon sequence that overlap the suffix of the read sequence, and of a constant is calculated using the processor. If the sum equals a length of the read sequence, a junction is identified in the read using the processor. | 11-03-2011 |
20110270533 | SYSTEMS AND METHODS FOR ANALYZING NUCLEIC ACID SEQUENCES - Nucleic acid sequence mapping/assembly methods are disclosed. The methods initially map only a contiguous portion of each read to a reference sequence and then extend the mapping of the read at both ends of the mapped contiguous portion until the entire read is mapped (aligned). In various embodiments, a mapping score can be calculated for the read alignment using a scoring function, score(i, j) = M + mx, where M can be the number of matches in the extended alignment, x can be the number of mismatches in the alignment, and m can be a negative penalty for each mismatch. The mapping score can be utilized to rank or choose the best alignment for each read. | 11-03-2011 |
20110276277 | SIZE-BASED GENOMIC ANALYSIS - Systems, methods, and apparatuses for performing a prenatal diagnosis of a sequence imbalance are provided. A shift (e.g. to a smaller size distribution) can signify an imbalance in certain circumstances. For example, a size distribution of fragments of nucleic acids from an at-risk chromosome can be used to determine a fetal chromosomal aneuploidy. A size ranking of different chromosomes can be used to determine changes of a rank of an at-risk chromosome from an expected ranking. Also, a difference between a statistical size value for one chromosome can be compared to a statistical size value of another chromosome to identify a significant shift in size. A genotype and haplotype of the fetus may also be determined using a size distribution to determine whether a sequence imbalance occurs in a maternal sample relative to a genotype or haplotype of the mother, thereby providing a genotype or haplotype of the fetus. | 11-10-2011 |
20110288785 | COMPRESSION OF GENOMIC BASE AND ANNOTATION DATA - A genomic data computer system receives a data set comprising sequenced genomic bases and associated annotations that form sequenced base-annotation pairs. The computer system determines a frequency distribution for the base-annotation pairs in the data set. The computer system determines variable-length identification codes for the base-annotation pairs based on the frequency distribution. The computer system converts the sequenced base-annotation pairs into a corresponding series of the variable-length identification codes that require a smaller amount of storage than the original data. | 11-24-2011 |
20110295519 | IDENTIFICATION OF RIBOSOMAL DNA SEQUENCES - Method(s) for identifying rDNA sequences from a sample containing a plurality of unknown DNA sequences are described herein. The method includes selecting one or more target clusters, from a plurality of reference clusters, corresponding to the query sequence. The target clusters are selected based on a composition based analysis. A proportion of probable rDNA clusters from the target clusters is identified. Based on the proportion of the probable rDNA clusters, the query sequence is identified as an rDNA. | 12-01-2011 |
20110301862 | System for array-based DNA copy number and loss of heterozygosity analyses and reporting - The problem of analyzing, visualizing and interpreting data of DNA arrays (array CGH and SNP arrays) in a clinical setting becomes very important as DNA arrays take over clinical diagnostics. Reporting of detected chromosomal aberrations is complicated by data noise and the presence of “normal” chromosomal variants that may occur even in healthy patients. Clinicians are facing interpretation of array data and detected chromosomal anomalies in patient samples every day. The disclosed system provides means for automated detection of chromosomal anomalies in individual samples. It also enables its users to interpret detected aberrations in an efficient manner so that clinically relevant anomalies get reported and aberrations that can occur in healthy patients get ignored. It also allows its users to accumulate and mine data from multiple human samples and re-use it in daily diagnostic operations to improve clinical interpretation of newly acquired samples. | 12-08-2011 |
20110301863 | PREDICTION METHOD FOR THE SCREENING, PROGNOSIS, DIAGNOSIS OR THERAPEUTIC RESPONSE OF PROSTATE CANCER, AND DEVICE FOR IMPLEMENTING SAID METHOD - The invention includes a prediction method for the screening or diagnosis or therapeutic management or prognosis of prostate cancer, including collecting individual input data and providing predictive information on the risk linked to a type of disease. The input data includes at least one variable or a combination of variables of the genetic type such as the identification of markers of genetic polymorphisms considered as being linked to the development of the disease. The invention also provides an individual prediction device for the screening or diagnosis or therapeutic management or prognosis of prostate cancer including first means for acquiring individual information data by a user, and at least a first software interface on which the said first means operate. The invention additionally includes a computer program product having the method and providing predictive information on risk linked to a disease. | 12-08-2011 |
20120010823 | System for the quantification of system-wide dynamics in complex networks - A device, method and system are provided for diagnosing a disease using a gene expression reader to analyze biological samples and output gene expression values to calculate a scaling factor using a computer by counting a number of link counts C | 01-12-2012 |
20120035860 | GC Wave Correction for Array-Based Comparative Genomic Hybridization - The present invention provides, among other things, new methods for optimizing comparative genomic hybridization (CGH) data analysis. In particular, the methods of the invention provide increased sensitivity and specificity due to the implemented individual chromosome-based GC-wave correction. In certain embodiments, the log ratios of probes derived from each chromosome are corrected based on the chromosome's GC content slope, and certain selected chromosomes undergo chromosomal median adjustment. As a result, the log ratios of the probes on the array are normalized to be closer to zero (0) for diploid regions and thus, the GC waves are substantially reduced, resulting in a reduced false positive rate. Systems, computer readable media, and kits for use in the optimized CGH methods also are provided. | 02-09-2012 |
20120041686 | DIAGNOSTIC FOR LUNG DISORDERS USING CLASS PREDICTION - The present invention provides methods for diagnosis and prognosis of lung cancer using expression analysis of one or more groups of genes, and a combination of expression analysis with bronchoscopy. The methods of the invention provide far superior detection accuracy for lung cancer when compared to any other currently available method for lung cancer diagnosis or prognosis. The invention also provides methods of diagnosis and prognosis of other lung diseases, particularly in individuals who are exposed to air pollutants, such as cigarette or cigar smoke, smog, asbestos and the like air contaminants or pollutants. | 02-16-2012 |
20120046877 | SYSTEMS AND METHODS TO DETECT COPY NUMBER VARIATION - In one aspect, a system for implementing a copy number variation analysis method is disclosed. The system can include a nucleic acid sequencer and a computing device in communication with the nucleic acid sequencer. The nucleic acid sequencer can be configured to interrogate a sample to produce a nucleic acid sequence data file containing a plurality of nucleic acid sequence reads. In various embodiments, the computing device can be a workstation, mainframe computer, personal computer, mobile device, etc. | 02-23-2012 |
20120046878 | METHODS FOR ANALYZING HIGH DIMENSIONAL DATA FOR CLASSIFYING, DIAGNOSING, PROGNOSTICATING, AND/OR PREDICTING DISEASES AND OTHER BIOLOGICAL STATES - A method of diagnosing, predicting, or prognosticating about a disease that includes obtaining experimental data, wherein the experimental data is high dimensional data, filtering the data, reducing the dimensionality of the data through use of one or more methods, training a supervised pattern recognition method, ranking individual data points from the data, wherein the ranking is dependent on the outcome of the supervised pattern recognition method, choosing multiple data points from the data, wherein the choice is based on the relative ranking of the individual data points, and using the multiple data points to determine if an unknown set of experimental data indicates a diseased condition, a predilection for a diseased condition, or a prognosis about a diseased condition. | 02-23-2012 |
20120072124 | GENES ASSOCIATED WITH PROGRESSION AND RESPONSE IN CHRONIC MYELOID LEUKEMIA AND USES THEREOF - The invention provides molecular markers that are associated with the progression of chronic myeloid leukemia (CML), and methods and computer systems for monitoring the progression of CML in a patient based on measurements of these molecular markers. The present invention also provides CML target genes, and methods and compositions for treating CML patients by modulating the expression or activity of these CML target genes and/or their encoded proteins. The invention also provides genes that are associated with resistance to imatinib mesylate (Gleevec™) treatment in CML patients, and methods and compositions for determining the responsiveness of a CML patient to imatinib mesylate treatment based on measurements of these genes and/or their encoded proteins. The invention also provides methods and compositions for enhancing the effect of Gleevec™ by modulating the expression or activity of these genes and/or their encoded proteins. | 03-22-2012 |
20120078530 | METHOD FOR DETERMINING RECEPTOR-LIGAND PAIRS - The present invention provides a method of determining related proteins, the method comprising obtaining sequences of interest, wherein the sequences are amino acid sequences for proteins or nucleotide sequences encoding proteins; comparing segments of each sequence of interest with a database of amino acid or nucleotide sequences; generating a profile for each sequence of interest comprising a list of all sequences from the database of sequences that have segments corresponding to the segments of each sequence of interest; and comparing the database sequences appearing in the profile of each sequence of interest to the database sequences appearing in the profile of every other sequence of interest, wherein similar profiles indicate that the sequences of interest correspond to related proteins while dissimilar profiles indicate that the sequences of interest do not correspond to related proteins, wherein profiles are similar if there is at least a 30% overlap between the database sequences appearing in the profiles of the sequences of interest. | 03-29-2012 |
20120095696 | METHODS AND APPARATUS FOR GENETIC EVALUATION - Rapid and definitive bioagent detection and identification can be carried out without nucleic acid sequencing. Analysis of a variety of bioagents and samples, such as air, fluid, and body samples, can be carried out to provide information useful for industrial, medical, and environmental purposes. Nucleic acid samples of unknown or suspected bioagents may be collected, optimal primer pairs may be selected, and the nucleic acid may be amplified. Expected mass spectra signal models may be generated and selected, and the actual mass spectra of the amplicons may be obtained. The expected mass spectra most closely correlating with the actual mass spectra may be determined using a joint maximum likelihood analysis, and base counts for the actual mass spectra and the expected mass spectra may be obtained. The most likely candidate bioagents may then be determined. | 04-19-2012 |
20120095697 | METHODS FOR ESTIMATING GENOME-WIDE COPY NUMBER VARIATIONS - Methods for determining the copy number of a genomic region at a detection position of a target sequence in a sample are disclosed. Genomic regions of a target sequence in a sample are sequenced and measurement data for sequence coverage is obtained. Sequence coverage bias is corrected and may be normalized against a baseline sample. Hidden Markov Model (HMM) segmentation, scoring, and output are performed, and in some embodiments population-based no-calling and identification of low-confidence regions may also be performed. A total copy number value and region-specific copy number value for a plurality of regions are then estimated. | 04-19-2012 |
20120101738 | Compositions and methods for biological remodeling with frozen particle compositions - Certain embodiments disclosed herein relate to compositions, methods, devices, systems, and products regarding frozen particles. In certain embodiments, the frozen particles include materials at low temperatures. In certain embodiments, the frozen particles provide vehicles for delivery of particular agents. In certain embodiments, the frozen particles are administered to at least one biological tissue. | 04-26-2012 |
20120116687 | SYSTEM AND METHOD FOR GENOTYPE ANALYSIS AND ENHANCED MONTE CARLO SIMULATION METHOD TO ESTIMATE MISCLASSIFICATION RATE IN AUTOMATED GENOTYPING - The present invention relates to methods and systems for the analysis of the dissociation behavior of nucleic acids. The present invention includes methods and systems for analyzing dynamic profiles of genotypes of nucleic acids, including the steps of using a computer, including a processor and a memory, to convert dynamic profiles of known genotypes of a nucleic acid to multi-dimensional data points, wherein the dynamic profiles each comprise measurements of a signal representing a physical change of a nucleic acid containing the known genotype relative to an independent variable; using the computer to reduce the multi-dimensional data points into reduced-dimensional data points; and generating a plot of the reduced-dimensional data points for each genotype. The present invention also relates to methods and systems for calculating error statistics for an assay for identifying a genotype in a biological sample using an enhanced Monte Carlo simulation method to generate a set of N random data points for each known genotype within a class of known genotypes, where each set of N random data points has the same mean data point and covariance matrix as a data set for each of the known genotypes. | 05-10-2012 |
20120116688 | METHOD, COMPUTER-ACCESSIBLE MEDIUM AND SYSTEM FOR BASE-CALLING AND ALIGNMENT - Exemplary methods, procedures, computer-accessible medium, and systems for base-calling, aligning and polymorphism detection and analysis using raw output from a sequencing platform can be provided. A set of raw outputs can be used to detect polymorphisms in an individual by obtaining a plurality of sequence read data from one or more technologies (e.g., using sequencing-by-synthesis, sequencing-by-ligation, sequencing-by-hybridization, Sanger sequencing, etc.). For example, provided herein are exemplary methods, procedures, computer-accessible medium and systems, which can include and/or be configured for obtaining raw output from a sequencing platform configured to be used for reading fragment(s) of genomes, obtaining reference sequences for the genomes obtained independently from the raw output, and generating a base-call interpretation and/or alignment using the raw output and the reference sequences. For example, a score function can be determined based on information associated with the sequencing platform that can be used to analyze polymorphisms based on the base-call interpretation and/or alignment. | 05-10-2012 |
20120123695 | METHODS FOR ASSESSING RESPONSIVENESS OF B-CELL LYMPHOMA TO TREATMENT WITH ANTI-CD40 ANTIBODIES - The invention provides methods and kits useful for predicting or assessing responsiveness of a patient having B-cell lymphoma to treatment with anti-CD40 antibodies. | 05-17-2012 |
20120173157 | METHOD AND APPARATUS FOR RECOVERING GENE SEQUENCE USING PROBE MAP - A method of recovering a nucleic acid sequence using a probe map includes: aligning a probe onto a target sequence based on a result in which the probe is hybridized to the target sequence; determining a representative value representing each aligned position of the probe; and recovering a base sequence of the target sequence by using a probe map to which the determined representative values and base sequence information of the probe are mapped. | 07-05-2012 |
20120173158 | TIME-WARPED BACKGROUND SIGNAL FOR SEQUENCING-BY-SYNTHESIS OPERATIONS - Methods for analyzing signal data generated by sequencing of a polynucleotide strand using a pH-based method of detecting nucleotide incorporation(s). In an embodiment, the method comprises formulating a function that models the output signal of a representative empty well of a reactor array. A time transformation is applied to the empty well function to obtain a time-warped empty well function. The time-warped empty well function is fitted to an output signal from the loaded well representative of a flow that results in a non-incorporation event in the loaded well. The fitted time-warped empty well function can then be used to analyze output signals from the loaded well for other flows. | 07-05-2012 |
20120173159 | METHODS, SYSTEMS, AND COMPUTER READABLE MEDIA FOR NUCLEIC ACID SEQUENCING - A method for nucleic acid sequencing includes receiving a plurality of signals indicative of a parameter measured for a plurality of defined spaces, at least some of the defined spaces including one or more sample nucleic acids, the signals being responsive to a plurality of nucleotide flows introducing nucleotides to the defined spaces; determining, for at least some of the defined spaces, whether the defined space includes a sample nucleic acid; processing, for at least some of the defined spaces determined to include a sample nucleic acid, the received signals to improve a quality of the received signals; and predicting a plurality of nucleotide sequences corresponding to respective sample nucleic acids for the defined spaces based on the processed signals and the nucleotide flows. | 07-05-2012 |
20120185177 | HARNESSING HIGH THROUGHPUT SEQUENCING FOR MULTIPLEXED SPECIMEN ANALYSIS - A method of associating a DNA sequence with a specimen pooled among a plurality of specimens, where the specimens may be pooled according to any number of pooling schemes, including the Chinese Remainder Theorem, random pool selection, shifted transversal design, and Chinese Remainder Sieve. A unique identifier is associated with each specimen according to the pooling scheme such that a decoder may associate a DNA sequence with each specimen after next-generation sequencing according to the unique identifier and the chosen pooling scheme. | 07-19-2012 |
20120185178 | Methods and Computer Software for Detecting Splice Variants - Methods and software products for analysis of alternative splicing are disclosed. In general the methods involve normalizing probe set or exon intensity to an expression level measurement of the gene. The methods may be used to identify tissue-specific alternative splicing events. | 07-19-2012 |
20120191366 | Methods and Apparatus for Assigning a Meaningful Numeric Value to Genomic Variants, and Searching and Assessing Same - The present invention relates to methods, apparatus and computer systems for assigning a numerical value to a genotype at a single- or multi-base segment in an individual's genome to denote the presence of a match or a mismatch of a nucleic acid base sequence of one or more chromosomal copies of the segment, as compared to the nucleic acid base sequence at a reference genome segment that corresponds to the segment of the individual's genome. The methods involve assigning a single digit numerical value to the match or the mismatch of each chromosomal copy of the segment in the genome, so that the numerical value assigned to a mismatch is greater than the numerical value of the match. A null symbol is assigned to a no call determination. The assigned numerical values are summed and a total numerical value which is a single digit or a fixed number of digits is obtained. The steps are repeated to create a vector of total numerical values for the segment among the set of genomes, to thereby obtain a segment-specific pattern of genotype match/mismatch between a set of genomes and the nucleic acid base sequence at the reference genome segment. The segment-specific pattern, also referred to as a “diff pattern” can be used to filter or uncover specific trends or sub-patterns across a set of genomes, and more quickly identify genotypic/phenotypic relationships by identifying sites where the distribution of genotypes in the set of genomes relates in a distinctive, causal way to the distribution of a given phenotype among the individuals whose genomes are under study. | 07-26-2012 |
20120197540 | MARKERS TO PREDICT SURVIVAL OF BREAST CANCER PATIENTS AND USES THEREOF - The present invention relates to a method to predict the mortality risk of a subject (p) affected by breast cancer comprising measuring the expression level of 105 specific genes in a biological sample, obtaining the prognostic score, S(p), that indicates the expression levels of said genes in said subject (p) affected by cancer, and predicting the mortality risk of said subject (p) affected by cancer. | 08-02-2012 |
20120215463 | Rapid Genomic Sequence Homology Assessment Scheme Based on Combinatorial-Analytic Concepts - The present invention relates to methods and apparatus for rapid assessment of genomic sequences using the difference set model. The invention provides methods to determine the presence and identity of similarities and differences in genomic sequences. In particular, the invention provides methods and apparatus to assess homology, the presence and identity of insertion and deletion segments and the presence and identity of single nucleotide polymorphisms in genomic sequences. | 08-23-2012 |
20120221255 | System, Method, and Computer Software Product for Genotype Determination Using Probe Array Data - An embodiment of a method of analyzing data from processed images of biological probe arrays is described that comprises receiving a plurality of files comprising a plurality of intensity values associated with a probe on a biological probe array; normalizing the intensity values in each of the data files; determining an initial assignment for a plurality of genotypes using one or more of the intensity values from each file for each assignment; estimating a distribution of cluster centers using the plurality of initial assignments; combining the normalized intensity values with the cluster centers to determine a posterior estimate for each cluster center; and assigning a plurality of genotype calls using a distance of the one or more intensity values from the posterior estimate. | 08-30-2012 |
20120232805 | Computerized Amino Acid Composition Enumeration - A computerized method and apparatus for enumerating one or more amino acid compositions is disclosed that provides one or more processors, a data storage communicably coupled to the one or more processors and a user interface communicably coupled to the one or more processors. The three or more user-specified characteristics are received from the data storage or the user interface. The one or more amino acid compositions are enumerated for all the peptides having a length less than or equal to the maximum length and a mass less than or equal to the mass limit using the one or more processors. The enumerated amino acid compositions are filtered based on the one or more other user-specified characteristics using the one or more processors. The filtered amino acid compositions and the mass of the filtered amino acid compositions are stored in the data storage. | 09-13-2012 |
20120253689 | AB INITIO GENERATION OF SINGLE COPY GENOMIC PROBES - Single copy sequences suitable for use as DNA probes can be defined by computational analysis of genomic sequences. The present invention provides an ab initio method for identification of single copy sequences for use as probes which obviates the need to compare genomic sequences with existing catalogs of repetitive sequences. By dividing a target reference sequence into a series of shorter contiguous sequence windows and comparing these sequences with the reference genome sequence, one can identify single copy sequences in a genome. Probes can then be designed and produced from these single copy intervals. | 10-04-2012 |
20120259556 | SYSTEM AND METHODS FOR INDEL IDENTIFICATION USING SHORT READ SEQUENCING - Systems, methods, and analytical approaches for short read sequence assembly and for the detection of insertions and deletions (indels) in a reference genome. A method suitable for software implementation is presented in which indels may be readily identified in a computationally efficient manner. | 10-11-2012 |
20120283958 | GENERALIZED NETWORK THREADING APPROACH FOR PREDICTING A SUBJECT'S RESPONSE TO HEPATITIS C VIRUS THERAPY - Methods for predicting a response of a virus to an antiviral therapy are provided. | 11-08-2012 |
20120303286 | SYSTEM FOR ANIMAL HEALTH DIAGNOSIS - A diagnosis of the health of an animal is obtained through a combination of computerized data and human interpretation. Data relates to the physical characteristics of the animal, and includes data obtained from a physical inspection of the animal. A blood or other fluid sample is used to obtain a computer generated laboratory analysis. This is reported through an internet network to the clinical pathologist. The clinical pathologist has the data relating to the physical characteristics, and thereby makes a diagnosis of the animal health. A drop-down menu on a computer screen provides supplemental reports to support the diagnosis. This can be enhanced by further input from the pathologist through keyboard entry into the computer to obtain an integrated computer report having the laboratory analysis, supplemental report, and selectively an enhanced report. The integrated report is electronically communicated to a client. | 11-29-2012 |
20120303287 | METHODS AND SYSTEMS FOR CONSERVATIVE EXTRACTION OF OVER-REPRESENTED EXTENSIBLE MOTIFS - Methods and systems of extracting extensible motifs from a sequence include assigning a significance to extensible motifs within the sequence based upon a syntactic and statistical analysis, and identifying extensible motifs having a significance that exceeds a predetermined threshold. | 11-29-2012 |
20120310543 | System and Method to Improve Sequencing Accuracy of a Polymer - The sequencing of individual monomers (e.g., a single nucleotide) of a polymer (e.g., DNA, RNA) is improved by reducing the motion of the polymer due to thermally-driven diffusion to reduce the spatial error in the position of the polymer within a measurement device. A major system parameter, such as average translocation velocity or measurement time, is selected based on the characteristics of the sensing system utilized, and an algorithm jointly optimizes the sequencing order error rate and the monomer identification error rate of the system. | 12-06-2012 |
20120310544 | SYSTEMS AND METHODS FOR IDENTIFYING STRUCTURALLY OR FUNCTIONALLY SIGNIFICANT AMINO ACID SEQUENCES - Methods and computer readable storage mediums for identifying structurally or functionally significant amino acid sequences encoded by a genome are disclosed. At least one structurally or functionally significant amino acid sequence encoded by a genome may be identified by compiling an observed frequency for each of a plurality of amino acid words encoded by the genome, calculating with a computer an expected frequency for each of the plurality of amino acid words encoded by the genome, and identifying at least one structurally or functionally significant amino acid sequence encoded by the genome based at least in part on the observed and expected frequencies for each of the plurality of amino acid words encoded by the genome. | 12-06-2012 |
20120323498 | MIRFILTER: EFFICIENT NOISE REDUCTION METHOD TO IDENTIFY MIRNA AND TARGET GENE NETWORKS FROM GENOME-WIDE EXPRESSION DATA - A computer implemented method of identifying potential microRNA targets and biomarkers comprises receiving data identifying a first set of mRNA sequences into computer accessible memory. Each mRNA sequence in the first set has a region that is upstream of a translation start site, a region that is downstream of a translation stop site, and an open reading frame. The method further comprises receiving data identifying a second set of microRNA (miRNA) sequences into the computer accessible memory. Each microRNA sequence has a 5′ miRNA section and a 3′ miRNA section. Each mRNA sequence is characterized by an expression pattern in the first set as being up-regulated, down-regulated, or unchanged as compared to a control sample and each miRNA sequence in the second set as being up-regulated, down-regulated, or unchanged as compared to the control sample. It is then determined which mRNA sequences from the first set are susceptible to being regulated by microRNA from the second set. A set of consistent relationships is identified between the miRNA and the mRNA determined from the mRNAs that have been characterized. | 12-20-2012 |
20120330566 | SEQUENCE ASSEMBLY AND CONSENSUS SEQUENCE DETERMINATION - Computer implemented methods, and systems performing such methods for processing signal data from analytical operations and systems, and particularly in processing signal data from sequence-by-incorporation processes to identify nucleotide sequences of template nucleic acids and larger nucleic acid molecules, e.g., genomes or fragments thereof. In particularly preferred embodiments, nucleic acid sequences generated by such methods are subjected to de novo assembly and/or consensus sequence determination. | 12-27-2012 |
20120330567 | METHODS AND SYSTEMS FOR DATA ANALYSIS - The present disclosure provides computer implemented methods and systems for analyzing datasets, such as large data sets output from nucleic acid sequencing technologies. In particular, the present disclosure provides for data analysis comprising computing the BWT of a collection of strings in an incremental, character by character, manner. The present disclosure also provides compression boosting strategies resulting in a BWT of a reordered collection of data that is more compressible by second stage compression methods compared to non-reordered computational analysis. | 12-27-2012 |
20130013220 | METHOD AND APPARATUS FOR ANALYZING DNA - The distribution and/or ratio of Thymine, Cytosine, Adenine and Guanine of a DNA sequence from a target organism are organized and analyzed. The result is then used to determine the possible impacts the target organism may have in a host such as a human body. The corresponding treatment and prevention strategies may also be determined. The goal is to provide an effective way to diagnose, treat and prevent diseases such as infectious diseases, to test the safety of food and drugs, and therefore to create natural and effective solutions for health care and food supply. For example, a DNA analysis method configured according to the invention receives a DNA sequence input and converts it into a reassembled sequence. A result based on the reassembled sequence may then be output. Determination of the analysis result, treatment and prevention strategies may also be output. | 01-10-2013 |
20130030714 | METHODS FOR THE SURVEY AND GENETIC ANALYSIS OF POPULATIONS - The present invention relates to methods for performing surveys of the genetic diversity of a population. The invention also relates to methods for performing genetic analyses of a population. The invention further relates to methods for the creation of databases comprising the survey information and the databases created by these methods. The invention also relates to methods for analyzing the information to correlate the presence of nucleic acid markers with desired parameters in a sample. These methods have application in the fields of geochemical exploration, agriculture, bioremediation, environmental analysis, clinical microbiology, forensic science and medicine. | 01-31-2013 |
20130041593 | Method for fast and accurate alignment of sequences - Genomic sequence matching and alignment techniques are disclosed. In one embodiment of the invention, computerized methods are provided for analyzing sequence similarity data obtained by means of a table of all local hits recorded between query sequence and reference index. The table of local hits represents all occurrences of query subsequences in reference index that stored all transitions between single l-mer prefix to multiple m-mer suffixes. The index data structure may take a variety of forms, including an array or a tree. The base position of each transition from l-prefix to m-suffix is recorded in k-bit masked form. The positions data structure may take a variety of forms as well, including an array or a tree. The table of local hits derived from l-prefix, m-suffix and k-position reference index is used by a series of low time and space complexity algorithms for optimizing alignment between query and reference. | 02-14-2013 |
20130041594 | AUTOMATED DECISION SUPPORT FOR ASSOCIATING AN UNKNOWN BIOLOGICAL SPECIMEN WITH A FAMILY - Three methods of predicting whether an unknown biological specimen of a missing person originates from a member of a particular family comprise an initial automated decision support (ADS) algorithm for determining a list of relatives of the missing person for DNA typing and which typing technologies of available technologies to use for a listed relative. The ADS algorithm may be implemented on computer apparatus including a processor and an associated memory. The ADS method comprises determining a set of relatives of available family member relatives for DNA typing via a processor from a stored list of family member relatives according to one of a rule base, a table of hierarchically stored relatives developed based on discriminatory power or by calculating the discriminatory power for available family relatives to type. The ADS method may further comprise comparing at least one set of DNA typing data for the unknown biological specimen to DNA typing data from biological specimens from the determined set of relatives; calculating by the processor a likelihood function that the person is related to the family; and outputting a decision whether or not the person is related to the family. | 02-14-2013 |
20130054151 | PHASING OF HETEROZYGOUS LOCI TO DETERMINE GENOMIC HAPLOTYPES - Haplotypes of one or more portions of a chromosome of an organism from sequencing information of DNA or RNA fragments can be determined. Heterozygous loci (hets) can be used to determine haplotypes. One allele on a first het can be connected (likely to be on the same haplotype) to an allele on a second het, thereby defining a particular orientation between the hets. Haplotypes can be assembled through these connections. Errors can be identified through redundant connection information, particularly using a confidence value (strength) for a particular connection. The connections among a set of hets can be analyzed to determine likely haplotypes for that set, e.g., an optimal tree of a graph containing the hets. Furthermore, haplotypes of different contiguous sections (contig) of the chromosome can be matched to a particular chromosome copy (e.g., to a particular parental copy). Thus, the phase of an entire chromosome can be determined. | 02-28-2013 |
20130060481 | Systems and Methods for Identifying Structurally or Functionally Significant Nucleotide Sequences - Provided are methods, systems, and computer readable media for comparing word statistics between a significant amino acid sequence and a significant nucleotide sequence wherein the comparison instructs further research. | 03-07-2013 |
20130060482 | METHODS, SYSTEMS, AND COMPUTER READABLE MEDIA FOR MAKING BASE CALLS IN NUCLEIC ACID SEQUENCING - A method for nucleic acid sequencing includes receiving a plurality of observed or measured signals indicative of a parameter observed or measured for a plurality of defined spaces; determining, for at least some of the defined spaces, whether the defined space comprises one or more sample nucleic acids; processing, for at least some of the defined spaces, the observed or measured signal to improve a quality of the observed or measured signal; generating, for at least some of the defined spaces, a set of candidate sequences of bases for the defined space using one or more metrics adapted to associate a score or penalty to the candidate sequences of bases; and selecting the candidate sequence leading to a highest score or a lowest penalty as corresponding to the correct sequence for the one or more sample nucleic acids in the defined space. | 03-07-2013 |
20130060483 | DETERMINATION OF COPY NUMBER VARIATIONS USING BINOMIAL PROBABILITY CALCULATIONS - This invention relates to a binomial calculation of copy number of data obtained from a mixed sample having a first source and a second source. | 03-07-2013 |
20130073216 | METHOD, SYSTEM AND APPARATUS TO PREDICT AND/OR RECOGNIZE AND/OR CLASSIFY BIOLOGICAL SEQUENCES - A method, a system and an apparatus for predicting and/or recognizing and/or classifying biological sequences, especially sequence families with poorly conserved binding site recognition motifs, comprising, advantageously, the use of neural network rules; providing enhanced and more accurate results; and preferably used when the biological sequence is a promoter. | 03-21-2013 |
20130073217 | Phased Whole Genome Genetic Risk In A Family Quartet - In an embodiment of the present invention, three novel human reference genome sequences were developed based on the most common population-specific DNA sequence (“major allele”). Methods were developed for their integration into interpretation pipelines for high-throughput whole genome sequencing. | 03-21-2013 |
20130090860 | METHODS, SYSTEMS, AND COMPUTER READABLE MEDIA FOR MAKING BASE CALLS IN NUCLEIC ACID SEQUENCING - A method for nucleic acid sequencing includes: receiving a signal comprising measurements of a parameter measured in response to a plurality of nucleotide flows flowed in a space comprising a sample nucleic acid; normalizing the signal to obtain a normalized signal; adaptively normalizing the normalized signal to obtain an adaptively normalized signal; and predicting a sequence of base calls corresponding to the sample nucleic acid using the adaptively normalized signal. | 04-11-2013 |
20130090861 | INTERNAL SIZING/LANE STANDARD SIGNAL VERIFICATION - A method for verifying an ILS signal for DNA processing includes obtaining the ILS signal, determining acquisition times between peaks of the ILS signal, obtaining acquisition times between peaks in reference ILS information for the ILS signal, and verifying the ILS signal based on the ILS acquisition times and the reference ILS acquisition times. An ILS signal processor ( | 04-11-2013 |
20130096845 | CUMULATIVE DIFFERENTIAL CHEMICAL ASSAY IDENTIFICATION - An apparatus comprising: a value receiver, configured to receive fluorescence values measured during a chemical reaction involving a test sample, each value pertaining to a respective physical parameter value, a difference calculator, configured to calculate differences, each difference being between respective one of the measured fluorescence values and one of reference fluorescence values of a reference sample, each reference fluorescence value pertaining to a respective physical parameter value, a cumulative index calculator, configured to calculate a cumulative index, by selecting a first difference among the calculated differences, and selecting and adding to the first difference differences, each one of the added differences being selected according to a proximity standard applied on each two differences selected in a sequence, the proximity standard being based on proximity of physical parameter values and difference size, and a similarity determiner, configured to determine similarity between the samples, using the calculated cumulative index. | 04-18-2013 |
20130103322 | Method and System for Analyzing Mass Spectrometry Data - Provided is a technique for accurately identifying a peptide even if the peptide cannot be identified by MS/MS ion search or de novo sequencing. The technique uses an MS | 04-25-2013 |
20130110410 | APPARATUS AND METHOD FOR GENERATING NOVEL SEQUENCE IN TARGET GENOME SEQUENCE | 05-02-2013 |
20130124100 | Processing and Analysis of Complex Nucleic Acid Sequence Data - The present invention is directed to logic for analysis of nucleic acid sequence data that employs algorithms that lead to a substantial improvement in sequence accuracy and that can be used to phase sequence variations, e.g., in connection with the use of the long fragment read (LFR) process. | 05-16-2013 |
20130131995 | System And Method to Correct Out of Phase Errors In DNA Sequencing Data by Use of a Recursive Algorithm - An embodiment of a method for correcting an error associated with phasic synchrony of sequence data generated from a population of template molecules is described that comprises the steps of detecting signals generated in response to nucleotide species introduced during a sequencing reaction; generating an observed value for the signal detected from each of the nucleotide species; defining positive incorporation values and negative incorporation values from the observed values using a carry forward value and an incomplete extension value; revising the carry forward value and the incomplete extension value using a noise value that is derived from observed values associated with the negative incorporation values; re-defining the positive incorporation values and the negative incorporation values using the revised carry forward value and the revised incomplete extension value; and repeating the steps of revising and re-defining until convergence of the positive incorporation values and the negative incorporation values. | 05-23-2013 |
20130138358 | ALGORITHMS FOR SEQUENCE DETERMINATION - The present invention is generally directed to powerful and flexible methods and systems for consensus sequence determination from replicate biomolecule sequence data. It is an object of the present invention to improve the accuracy of consensus biomolecule sequence determination from replicate sequence data by providing methods for assimilating replicate sequence into a final consensus sequence more accurately than any one-pass sequence analysis system. | 05-30-2013 |
20130144540 | CONSTRAINED DE NOVO SEQUENCING OF PEPTIDES - A peptide sequencing system derives a peptide sequence from a mass spectrum. The system can receive a description for a peptide sequence constraint, such that the constraint indicates a symbol pattern that is to be present in a peptide sequence derived from the mass spectrum. Then, the system generates a peptide sequence based on the mass spectrum and the constraint, such that the peptide sequence matches the constraint and has a mass that matches the total mass of the peptide as determined from the mass spectrum. | 06-06-2013 |
20130144541 | MASS SPECTROMETRY-CLEAVABLE CROSS-LINKING AGENTS TO FACILITATE STRUCTURAL ANALYSIS OF PROTEINS AND PROTEIN COMPLEXES, AND METHOD OF USING SAME - Novel cross-linking compounds that can be used in mass spectrometry, tandem mass spectrometry, and multi-stage tandem mass spectrometry to facilitate structural analysis of proteins and protein complexes are provided and have the formula: | 06-06-2013 |
20130158884 | METHOD FOR IDENTIFYING NUCLEOTIDE SEQUENCE, METHOD FOR ACQUIRING SECONDARY STRUCTURE OF NUCLEIC ACID MOLECULE, APPARATUS FOR IDENTIFYING NUCLEOTIDE SEQUENCE, APPARATUS FOR ACQUIRING SECONDARY STRUCTURE OF NUCLEIC ACID MOLECULE, PROGRAM FOR IDENTIFYING NUCLEOTIDE SEQUENCE, AND PROGRAM FOR ACQUIRING SECONDARY STRUCTURE OF NUCLEIC ACID MOLECULE - The object of the present invention is to provide a method for identifying a nucleotide sequence necessary for expressing affinity for a target substance with respect to a nucleotide sequence of a nucleic acid molecule such as an aptamer having such affinity for the target substance, based on similarity between nucleotide sequences and an evaluated value of the affinity of the nucleotide sequence, and a method for predicting a secondary structure of the nucleic acid molecule including the identified nucleotide sequence. The method of the present invention includes the steps of extracting a single-stranded region by excluding bases capable of forming a stem structure from the nucleotide sequence of the nucleic acid molecule; and searching for a motif sequence in the single-stranded region, based on an evaluated value of the affinity. | 06-20-2013 |
20130158885 | GENOME SEQUENCE MAPPING DEVICE AND GENOME SEQUENCE MAPPING METHOD THEREOF - Provided are a genome sequence mapping device and a genome sequence mapping method. The genome sequence mapping device may include a controller and a genome sequence analyzer configured to map target sequence data to reference sequence data. The genome sequence analyzer transforms the reference sequence data and the target sequence data into frequency domains to determine a position of the target sequence data to be mapped among the reference sequence data. The genome sequence mapping device calculates a correlation between reference sequence data and target sequence data in a frequency domain to immediately determine whether the reference sequence data and the target sequence data match each other. | 06-20-2013 |
20130166221 | METHOD AND SYSTEM FOR SEQUENCE CORRELATION - A method and system are provided for evaluating the correlation between sequences by entering segments of one sequence in a database and comparing segments of the other sequence with the index values to find correlated segments. The correlated segments are analysed to determine whether the spacing is within a defined range indicating that a correlation threshold has been met. A processing methodology may be employed whereby a coarse potential alignment algorithm is first applied to determine potential alignment at a plurality of potential alignment positions, which are filtered based on alignment scores, and a fine alignment algorithm is then applied. | 06-27-2013 |
20130173177 | NUCLEIC ACID SEQUENCE ANALYSIS - This document provides materials and methods involved in nucleic acid sequence analysis. For example, methods and materials for distinguishing sequencing errors (e.g., sequencing and/or PCR artifacts) from true polymorphic sequence variations (e.g., single-nucleotide polymorphisms, sequence insertions, sequence deletions, or combinations thereof) are provided. In addition, methods and materials for determining homozygosity or heterozygosity are provided. | 07-04-2013 |
20130204537 | Amino Acid Sequence Analyzing Method and Amino Acid Sequence Analyzing Apparatus - The amino acid sequence is deduced by de novo sequencing in a way that prevents the correct amino acid sequence from failing to rank high among the candidates. Amino acid sequence candidates are computed by finding the longest path using a branch and bound method based on the spectrum data on the target peptide and the known amino acid sequence. A tree-structured directed graph is used where amino acid sequences are set as nodes and the peak intensities corresponding to the amino acids are set as branches. In a sequence put at a node in the highest layer, an amino acid is placed at a terminal, and as the layer goes deeper, amino acids are sequentially placed from both terminals toward the center of the sequence. The final score is estimated based on the remaining amino acids, and if the score is small, the search is halted. | 08-08-2013 |
20130245961 | METHODS FOR ANALYZING MASSIVELY PARALLEL SEQUENCING DATA FOR NONINVASIVE PRENATAL DIAGNOSIS - This invention provides several ways of managing GC bias that occurs during sequencing and analysis of genomic DNA. Maternal plasma can be used as a source of fetal DNA for analysis. DNA segments or tags obtained from the plasma can be aligned with a chromosomal region of interest and with an artificial reference chromosome assembled from regions of the genome having matching GC content. This technology can be used, for example, to detect and evaluate aneuploidy and other chromosomal abnormalities. | 09-19-2013 |
20130245962 | ALGORITHMS FOR CLASSIFICATION OF DISEASE SUBTYPES AND FOR PROGNOSIS WITH GENE EXPRESSION PROFILING - Methods are provided for generating a normalized expression signal for microarray data based on a theoretical distribution at the unit level, so that the normalized expression signal for a single microarray is independent of other microarrays. The method typically includes receiving microarray data representing a plurality of probe pairs for a single microarray, determining, for each probe pair, differences between intensities of perfect match (PM) probes and intensities of mismatched (MM) probes, determining a difference signal, D, based on the determined differences, and scaling the difference signal, D, to produce an expression signal, DS. The method also typically includes normalizing the expression signal based on a theoretical distribution at the unit level to produce a normalized expression signal for the single microarray that is independent of other microarrays. | 09-19-2013 |
20130253847 | GENETIC VARIANTS AS MARKERS FOR USE IN DIAGNOSIS, PROGNOSIS AND TREATMENT OF EOSINOPHILIA, ASTHMA, AND MYOCARDIAL INFARCTION - Polymorphic variants (e.g., certain alleles of polymorphic markers) that have been found to be associated with high blood eosinophil counts, conditions causative of eosinophilia (e.g., asthma, myocardial infarction), and/or hypertension are provided herein. Such polymorphic markers are useful for diagnostic purposes, such as in methods of determining a susceptibility, and for prognostic purposes, including methods of predicting prognosis and methods of assessing an individual for probability of a response to a therapeutic agent, as further described herein. Further applications that utilize the polymorphic markers of the invention include screening methods and genotyping methods. The invention furthermore provides related kits, computer-readable media, and apparatus. | 09-26-2013 |
20130261984 | METHODS AND SYSTEMS FOR DETERMINING FETAL CHROMOSOMAL ABNORMALITIES - The present disclosure provides methods and systems for determining the presence or absence of aneuploidy in a fetus. In particular, the present disclosure provides noninvasive methods and systems for detecting the presence of fetal trisomy and other fetal chromosomal anomalies, paternity of a fetus and fetal genotype. | 10-03-2013 |
20130268206 | SEQUENCE ASSEMBLY - The invention relates to assembly of sequence reads. The invention provides a method for identifying a mutation in a nucleic acid involving sequencing nucleic acid to generate a plurality of sequence reads. Reads are assembled to form a contig, which is aligned to a reference. Individual reads are aligned to the contig. Mutations are identified based on the alignments to the reference and to the contig. | 10-10-2013 |
20130268207 | SYSTEMS AND METHODS FOR IDENTIFYING SOMATIC MUTATIONS - Systems and methods for identifying somatic mutations can receive first and second sequence information, determine if a variant present in the first sequence information is also present in the second sequence information, and identify variants present in the first sequence information as somatic mutations when the variant is either not present in the second sequence information or the presence of the variant in the second sequence information is likely due to a sequencing error. | 10-10-2013 |
20130289890 | Rank Normalization for Differential Expression Analysis of Transcriptome Sequencing Data - A computer-implemented method for rank normalization for differential expression analysis of transcriptome sequencing data includes receiving, by a computer, a first dataset comprising transcriptome sequencing data, the first dataset comprising a plurality of genes, and further comprising a respective ranking value associated with each of the plurality of genes; assigning a rank to each of the genes of the plurality of genes based on the ranking value to produce a first rank normalized dataset; determining a change between a first rank of a particular gene in the first rank normalized dataset, and a second rank of the particular gene in a second rank normalized dataset, the second rank normalized dataset being based on a second dataset comprising transcriptome sequencing data; and determining whether the particular gene is differentially expressed between the first dataset and the second dataset based on the determined change in rank. | 10-31-2013 |
20130289891 | Rank Normalization for Differential Expression Analysis of Transcriptome Sequencing Data - A computer system for rank normalization for differential expression analysis of transcriptome sequencing data includes a processor; and a memory comprising a first dataset comprising transcriptome sequencing data, the first dataset comprising a plurality of genes and a respective ranking value associated with each of the plurality of genes, the system configured to perform a method including assigning a rank to each of the genes of the plurality of genes based on the ranking value to produce a first rank normalized dataset; determining a change between a first rank of a particular gene in the first rank normalized dataset, and a second rank of the particular gene in a second rank normalized dataset, the second rank normalized dataset being based on a second dataset comprising transcriptome sequencing data; and determining whether the particular gene is differentially expressed between the first and second datasets based on the determined change in rank. | 10-31-2013 |
20130297221 | Method and System for Accurate Construction Of Long Range Haplotype - In an embodiment of the present invention, a modified version of the PHASE model is implemented that is substantially more accurate than the FastPHASE model. Modifications in an embodiment of the present invention include using a parameterization EM algorithm similar to that of the FastPHASE model, and to perform optimization on haplotypes rather than MCMC sampling. In an embodiment, the imputed haplotypes themselves are used as hidden states in the HMM because this is believed to be important for the PHASE model's accuracy. This increase in accuracy becomes more pronounced with increasing sample size. This difference is attributed to the PHASE model's likelihood which produces long, shared haplotypes between pairs of individuals. | 11-07-2013 |
20130304391 | TRANSMISSION AND COMPRESSION OF GENETIC DATA - A method, computer product and computer system of transmitting a compressed genome of an organism: a computer at a source reading an uncompressed sequence and a reference genome from a repository; the computer comparing nucleotides of the genetic sequence of the organism to nucleotides from a reference genome, to find differences where nucleotides of the genetic sequence of the organism are different from the nucleotides of the reference genome; the computer using the differences to create surprisal data, the surprisal data comprising a starting location of the differences within the reference genome, and the nucleotides from the genetic sequence of the organism which are different from the nucleotides of the reference genome; and the computer transmitting, to a destination, a compressed genome comprising: surprisal data and an indication of the reference genome, discarding sequences of nucleotides that are the same in the sequence of the organism and reference genome. | 11-14-2013 |
20130304392 | METHODS AND PROCESSES FOR NON-INVASIVE ASSESSMENT OF GENETIC VARIATIONS - Provided herein are methods, processes and apparatuses for non-invasive assessment of genetic variations. | 11-14-2013 |
20130311105 | System And Method For Generation And Use Of Optimal Nucleotide Flow Orders - An embodiment of a method for generating a flow order that minimizes the accumulation of phasic synchrony error in sequence data is described that comprises the steps of: (a) generating a plurality of sequential orderings of nucleotide species comprising a k-base length, wherein the sequential orderings define a sequence of introduction of nucleotide species into a sequencing by synthesis reaction environment; (b) simulating acquisition of sequence data from one or more reference genomes using the sequential orderings, wherein the sequence data comprises an accumulation of phasic synchrony error; and (c) selecting one or more of the sequential orderings using a read length parameter and an extension rate parameter. | 11-21-2013 |
20130311106 | Comprehensive Analysis Pipeline for Discovery of Human Genetic Variation - Systems and methods for analyzing genetic sequence data involve: (a) obtaining, by a computer system, genetic sequencing data pertaining to a subject; (b) splitting the genetic sequencing data into a plurality of segments; (c) processing the genetic sequencing data such that intra-segment reads, read pairs with both mates mapped to the same data set, are saved to a respective plurality of individual binary alignment map (BAM) files corresponding to that respective segment; (d) processing the genetic sequencing data such that inter-segment reads, read pairs with both mates mapped to different segments, are saved into at least a second BAM file; and (e) processing at least the first plurality of BAM files along parallel processing paths. The plurality of segments may correspond to any given number of genomic subregions and may be selected based upon the number of processing cores used in the parallel processing. | 11-21-2013 |
20130311107 | PROCESSES FOR CALCULATING PHASED FETAL GENOMIC SEQUENCES - The present invention provides processes for calculating phased genomic sequences of the fetal genome using fetal DNA obtained from a maternal sample. The processes and systems of the present invention utilize novel technological and computational approaches to detect fetal genomic sequences and determine the phased heritable genomic sequences. The invention could be used, e.g., to identify in utero deleterious mutations carried by the parents and inherited by a fetus within a particular heritable genomic region. | 11-21-2013 |
20130317755 | Methods, computer-accessible medium, and systems for score-driven whole-genome shotgun sequence assembly - Exemplary systems, methods and computer-accessible mediums for assembling at least one haplotype or genotype sequence of at least one genome can be provided, which can include, obtaining a plurality of randomly located sequence reads, incrementally generating overlap relations between the randomly located sequence reads using a plurality of overlapper procedures, and generating a layout of some of the randomly located short sequence reads based on a function in combination with constraints based on information associated with the one genome while substantially satisfying the constraints. The score-function can be derived from overlap relations between the randomly located short sequence reads. A search can be performed together with score- and constraint-dependent pruning to determine the layout substantially satisfying the constraints. A part of the genome wide haplotype sequence or the genotype sequence of the genome can be generated based on the overlap relations and the randomly located sequence reads. | 11-28-2013 |
20130325360 | METHODS AND PROCESSES FOR NON-INVASIVE ASSESSMENT OF GENETIC VARIATIONS - Provided herein are methods, processes and apparatuses for non-invasive assessment of genetic variations. | 12-05-2013 |
20130338932 | COMPUTATIONAL METHOD FOR MAPPING PEPTIDES TO PROTEINS USING SEQUENCING DATA - A method for proteomic analysis of a biological sample is disclosed, which includes obtaining peptide sequences of proteins in a target list; and identifying proteins in the biological sample by mapping the obtained peptide sequences on proteins in a proteomic database, wherein the target list is determined using information of RNA transcripts in the biological sample and/or the target list is determined using information of RNA transcripts in the biological sample. The peptide sequences are determined using a mass spectrometer. The mapping is performed on a subset of proteins based on the information of RNA transcripts. | 12-19-2013 |
20130338933 | METHODS AND PROCESSES FOR NON-INVASIVE ASSESSMENT OF GENETIC VARIATIONS - Provided herein are methods, processes and apparatuses for non-invasive assessment of genetic variations. | 12-19-2013 |
20130338934 | SYSTEMS AND METHODS FOR PROCESSING NUCLEIC ACID SEQUENCE DATA - The present disclosure provides systems and methods for nucleic acid sequence analysis. A system for processing raw nucleic acid sequence data from a genomic sequencer comprises a data processing server having a housing that contains one or more processing modules. The one or more processing modules can each comprise an electronic control unit programmed to align nucleic acid sequence data from a genomic sequencing device and perform one or more of variant analysis and structural variant analysis on the nucleic acid sequence data. The system can further comprise a computer server in communication with the processing server. The computer server can be programmed or otherwise configured to process and/or analyze the aligned nucleic acid sequence data. | 12-19-2013 |
20140012513 | POPULATION BASED METHOD OF EVALUATING GENOMIC SEQUENCES - Methods and systems for evaluating genomic sequences are described. The methods include approaches for evaluating the prevalence of genomes in a sample based on the prevalence of segments in the sample, and may additionally rely on the prevalence of segments in reference genomes and an estimated genome population distribution of the sample. | 01-09-2014 |
20140025312 | HIERARCHICAL GENOME ASSEMBLY METHOD USING SINGLE LONG INSERT LIBRARY - The present invention is generally directed to a hierarchical genome assembly process for producing high-quality de novo genome assemblies. The method utilizes a single, long-insert, shotgun DNA library in conjunction with Single Molecule, Real-Time (SMRT®) DNA sequencing, and obviates the need for additional sample preparation and sequencing data sets required for previously described hybrid assembly strategies. Efficient de novo assembly from genomic DNA to a finished genome sequence is demonstrated for several microorganisms using as little as three SMRT® cells, and for bacterial artificial chromosomes (BACs) using sequencing data from just one SMRT® Cell. Part of this new assembly workflow is a new consensus algorithm which takes advantage of SMRT® sequencing primary quality values, to produce a highly accurate de novo genome sequence, exceeding 99.999% (QV 50) accuracy. The methods are typically performed on a computer and comprise an algorithm that constructs sequence alignment graphs from pairwise alignment of sequence reads to a common reference. | 01-23-2014 |
20140052383 | SYSTEMS AND METHODS FOR IDENTIFYING A CONTRIBUTOR'S STR GENOTYPE BASED ON A DNA SAMPLE HAVING MULTIPLE CONTRIBUTORS - Under one aspect of the present invention, a method is provided for analyzing a mixture of DNA from two or more contributors, to identify at least one contributor's STR genotypes at a plurality of STR loci. Possible solutions may be determined independently for each STR locus, each solution including the number of contributors, an STR genotype for each contributor at that locus, an abundance ratio of their respective contributions, and a confidence score. The most likely solutions for the STR locus having the highest confidence score then are used as givens, based upon which the solutions for the other STR loci may be sequentially obtained, in each instance using as givens the most likely solutions for any previously analyzed loci. STR genotypes are output that share as givens the number of contributors and the abundance ratio used in the most likely solution for the last analyzed STR locus. | 02-20-2014 |
20140067280 | Ancestral-Specific Reference Genomes And Uses Thereof - Ancestry has a significant impact on the major and minor alleles found in each nucleotide position within the genome. Due to mechanisms of inheritance, ancestral-specific information contained within the genome is conserved within members of an ancestry. For this reason, individuals within a specific ancestry are more likely to share alleles in their genomes with other members of the same ancestry. Functionally, the combination of alleles at all positions within a group of individuals defines that group as having a common ancestry. Moreover, the aggregation of differences between alleles at all positions distinguishes one ancestry from another. The genomic similarities and differences between ancestries provide a mechanism to generate reference genomes that are specific for each ancestry. Reference genomes that are specific to an ancestry can be used to increase the accuracy of whole genome sequencing, DNA-based diagnostics and therapeutic marker discovery and in a variety of real-world DNA-based applications. | 03-06-2014 |
20140100792 | METHODS AND PROCESSES FOR NON-INVASIVE ASSESSMENT OF GENETIC VARIATIONS - Provided herein are methods, processes and apparatuses for non-invasive assessment of genetic variations. | 04-10-2014 |
20140107937 | SYSTEM AND METHOD FOR GENOTYPE ANALYSIS AND ENHANCED MONTE CARLO SIMULATION METHOD TO ESTIMATE MISCLASSIFICATION RATE IN AUTOMATED GENOTYPING - The present invention relates to methods and systems for the analysis of the dissociation behavior of nucleic acids. The present invention includes methods and systems for analyzing dynamic profiles of genotypes of nucleic acids, including the steps of using a computer, including a processor and a memory, to convert dynamic profiles of known genotypes of a nucleic acid to multi-dimensional data points, wherein the dynamic profiles each comprise measurements of a signal representing a physical change of a nucleic acid containing the known genotype relative to an independent variable; using the computer to reduce the multi-dimensional data points into reduced-dimensional data points; and generating a plot of the reduced-dimensional data points for each genotype. The present invention also relates to methods and systems for calculating error statistics for an assay for identifying a genotype in a biological sample using an enhanced Monte Carlo simulation method to generate a set of N random data points for each known genotype within a class of known genotypes, where each set of N random data points has the same mean data point and covariance matrix as a data set for each of the known genotypes. | 04-17-2014 |
20140107938 | INFORMATION PROCESSING SYSTEM USING NUCLEOTIDE SEQUENCE-RELATED INFORMATION - The present invention provides a highly-safe information processing system that is capable of effectively using nucleotide sequence information differences between individual organisms to offer semantic information useful for each individual organism while properly preventing leakage and illegal use of nucleotide sequence information. | 04-17-2014 |
20140114584 | METHODS AND SYSTEMS FOR IDENTIFYING, FROM READ SYMBOL SEQUENCES, VARIATIONS WITH RESPECT TO A REFERENCE SYMBOL SEQUENCE - The current document is directed to automated methods and processor-controlled systems for assembling short read symbol sequences into longer assembled symbol sequences that are aligned and compared to a reference symbol sequence in order to determine differences between the longer assembled symbol sequences and the reference sequence. These methods and systems are applied to process electronically stored symbol-sequence data. While the symbol-sequence data may represent genetic-code data, the automated methods and processor-controlled systems may be more generally applied to various different symbol-sequence data. In certain implementations, redundancy in read symbol sequences is used to preprocess the read symbol sequences to identify and correct symbol errors. In certain implementations, those corrected read symbol sequences that exactly match subsequences of the reference symbol sequence are identified and removed from subsequent processing steps, to simplify the identification of differences between the longer assembled symbol sequences and the reference sequence. | 04-24-2014 |
20140114585 | SYSTEM AND METHOD FOR PROPAGATING INFORMATION USING MODIFIED NUCLEIC ACIDS - Improvement is effected for a nucleic acid-based molecular computing system that is comprised of (i) a nucleic acid structure, (ii) at least one polynucleotide displacement molecule that can bind with the nucleic acid structure under hybridizing conditions, and (iii) a clashing polynucleotide molecule that competes with the polynucleotide displacement molecule for binding the nucleic acid structure under the hybridizing conditions. The method for such improvement entails incorporating chemical modification that inhibits the binding of the clashing molecule and the nucleic acid structure or facilitates the binding of the displacement molecule and the nucleic acid structure. | 04-24-2014 |
20140121990 | Secure Informatics Infrastructure for Genomic-Enabled Medicine, Social, and Other Applications - A system is disclosed in which human genomes are stored in databases or in a cloud based computer system, which is secure and private and then downloaded to personal devices for possible peer-to-peer interactions for health care applications, as well as for social and other applications. The use of the system is directed to fully sequenced genomes and includes protocols that are constructed to mimic in vitro biological tests to conduct genomic analysis instead of generic computational techniques, which tend to be impractical as they require performance of online computation over the entire genome. Three specific examples of protocols or techniques for privacy-preserving testing on fully sequenced genomes included are: 1) privacy-preserving genetic paternity testing, 2) privacy-preserving personalized medicine testing, and 3) privacy-preserving genetic compatibility testing. | 05-01-2014 |
20140121991 | SYSTEM AND METHOD FOR ALIGNING GENOME SEQUENCE - A system and a method for aligning a genome sequence are provided. The system for aligning a genome sequence includes a fragment sequence production unit configured to produce a plurality of fragment sequences from a read, a filtering unit configured to constitute a candidate fragment sequence group including only the fragment sequences mapped to a reference sequence among the plurality of produced fragment sequences, a mapping number calculation unit configured to divide the reference sequence into a plurality of sections and calculate total mapping numbers of the candidate fragment sequences for the sections, and an alignment unit configured to select the sections in which the calculated total mapping numbers are greater than or equal to a reference number and perform global alignment on the read with respect to the selected sections. | 05-01-2014 |
20140121992 | SYSTEM AND METHOD FOR ALIGNING GENOME SEQUENCE - A system and a method for aligning a genome sequence are provided. The system for aligning a genome sequence includes a mapping position calculation unit configured to select one of a plurality of seeds produced from a read and calculate a mapping position of the selected seed in a target sequence, and a global alignment unit configured to calculate a repeat judgment region for the selected seed from the calculated mapping position, determine whether global alignment is pre-performed in the calculated repeat judgment region and perform global alignment on the read at the calculated mapping position when the global alignment is not pre-performed. | 05-01-2014 |
20140121993 | CONSIDERATION OF EVIDENCE - In many situations, particularly in forensic science, there is a need to consider one piece of evidence against one or more other pieces of evidence. For instance, it may be desirable to compare a sample collected from a crime scene with a sample collected from a person, with a view to linking the two by comparing the characteristics of their DNA, particularly by expressing the strength or likelihood of the comparison made, a so-called likelihood ratio. The method provides a more accurate or robust method for establishing likelihood ratios through the definitions of the likelihood ratios used and the manner in which the probability distribution functions for use in establishing likelihood ratios are obtained. The methods provide due consideration of stutter and/or dropout of alleles in DNA analysis, as well as taking into consideration one or more peak imbalance effects, such as degradation, amplification efficiency, sampling effects and the like. | 05-01-2014 |
20140136120 | DIRECT IDENTIFICATION AND MEASUREMENT OF RELATIVE POPULATIONS OF MICROORGANISMS WITH DIRECT DNA SEQUENCING AND PROBABILISTIC METHODS - The present invention relates to systems and methods capable of characterizing populations of organisms within a sample. The characterization may utilize probabilistic matching of short strings of sequencing information to identify genomes from a reference genomic database to which the short strings belong. The characterization may include identification of the microbial community of the sample to the species and/or sub-species and/or strain level with their relative concentrations or abundance. In addition, the system and methods may enable rapid identification of organisms including both pathogens and commensals in clinical samples, and the identification may be achieved by a comparison of many (e.g., hundreds to millions) metagenomic fragments, which have been captured from a sample and sequenced, to many (e.g., millions or billions) of archived sequence information of genomes (i.e., reference genomic databases). | 05-15-2014 |
20140136121 | METHOD FOR ASSEMBLING SEQUENCED SEGMENTS - The present invention relates to a method for optimizing the assembled result of sequencing data using a genetic map. In particular, provided in the present invention is a new method for assembling individual sequenced segments, which comprises the step of constructing the genetic map with a genetic marker. Furthermore, also provided in the present invention is a method for assembling the individual sequenced segments into a genome sequence, such as a chromosome sequence. | 05-15-2014 |
20140149048 | FINGERPRINT FOR CELL IDENTITY AND PLURIPOTENCY - A method for determining a replication timing footprint comprises the following steps: (a) selecting a set of chosen regions of the replication timing profile of a chromosome of an individual, (b) choosing a set of selected regions from the set of chosen regions to form a set of selected regions and a set of unused regions, (c) conducting an iterative algorithm on the set of selected regions until a domain number for the set of selected regions has decreased to a predetermined minimum, (d) determining a replication timing footprint based on the set of selected regions after step (c) has been conducted, and (e) displaying the replication timing footprint to a user. | 05-29-2014 |
20140149049 | ACCURATE AND FAST MAPPING OF READS TO GENOME - Accurate and fast mapping of sequencing reads obtained from a targeted sequencing procedure can be provided. Once a target region is selected, alternate regions of the genome that are sufficiently similar to the target region can be identified. If a sequencing read is more similar to the target region than to an alternate region, then the read can be determined as aligning to the target region. The reads aligning to the target region can then be analyzed to determine whether a mutation exists in the target region. Accordingly, a sequencing read can be compared to the target region and the corresponding alternate regions, and not to the entire genome, thereby providing computational efficiency. | 05-29-2014 |
20140163900 | ANALYZING SHORT TANDEM REPEATS FROM HIGH THROUGHPUT SEQUENCING DATA FOR GENETIC APPLICATIONS - Provided herein are methods and related compositions using short tandem repeat (STR) regions for genetic applications. | 06-12-2014 |
20140172320 | STABLE GENES IN COMPARATIVE TRANSCRIPTOMICS - Various embodiments perform stable gene analysis of transcriptome sequencing data. In one embodiment, a plurality of datasets each including transcriptome sequencing data are received by a processor. Each of the plurality of datasets includes a plurality of genes and a respective ranking value for each of the plurality of genes. A plurality of rank normalized input datasets is generated based on assigning, for each of the plurality of datasets, a rank to each of the plurality of genes. One or more longest increasing subsequence (LIS) of ranks are identified between each pair of the plurality of rank normalized input datasets. A set of stable genes from the plurality of genes is identified based on each of the one or more LIS of ranks across the plurality of rank normalized input datasets. | 06-19-2014 |
20140207386 | METHOD AND APPARATUS OF ALIGNING A READ SEQUENCE - Provided are a method of aligning a read sequence relative to a reference sequence using a seed and a read-sequence aligning apparatus using the same. The apparatus may include a seed generating unit producing seeds from read sequences, a representative seed selecting unit grouping the seeds into a plurality of seed clusters and selecting representative seeds from the plurality of seed clusters, a seed aligning unit aligning the representative seeds relative to a reference sequence, and a read-sequence aligning unit aligning the read sequences relative to the reference sequence, with reference to the alignment result of the representative seeds. The read sequence alignment may be performed using the relationships between seeds, and thus, the sequencing may be performed with improved efficiency. | 07-24-2014 |
20140249764 | Method for Assembly of Nucleic Acid Sequence Data - The present invention relates to a method for assembly of nucleic acid sequence data comprising nucleic acid fragment reads into (a) contiguous nucleotide sequence segment(s), comprising the steps of: (a) obtaining a plurality of nucleic acid sequence data from a plurality of nucleic acid fragment reads; (b) aligning said plurality of nucleic acid sequence data to a reference sequence; (c) detecting one or more gaps or regions of non-assembly, or non-matching with the reference sequence in the alignment output of step (b); (d) performing de novo sequence assembly of nucleic acid sequence data mapping to said gaps or regions of non-assembly; and (e) combining the alignment output of step (b) and the assembly output of step (d) in order to obtain (a) contiguous nucleotide sequence segment(s). In addition, a corresponding program element or computer program for assembly of nucleic acid sequence data and a sequence assembly system for transforming nucleic acid sequence data comprising nucleic acid fragment reads into (a) contiguous nucleotide sequence segment(s) is provided. | 09-04-2014 |
20140257710 | METHOD AND SYSTEM FOR ANALYZING THE TAXONOMIC COMPOSITION OF A METAGENOME IN A SAMPLE - Provided herein are methods and systems for rapid identification and quantification of the taxonomic composition of a microbial metagenome in a sample, based on compositional spectra analysis. The methods and systems are useful in diagnostic and analytic methods in the clinic and in the field. | 09-11-2014 |
20140288851 | METHOD FOR SEQUENCE RECOMBINATION AND APPARATUS FOR NGS - Provided are a sequence recombination method for NGS and an apparatus thereof. According to an embodiment of the present invention, a fragment sequence having a length of n is divided into six fragments of an equal sequence length, and then three fragments located in a preceding part of the fragment sequence among the six fragments of an equal sequence length are used as a seed to search for a mapping position candidate by searching a hash table which is generated on the basis of a reference sequence. | 09-25-2014 |
20140309944 | BIOINFORMATICS SYSTEMS, APPARATUSES, AND METHODS EXECUTED ON AN INTEGRATED CIRCUIT PROCESSING PLATFORM - A system, method and apparatus for executing a sequence analysis pipeline on genetic sequence data includes an integrated circuit formed of a set of hardwired digital logic circuits that are interconnected by physical electrical interconnects. One of the physical electrical interconnects forms an input to the integrated circuit connected with an electronic data source for receiving reads of genomic data. The hardwired digital logic circuits are arranged as a set of processing engines, each processing engine being formed of a subset of the hardwired digital logic circuits to perform one or more steps in the sequence analysis pipeline on the reads of genomic data. Each subset of the hardwired digital logic circuits is formed in a wired configuration to perform the one or more steps in the sequence analysis pipeline. | 10-16-2014 |
20140309945 | GENOME SEQUENCE ALIGNMENT APPARATUS AND METHOD - Provided are a sequence alignment apparatus and method for searching a reference sequence for a candidate position matching with a fragment that is a portion of a read sequence, and mapping the reference sequence and the read sequence to each other based on the candidate position. Accordingly, it is possible to form an alignment permitting all variations and errors that may exist in a read sequence, to search the entire area of a read sequence for variations and errors, and to form an alignment with less computation without permitting backtracking, unlike existing sequence alignment technology. | 10-16-2014 |
20140316716 | Methods, Systems, and Computer Readable Media for Improving Base Calling Accuracy - A method includes exposing template polynucleotide strands, sequencing primers, and polymerase to flows of nucleotide species; obtaining a series of measured intensity values and randomly selecting a training subset therefrom; generating series of base calls using a base caller and aligning the series of base calls to a reference genome or sequence using an aligner; determining intensity value thresholds and parameters of a linear transformation corresponding to different homopolymer lengths and nucleotide species; generating series of base calls corresponding to the series of measured intensity values using at least some of the parameters of a linear transformation; and recalibrating the series of base calls corresponding to the plurality of series of measured intensity values using at least some of the intensity value thresholds. | 10-23-2014 |
20140336949 | METHOD, APPARATUS, AND KIT FOR ANALYZING GENES - The conventional DNA sequencers for analyzing nucleotide sequences have no function of detecting minute polymorphisms. Any cross talk in the wavelengths of fluorescent substances for labeled DNA fragments hinders detection of weak-strength signals at the same coordinates, making it difficult to detect genetic mutations with small existence ratios, for example, in somatic mutations. Disclosed is a gene analyzer composed of a plurality of flow channels, each of which is used to electrophorese nucleic acid samples labeled for each of nucleotide types; a chromatogram data creating part for detecting a labeled signal for each of the nucleotide types for each of the nucleic acid samples in each of the plurality of flow channels and creating chromatogram data on signal strengths detected; a peak detection part for the peak values in the chromatogram data for each of the nucleotide types; and a data integrating part for integrating a plurality of chromatogram data. | 11-13-2014 |
20140336950 | CLUSTERING COPY-NUMBER VALUES FOR SEGMENTS OF GENOMIC DATA - Clustering methods are disclosed including a hidden Markov model (HMM) based clustering algorithm having particular applicability for identifying tumor subtypes using array comparative genomic hybridization (aCGH) DNA copy number data. In one embodiment, clusters of tumor samples are modeled with a mixture of HMMs where each HMM fits a cluster of samples. With respect to this embodiment, a computationally efficient and fast clustering algorithm takes only a computational time of O(n), has less than half the error rate of non-negative matrix factorization (NMF) clustering, and can locate the optimal number of groups automatically (e.g., as applied to a data set including glioma aCGH data). | 11-13-2014 |
20140343868 | METHOD AND SYSTEM FOR GENOME IDENTIFICATION - The present invention belongs to the field of genomics and nucleic acid sequencing. It involves a novel method of sequencing biological material and real-time probabilistic matching of short strings of sequencing information to identify all species present in said biological material. It is related to real-time probabilistic matching of sequence information, and more particularly to comparing short strings of a plurality of sequences of single molecule nucleic acids, whether amplified or unamplified, whether chemically synthesized or physically interrogated, as fast as the sequence information is generated and in parallel with continuous sequence information generation or collection. | 11-20-2014 |
20140350866 | Method of Gap Closing in Nucleotide Sequence and Apparatus Thereof - Provided is a method of gap closing in nucleotide sequence. The nucleic acid sequence comprises a first contig at one end of a gap in an unassembled region, and a second contig at the other end of the gap in the unassembled region. The method comprises: selecting reads having an overlap with one end of the first contig close to the gap as a set of reads for gap closing; selecting reads having a shortest overlap with the first contig in the set of reads for gap closing as a candidate read; determining whether reads having an overlapping length with the first contig shorter than an overlapping length between the candidate read and the first contig present in the set of reads for gap closing, and determining whether reads having no overlapping relationship with the candidate read present in the set of reads for gap closing; obtaining a result of presenting an extension conflict, and determining an unconfident candidate read, if reads having an overlapping length with the first contig shorter than an overlapping length between the candidate read and the first contig present in the set of reads for gap closing, reads having no overlapping relationship with the candidate read present in the set of reads for gap closing, or both reads having an overlapping length with the first contig shorter than an overlapping length between the candidate read and the first contig, and reads having no overlapping relationship with the candidate read present in the set of reads for gap closing; reselecting the candidate read until obtaining a confident candidate read, if the candidate read is unconfident; connecting the confident candidate read to the first contig, to form a new first contig; determining whether one end of the new first contig close to the gap has an overlap with one end of the second contig close to the gap; performing the step of selecting the set of reads for gap closing on the basis of the new first contig, if the one end of the new first contig close to the gap has no overlap with the one end of the second contig close to the gap, wherein the first contig in the step of selecting the set of reads for gap closing is replaced with the new first contig; connecting the new first contig to the second contig to complete gap closing, if one end of the new first contig close to the gap has an overlap with one end of the second contig close to the gap. | 11-27-2014 |
20140372046 | APPARATUS AND METHOD FOR MANAGING GENETIC INFORMATION - A genetic information managing apparatus compares a base sequence of a subject with a standard base sequence to determine a longest common base sequence, and arranges the base sequence of the subject on the standard base sequence in accordance with the longest common base sequence. The genetic information managing apparatus divides the arranged base sequence into a plurality of base code groups, allocates a plurality of identifiers to the plurality of base code groups, respectively, and stores the plurality of base code groups to a plurality of storing units in association with corresponding identifiers. | 12-18-2014 |
20150032385 | Methods of Analyzing Massively Parallel Sequencing Data - In at least one illustrative embodiment, a method may comprise selecting a first plurality of text strings that each represent a nucleotide sequence that was read by a massively parallel sequencing instrument, where the nucleotide sequences represented by the selected first plurality of text strings each correspond to a first target locus, comparing the selected first plurality of text strings to one another to determine an abundance count for each unique text string included in the selected first plurality of text strings, identifying a first number of unique text strings included in the selected first plurality of text strings as representing noise responses, and determining a method detection limit as a function of the abundance counts for the first number of unique text strings identified as representing noise responses. | 01-29-2015 |
20150057946 | METHODS AND SYSTEMS FOR ALIGNING SEQUENCES - The invention includes methods for aligning reads (e.g., nucleic acid reads, amino acid reads) to a reference sequence construct, methods for building the reference sequence construct, and systems that use the alignment methods and constructs to produce sequences. The method is scalable, and can be used to align millions of reads to a construct thousands of bases or amino acids long. The invention additionally includes methods for identifying a disease or a genotype based upon alignment of nucleic acid reads to a location in the construct. | 02-26-2015 |
20150057947 | LONG FRAGMENT DE NOVO ASSEMBLY USING SHORT READS - Techniques perform de novo assembly. The assembly can use labels that indicate origins of the nucleic acid molecules. For example, a representative set of labels identified from initial reads that overlap with a seed can be used. Mate pair information can be used. A sequence read that aligns to an end of a contig can lead to using the other sequence read of a mate pair, and the other sequence read can be used to determine which branch to use to extend, e.g., in an external cloud or helper contig. A kmer index can include labels indicating an origin of each of the nucleic acid molecules that include each kmer, memory addresses of the reads that correspond to each kmer in the index, and a position in each of the mate pairs that includes the kmer. Haploid seeds can also be determined using polymorphic loci identified in a population. | 02-26-2015 |
20150066383 | COLLAPSIBLE MODULAR GENOMIC PIPELINE - The invention generally relates to tools for genomic analysis and particularly to a pipeline editor that can turn pipelines into standalone tools for use in other pipelines. The invention provides systems and methods for genomic analysis in which individual analytical tools can be arranged into analytical pipelines that can then be “collapsed” into standalone tools, which themselves can be put into the pool of individual tools for use in further building of pipelines. Aspects of the invention provide a system that includes a server computer system operable to present to a user a plurality of genomic tools, receive input from the user arranging the tools into a pipeline, create a new tool that includes the pipeline, and offer the new tool along with the plurality of genomic tools. | 03-05-2015 |
20150066384 | SYSTEM AND METHOD FOR ALIGNING GENOME SEQUENCE - Provided are a system and method for sequence alignment. The system for sequence alignment includes an exact matching module configured to perform exact matching of an input read to a reference sequence, a secondary matching module configured to map the read to the reference sequence in consideration of mismatches between the read and the reference sequence when the read does not exactly match the reference sequence, and a global alignment module configured to perform global alignment operation of the read with the reference sequence when the read is not mapped to the reference sequence by the secondary matching module. | 03-05-2015 |
20150066385 | SEQUENCING METHODS - The invention described herein solves challenges in providing a proficient, rapid and meaningful analysis of sequencing data. Methods and computer program products of the invention allow for a system to receive, analyze, and display sequencing data in real-time. The invention provides solutions to several difficulties encountered in assembling short sequencing-reads, and by doing so the invention improves the worth and significance of sequencing data. | 03-05-2015 |
20150073724 | METHOD FOR FINDING VARIANTS FROM TARGETED SEQUENCING PANELS - Provided herein is a method for identifying a sequence variant in an enriched sample. In certain embodiments, this method may comprise: (a) obtaining: (i) a plurality of sequence reads from a sample that has been enriched for a genomic region and (ii) a reference sequence for the genomic region; (b) assembling the sequence reads to obtain a plurality of discrete sequence assemblies that correspond to potential variants; (c) determining which of the potential variants are true and which are artifacts by examining the sequence reads that make up each of the discrete sequence assemblies; (d) optionally determining whether each of the true potential variants contains a mutation that is known to be associated with the reference sequence; and (e) outputting a report indicating whether the sample comprises a sequence variant. | 03-12-2015 |
20150088432 | SYSTEMS, METHODS, AND COMPOSITIONS FOR VIRAL-ASSOCIATED TUMORS - Contemplated systems and methods employ chimeric reference sequences that include a plurality of viral genome sequences to identify/quantify integration and co-amplification events. Most typically, the viral genome sequences are organized in the chimeric reference sequences as single chromosomes and the chimeric reference sequences are in BAM format. | 03-26-2015 |
20150094963 | BAMBAM: PARALLEL COMPARATIVE ANALYSIS OF HIGH-THROUGHPUT SEQUENCING DATA - A differential sequence object is constructed on the basis of alignment of sub-strings via incremental synchronization of sequence strings using known positions of the sub-strings relative to a reference genome sequence. An output file is then generated that comprises only relevant changes with respect to the reference genome. | 04-02-2015 |
20150100247 | METHODS AND SYSTEMS FOR MODELING PHASING EFFECTS IN SEQUENCING USING TERMINATION CHEMISTRY - A method for nucleic acid sequencing includes receiving observed or measured nucleic acid sequencing data from a sequencing instrument that receives and processes a sample nucleic acid in a termination sequencing-by-synthesis process. The method also includes generating a set of candidate sequences of bases for the observed or measured nucleic acid sequencing data by determining a predicted signal for candidate sequences using a simulation framework. The simulation framework incorporates an estimated carry forward rate (CFR), an estimated incomplete extension rate (IER), an estimated droop rate (DR), an estimated reactivated molecules rate (RMR), and an estimated termination failure rate (TFR), the RMR being greater than or equal to zero and the TFR being less than one. The method also includes identifying, from the set of candidate sequences of bases, one candidate sequence leading to optimization of a solver function as corresponding to the sequence for the sample nucleic acid. | 04-09-2015 |
20150120209 | METHOD, COMPUTER-ACCESSIBLE MEDIUM, AND SYSTEMS FOR GENERATING A GENOME WIDE HAPLOTYPE SEQUENCE - Methods, computer-accessible medium, and systems for generating a genome wide probe map and/or a genome wide haplotype sequence are provided. In particular, a genome wide probe map can be generated by obtaining a plurality of detectable oligonucleotide probes hybridized to at least one double stranded nucleic acid molecule cleaved with at least one restriction enzyme, and detecting the location of the detectable oligonucleotide probes. For example, genome wide haplotype sequence can be generated by analyzing at least one genome wide restriction map in conjunction with at least one genome wide probe map to determine distances between restriction sites of the genome wide restriction map(s) and locations of detectable oligonucleotide probes of the genome wide probe map(s) and defining a consensus map indicating restriction sites based on the genome wide restriction map(s) and/or locations of detectable oligonucleotide probes based on each of the genome wide probe map(s). | 04-30-2015 |
20150120210 | METHOD AND DEVICE FOR LABELLING SINGLE NUCLEOTIDE POLYMORPHISM SITES IN GENOME - Disclosed are a method and a device for labelling single nucleotide polymorphism sites in a genome. The above-mentioned method comprises: the single-end RAD sequences from the genomes of two individuals are obtained; the single-end RAD sequences are filtered to remove unqualified sequences; the sequencing depth of the sequences from the genomes of two individuals is aligned in pairs and without gaps to determine the SNP sites. | 04-30-2015 |
20150142334 | SYSTEM, METHOD AND COMPUTER-ACCESSIBLE MEDIUM FOR GENETIC BASE CALLING AND MAPPING - RNA sequencing techniques provide rapid base-calling and resequencing for improved bio-informatics. Exemplary embodiments of computer-implemented systems and methods can be provided, as applied to RNA sequence interpretation, enumeration and classification, etc., by defining a map of the transcripts encoded in a genome, and measuring their relative abundances. | 05-21-2015 |
20150302144 | HIERARCHICAL GENOME ASSEMBLY METHOD USING SINGLE LONG INSERT LIBRARY - The present invention is generally directed to a hierarchical genome assembly process for producing high-quality de novo genome assemblies. The method utilizes a single, long-insert, shotgun DNA library in conjunction with Single Molecule, Real-Time (SMRT®) DNA sequencing, and obviates the need for additional sample preparation and sequencing data sets required for previously described hybrid assembly strategies. Efficient de novo assembly from genomic DNA to a finished genome sequence is demonstrated for several microorganisms using as little as three SMRT® cells, and for bacterial artificial chromosomes (BACs) using sequencing data from just one SMRT® Cell. Part of this new assembly workflow is a new consensus algorithm which takes advantage of SMRT® sequencing primary quality values, to produce a highly accurate de novo genome sequence, exceeding 99.999% (QV 50) accuracy. The methods are typically performed on a computer and comprise an algorithm that constructs sequence alignment graphs from pairwise alignment of sequence reads to a common reference. | 10-22-2015 |
20150324519 | RARE VARIANT CALLS IN ULTRA-DEEP SEQUENCING - Accurate variant calling methods for low frequency variants are provided. Sequence reads of targeted ultra-deep sequencing are received and aligned to a reference sequence. Read depths and variant counts for variants of the same class at each location where the reference allele exists on the reference sequence are determined for each sample-amplicon. Based on the read depths and variant counts, a probability value indicating the confidence level that a specific variant at a specific location is a true positive is calculated using methods such as a statistical model based method and a localized method using a reference sample. The probability value is then compared with a threshold level to determine whether the detected variants are true positives. | 11-12-2015 |
20150331994 | VISUALIZATION OF NUCLEIC ACID SEQUENCES - A system and process are provided for analyzing nucleic acid data. An example process can include receiving nucleic acid data including a set of sequence data. The nucleotides of the sequence data can be assigned numerical values. Using these assigned values, partial sums can be calculated for each position in the set of sequence data. The resulting sums can then be displayed in the form of charts or maps, a so-called sequence spectrum, to make it easy to navigate and analyze the whole data set. In some examples, patterns or similar/identical sequence segments can be identified within a single set of sequence data or between different sets of sequence data in the spectrum. | 11-19-2015 |
20150347676 | CHROMOSOME REPRESENTATION DETERMINATIONS - Technology described herein pertains in part to diagnostic tests that make use of sequence reads generated by a sequencing process. In some embodiments, a component used to generate a chromosome representation can be based on counts of sequence reads not aligned to a reference genome. | 12-03-2015 |
20150356239 | METHODS, MODELS, SYSTEMS, AND APPARATUS FOR IDENTIFYING TARGET SEQUENCES FOR CAS ENZYMES OR CRISPR-CAS SYSTEMS FOR TARGET SEQUENCES AND CONVEYING RESULTS THEREOF - Disclosed are thermodynamic and multiplication methods concerning CRISPR-Cas systems, and apparatus therefor. | 12-10-2015 |
20150356243 | SYSTEMS AND METHODS FOR IDENTIFYING POLYMORPHISMS - The present invention relates to processes, systems and methods for estimating the effects of genetic polymorphisms associated with traits and diseases, based on distributions of observed effects across multiple loci. In particular, the present invention provides systems and methods for analyzing genetic variant data including estimating the proportion of polymorphisms truly associated with the phenotypes of interest, the probability that a given polymorphism has a true association with the phenotypes of interest, and the predicted effect size of a given genetic variant in independent de novo samples given effect size distributions in observed samples. The present invention also relates to using the described systems and methods and use of genetic polymorphisms across a plurality of loci and a plurality of phenotypes to diagnose, characterize, optimize treatment and predict diseases and traits. | 12-10-2015 |
20150370963 | Phased Whole Genome Genetic Risk In A Family Quartet - In an embodiment of the present invention, three novel human reference genome sequences were developed based on the most common population-specific DNA sequence (“major allele”). Methods were developed for their integration into interpretation pipelines for high-throughput whole genome sequencing. | 12-24-2015 |
20150376703 | METHOD AND SYSTEM TO PREDICT RESPONSE TO PAIN TREATMENTS - The present invention relates to systems and methods for predicting an individual's likely response to a pain medication comprising genotyping genetic variations in an individual to determine the individual's propensity for metabolizing a pain medication and likely response to a medication, and preferably diverse reactions to a medication. In particular, the invention comprises analyzing a biological sample provided by an individual, typically a patient or an individual diagnosed with a particular disorder, determining the individual's likely response to a particular treatment, more specifically a pain medication, and thereafter displaying, or further, recommending a plan of action or inaction. In particular, the present invention provides a grading method and system to profile an individual's response to one or more pain medications. | 12-31-2015 |
20150376710 | METHODS OF EVALUATING RESPONSE TO CANCER THERAPY - A method of evaluating a cancer patient comprising evaluating gene expression levels in a patient sample, calculating a predictor score using the gene expression levels, and assessing the likelihood of a therapeutic outcome using the predictor score is disclosed. | 12-31-2015 |
20150379192 | Processing and Analysis of Complex Nucleic Acid Sequence Data - The present invention is directed to logic for analysis of nucleic acid sequence data that employs algorithms that lead to a substantial improvement in sequence accuracy and that can be used to phase sequence variations, e.g., in connection with the use of the long fragment read (LFR) process. | 12-31-2015 |
20150379194 | METHODS FOR ACCURATE SEQUENCE DATA AND MODIFIED BASE POSITION DETERMINATION - Disclosed herein are methods of determining the sequence and/or positions of modified bases in a nucleic acid sample present in a circular molecule with a nucleic acid insert of known sequence comprising obtaining sequence data of at least two insert-sample units. In some embodiments, the methods comprise obtaining sequence data using circular pair-locked molecules. In some embodiments, the methods comprise calculating scores of sequences of the nucleic acid inserts by comparing the sequences to the known sequence of the nucleic acid insert, and accepting or rejecting repeats of the sequence of the nucleic acid sample according to the scores of one or both of the sequences of the inserts immediately upstream or downstream of the repeats of the sequence of the nucleic acid sample. | 12-31-2015 |
20150379196 | PROCESSES AND SYSTEMS FOR NUCLEIC ACID SEQUENCE ASSEMBLY - Methods, processes, and particularly computer implemented processes and computer program products are provided for use in the analysis of genetic sequence data. The processes and products are employed in the assembly of shorter nucleic acid sequence data into longer linked and preferably contiguous genetic constructs, including large contigs, chromosomes and whole genomes. | 12-31-2015 |
20160004816 | Spatial Arithmetic Method of Sequence Alignment - A computer system aligns two or more sequences with each other to identify similarities and differences between the aligned sequences. The sequences may, for example, represent proteins. The system performs alignment quickly and accurately by representing the sequences as perceptual information and conceptual information having mappings between them in a knowledgebase, and then performing the alignment based on the representations of the sequences in the knowledgebase. The alignment may be performed in polynomial time, regardless of the number of sequences that are aligned. | 01-07-2016 |
20160004817 | SYSTEMS AND METHODS FOR IDENTIFYING SIGNIFICANTLY MUTATED GENES - The invention relates to a method for identifying significantly mutated genes that includes determining a false discovery rate for each of the genes. The method may include estimating local mutation rates for the genes by converting each covariate to a centered and normalized score. The method may also include estimating a local background mutation rate for each of the genes, which may be estimated from silent and/or noncoding mutations of each of the genes itself. In some embodiments, the local background mutation rate may be estimated additionally from one or more neighbor genes in a covariate space. Related systems, techniques, and articles are also encompassed by the present invention. | 01-07-2016 |
20160012181 | METHOD FOR ASSIGNING A QUALITATIVE IMPORTANCE OF RELEVANT GENETIC PHENOTYPES TO THE USE OF SPECIFIC DRUGS FOR INDIVIDUAL PATIENTS BASED ON GENETIC TEST RESULTS | 01-14-2016 |
20160019338 | DETECTING FETAL SUB-CHROMOSOMAL ANEUPLOIDIES - Disclosed are methods for determining copy number variation (CNV) known or suspected to be associated with a variety of medical conditions, including syndromes related to CNV of subchromosomal regions. In some embodiments, methods are provided for determining CNV of fetuses using maternal samples comprising maternal and fetal cell free DNA. Some embodiments disclosed herein provide methods to improve the sensitivity and/or specificity of sequence data analysis by removing within-sample GC-content bias. In some embodiments, removal of within-sample GC-content bias is based on sequence data corrected for systematic variation common across unaffected training samples. In some embodiments, syndrome related biases in sample data are also removed to increase signal to noise ratio. Also disclosed are systems for evaluation of CNV of sequences of interest. | 01-21-2016 |
20160019339 | BIOINFORMATICS TOOLS, SYSTEMS AND METHODS FOR SEQUENCE ASSEMBLY - A method for genetic sequence assembly may include identifying a reference string of nucleobases digitally expressed in a first Mercator data structure having k rows by four columns, wherein k is a number of nucleobases in said string and each column attribute corresponds to a nucleobase residue by type; creating a first plurality of reference signatures of a predetermined length for the reference string; receiving an input string of nucleobases to be sequenced; creating a digital expression of the input string in a second Mercator data structure having k rows by four columns; creating a second plurality of reference signatures of the predetermined length for the input string; comparing each of the second plurality of reference signatures with each of the first plurality of reference signatures to identify possible matches of the second plurality of reference signatures with the first plurality of reference signatures; and identifying a match between at least one of the second plurality of reference signatures with at least one of the first plurality of reference signatures. | 01-21-2016 |
20160019340 | SYSTEMS AND METHODS FOR DETECTING STRUCTURAL VARIANTS - Systems and methods for identifying gene fusions can obtain sequencing information for a plurality of amplicons from a nucleic acid sample. The sequencing information can include a plurality of reads that are initially partially mapped to a reference sequence. Fragments may be generated by splitting the partially mapped reads into mapped and unmapped fragments, and the fragments may be remapped to the reference sequence. Gene fusions can be identified based on reads where the first fragment maps to a first gene and the second fragment maps to a second gene. | 01-21-2016 |
20160019341 | METHODS AND SYSTEMS FOR GENOMIC ANALYSIS - A computer-implemented method for processing and/or analyzing nucleic acid sequencing data comprises receiving a first data input and a second data input. The first data input comprises untargeted sequencing data generated from a first nucleic acid sample obtained from a subject. The second data input comprises target-specific sequencing data generated from a second nucleic acid sample obtained from the subject. Next, with the aid of a computer processor, the first data input and the second data input are combined to produce a combined data set. Next, an output derived from the combined data set is generated. The output is indicative of the presence or absence of one or more polymorphisms of the first nucleic acid sample and/or the second nucleic acid sample. | 01-21-2016 |
20160026752 | Systems And Methods For Comprehensive Analysis Of Molecular Profiles Across Multiple Tumor And Germline Exomes - Omics patient data are analyzed using sequences or diff objects of tumor and matched normal tissue to identify patient and disease specific mutations, using transcriptomic data to identify expression levels of the mutated genes, and pathway analysis based on the so obtained omic data to identify specific pathway characteristics for the diseased tissue. Most notably, many different tumors have shared pathway characteristics, and identification of a pathway characteristic of a tumor may thus indicate effective treatment options ordinarily not considered when tumor analysis is based on anatomical tumor type only. | 01-28-2016 |
20160026756 | METHOD AND APPARATUS FOR SEPARATING QUALITY LEVELS IN SEQUENCE DATA AND SEQUENCING LONGER READS - Sequencing reads from a measurement system may be classified based on quality scores associated with the measurement system, and corresponding error characteristics may be provided. The sequencing reads may correspond to at least one of deoxyribonucleic acid (DNA), complementary DNA (cDNA), or ribonucleic acid (RNA). | 01-28-2016 |
20160034639 | RAPID ISOLATION OF MONOCLONAL ANTIBODIES FROM ANIMALS - Methods and compositions for identification of candidate antigen-specific variable regions as well as generation of antibodies or antigen-binding fragments that could have desired antigen specificity are provided. For example, in certain aspects methods for determining amino acid sequences of serum antibody CDR and abundancy level are described. In some aspects, methods for determining nucleic acid sequences of antibody variable region sequences and frequency are provided. Furthermore, the invention provides methods for identification and generation of antibody or antigen-binding fragments that comprise highly-represented CDR. | 02-04-2016 |
20160078170 | Bioinformatics Systems, Apparatuses, And Methods Executed On An Integrated Circuit Processing Platform - A system, method and apparatus for executing a sequence analysis pipeline on genetic sequence data includes a structured ASIC formed of a set of hardwired digital logic circuits that are interconnected by physical electrical interconnects. One of the physical electrical interconnects forms an input to the structured ASIC connected with an electronic data source for receiving reads of genomic data. The hardwired digital logic circuits are arranged as a set of processing engines, each processing engine being formed of a subset of the hardwired digital logic circuits to perform one or more steps in the sequence analysis pipeline on the reads of genomic data. Each subset of the hardwired digital logic circuits is formed in a wired configuration to perform the one or more steps in the sequence analysis pipeline. | 03-17-2016 |
20160085911 | METHOD AND DEVICE FOR DETECTING CHROMOSOMAL STRUCTURAL ABNORMALITIES - A method and a device for detecting chromosomal structural abnormalities are provided. The method includes acquiring a whole genome sequencing result of a target individual, that is, determining multiple pairs of Reads located at two ends of chromosome fragments; aligning the sequencing result with a reference sequence to obtain an abnormal match set, which includes Read pairs that have two Read sequences matched respectively to different chromosomes of the reference sequence; clustering the Read sequences in the abnormal match set based on the positions matched thereto; and filtering the resultant clusters by using, for example, preset requirements associated with compactness and others, and obtaining the filtered clusters for determining the occurrence of a translocation-type chromosomal structural abnormality. | 03-24-2016 |
20160098517 | APPARATUS AND METHOD FOR DETECTING INTERNAL TANDEM DUPLICATION - According to an illustrative embodiment, provided herein is an internal tandem duplication (ITD) detection apparatus which includes a breakpoint identification unit for identifying two breakpoints in a reference genome sequence based on a plurality of reads, each of which partially matches the reference genome sequence; and an ITD detection unit for generating an ITD reference sequence which includes a base sequence portion spanning between the two breakpoints in the reference genome sequence and a sequential repetition of the base sequence portion. | 04-07-2016 |
20160102348 | METHODS, SYSTEMS, AND COMPUTER-READABLE MEDIA FOR ACCELERATED BASE CALLING - Embodiments disclose methods, systems, and computer-readable media for accelerated base calling of sequencing data. These methods may be adapted to accelerate sequence determination for data arising from a variety of different nucleic acid sequencing platforms. In various embodiments, configurable logic circuits such as FPGAs and GPUs may be adapted to perform raw signal processing, basecalling, and/or sequence determination operations providing further enhancements to the sequence analysis methods. | 04-14-2016 |
20160110497 | METHODS AND PROCESSES FOR NON-INVASIVE ASSESSMENT OF GENETIC VARIATIONS - Provided herein are methods, processes, apparatuses and machines for non-invasive assessment of genetic variations. | 04-21-2016 |
20160125131 | POOL TEST RESULT VERIFICATION METHOD AND APPARATUS - Provided are a method and apparatus for verifying a pool test result. The method includes receiving pool test result data obtained by performing a pool test on a plurality of pools configured based on a two-dimensional (2D) matrix, the pool test result data including allele frequencies of the plurality of pools; extracting a pool-specific variant from the matrix using the allele frequencies; determining whether there is an intersecting pool among pools intersecting, in the matrix, a pool showing the pool-specific variant; and determining whether the pool test result data is erroneous based on results of the determining whether there is an intersecting pool. | 05-05-2016 |
20160154929 | NEXT GENERATION SEQUENCING ANALYSIS SYSTEM AND NEXT GENERATION SEQUENCING ANALYSIS METHOD THEREOF | 06-02-2016 |
20160162635 | METHOD AND SYSTEM FOR DETERMINING A BACTERIAL RESISTANCE TO AN ANTIBIOTIC DRUG - The invention relates to a method, a databank, a system and a computer program product for determining a bacterial resistance to an antibiotic drug. A data bank is provided which comprises a plurality of bacterial reference nucleic acid sequences, wherein at least some reference nucleic acid sequences are associated with a respective antibiotic drug resistance information. A comparison unit is used for comparing said bacterial nucleic acid sequence information obtained from a sample with said plurality of bacterial reference nucleic acid sequences. | 06-09-2016 |
20160171151 | METHOD FOR DETERMINING READ ERROR IN NUCLEOTIDE SEQUENCE | 06-16-2016 |
20160171153 | Bioinformatics Systems, Apparatuses, And Methods Executed On An Integrated Circuit Processing Platform | 06-16-2016 |
20160188796 | METHODS OF CHARACTERIZING, DETERMINING SIMILARITY, PREDICTING CORRELATION BETWEEN AND REPRESENTING SEQUENCES AND SYSTEMS AND INDICATORS THEREFOR - A computer implemented method for characterizing one or more sequences by generating index values representing portions of the sequences and finding characterizing index values based on a comparison of the index values. The index values may be obtained by applying one or more masks over each sequence. The modified masks may have associated weightings and index values obtained using modified masks may be retained in the index only if the weightings are above a threshold value. Characterising index values may also be assessed for their degree of uniqueness. Characterizing indexes may be used for predicting correlation between a sample sequence and one or more reference sequences. Biological monitoring systems utilising the characterizing index values are also disclosed. A biological indicator may be generated using one or more characterizing index values obtained by the above method and be used to produce an indicator that undergoes a property change in the presence of the one or more sequences. | 06-30-2016 |
20160203257 | SYSTEMS AND METHODS FOR IDENTIFYING STRUCTURALLY OR FUNCTIONALLY SIGNIFICANT AMINO ACID SEQUENCES | 07-14-2016 |
20160203258 | DNA SEQUENCING USING MOSFET TRANSISTORS | 07-14-2016 |
20160203261 | SYSTEMS AND METHODS FOR IDENTIFYING STRUCTURALLY OR FUNCTIONALLY SIGNIFICANT NUCLEOTIDE SEQUENCES | 07-14-2016 |
20160253453 | Parameterizing Cell-to-Cell Regulatory Heterogeneities via Stochastic Transcriptional Profiles | 09-01-2016 |
20160378915 | Systems and Methods for Multi-Scale, Annotation-Independent Detection of Functionally-Diverse Units of Recurrent Genomic Alteration - The functional interpretation of somatic mutations remains a persistent challenge in the interpretation of human genome data. Systems and methods for detecting significantly mutated regions (SMRs) in the human genome permit the discovery and identification of multi-scale cancer-driving mutational hotspot clusters. Systems and methods of SMR detection reveal differentially mutated genetic regions across various cancer types. SMR detection and annotation reveals a diverse spectrum of functional elements in the genome, including at least single amino acids, complete coding exons and protein domains, microRNAs, transcription factor binding sites, splice sites, and untranslated regions. Systems and methods of SMR detection optionally including protein structure mapping uncover recurrent somatic alterations within proteins. Systems and methods of SMR detection optionally including differential expression analysis reveal previously unappreciated connections between recurrent and somatic mutations and molecular signatures. | 12-29-2016 |
20170235874 | METHODS AND SYSTEMS FOR DETECTING MINOR VARIANTS IN A SAMPLE OF GENETIC MATERIAL | 08-17-2017 |
20190147983 | SYSTEMS AND METHODS FOR DE NOVO PEPTIDE SEQUENCING FROM DATA-INDEPENDENT ACQUISITION USING DEEP LEARNING | 05-16-2019 |