Inna Dubchak

The amphioxus genome and the evolution of the chordate karyotype (2008)

Putnam, Nicholas H., Butts, Thomas, Ferrier, David E. K., Furlong, Rebecca F., Hellsten, Uffe, Kawashima, Takeshi, ...

Lancelets ('amphioxus') are the modern survivors of an ancient chordate lineage, with a fossil record dating back to the Cambrian period. Here we describe the structure and gene content of the highly...

Multiple Whole Genome Alignments and Novel Biomedical Applications at the VISTA Portal (2008)

Brudno, Michael, Poliakov, Alexander, Minovitsky, Simon, Ratnere, Igor, Dubchak, Inna

The VISTA portal for comparative genomics is designed to give biomedical scientists a unified set of tools to lead them from the raw DNA sequences through the alignment and annotation to the...

Short sequence motifs, overrepresented in mammalian conserved non-coding sequences (2007)

Minovitsky, Simon, Stegmaier, Philip, Kel, Alexander, Kondrashov, Alexey S, Dubchak, Inna

Background: A substantial fraction of non-coding DNA sequences of multicellular eukaryotes is under selective constraint. In particular, ~5% of the human genome consists of conserved...

Extensive parallelism in protein evolution (2007)

Bazykin, Georgii A, Kondrashov, Fyodor A, Brudno, Michael, Poliakov, Alexander, Dubchak, Inna, Kondrashov, Alexey S

Background: Independently evolving lineages mostly accumulate different changes, which leads to their gradual divergence. However, parallel accumulation of identical changes is also common,...

TreeQ-VISTA: An Interactive Tree Visualization Tool with Functional Annotation Query Capabilities (2007)

Gu, Shengyin, Anderson, Iain, Kunin, Victor, Cipriano, Michael, Minovitsky, Simon, Weber, Gunther, ...

Summary: We describe a general multiplatform exploratory tool called TreeQ-Vista, designed for presenting functional annotations in a phylogenetic context. Traits, such as phenotypic and genomic...

Short sequence motifs, overrepresented in mammalian conserved non-coding sequences (2007)

Minovitsky, Simon, Stegmaier, Philip, Kel, Alexander, Kondrashov, Alexey S., Dubchak, Inna

Background: A substantial fraction of non-coding DNA sequences of multicellular eukaryotes is under selective constraint. In particular, ~5 percent of the human genome consists of conserved...

VISTA Enhancer Browser--A Database of Tissue-Specific Human Enhancers (2006)

Visel, Axel, Minovitsky, Simon, Dubchak, Inna, Pennacchio, Len A.

Despite the known existence of distant-acting cis-regulatory elements in the human genome, only a small fraction of these elements has been identified and experimentally characterized in vivo. This...

RegTransBase - A Database Of Regulatory Sequences and Interactions in a Wide Range of Prokaryotic Genomes (2006)

Kazakov, Alexei E., Cipriano, Michael J., Novichkov, Pavel S., Minovitsky, Simon, Vinogradov, Dmitry V., Arkin, Adam, ...

RegTransBase, a manually curated database of regulatory interactions in prokaryotes, captures the knowledge in published scientific literature using a controlled vocabulary. Although a number of...

In Vivo Enhancer Analysis Chromosome 16 Conserved Noncoding Sequences (2006)

Pennacchio, Len A., Ahituv, Nadav, Moses, Alan M., Nobrega, Marcelo, Prabhakar, Shyam, Shoukry, Malak, ...

The identification of enhancers with predicted specificities in vertebrate genomes remains a significant challenge that is hampered by a lack of experimentally validated training sets. In this study,...

Variation in conserved non-coding sequences on chromosome 5q and susceptibility to asthma and atopy (2005)

Donfack, Joseph, Schneider, Daniel H, Tan, Zheng, Kurz, Thorsten, Dubchak, Inna, Frazer, Kelly A, ...

Background: Evolutionarily conserved sequences likely have biological function. Methods: To determine whether variation in conserved sequences in non-coding DNA contributes to risk for human...

Conservation patterns in different functional sequence categories of divergent Drosophila species (2005)

Papatsenko, Dmitri, Kislyuk, Andrey, Levine, Michael, Dubchak, Inna

We have explored the distributions of fully conserved ungapped blocks in genome-wide pairwise alignments of recently completed species of Drosophila: D. yakuba, D. ananassae, D. pseudoobscura, D. virilis...

Variation in conserved non-coding sequences on chromosome 5q and susceptibility to asthma and atopy (2005)

Donfack, Joseph, Schneider, Daniel H., Tan, Zheng, Kurz, Thorsten, Dubchak, Inna, Frazer, Kelly A., ...

Background: Evolutionarily conserved sequences likely have biological function. Methods: To determine whether variation in conserved sequences in non-coding DNA contributes to risk for human disease,...

Reconstruction of regulatory and metabolic pathways in metal-reducing δ-proteobacteria (2004)

Rodionov, Dmitry A, Dubchak, Inna, Arkin, Adam, Alm, Eric, Gelfand, Mikhail S

Background: Relatively little is known about the genetic basis for the unique physiology of metal-reducing genera in the delta subgroup of the proteobacteria. The recent availability of...

Reconstruction Of Regulatory And Metabolic Pathways In Metal-Reducing delta-Proteobacteria (2004)

Rodionov, Dmitry A., Dubchak, Inna, Arkin, Adam, Alm, Eric, Gelfand, Mikhail S.

Relatively little is known about the genetic basis for the unique physiology of metal-reducing genera in the delta subgroup of the proteobacteria. The recent availability of complete finished or...

Integrative Annotation of 21,037 Human Genes Validated by Full-Length cDNA Clones (2004)

Tadashi Imanishi, Takeshi Itoh, Yutaka Suzuki, Claire O'Donovan, Satoshi Fukuchi, Kanako O. Koyanagi, ...

An international team has systematically validated and annotated just over 21,000 human genes using full-length cDNA, thereby providing a valuable new resource for the human genetics community.

Integrative Annotation of 21,037 Human Genes Validated by Full-Length cDNA Clones (2004)

Tadashi Imanishi, Takeshi Itoh, Yutaka Suzuki, Claire O'Donovan, Satoshi Fukuchi, Kanako O. Koyanagi, ...

The human genome sequence defines our inherent biological potential; the realization of the biology encoded therein requires knowledge of the function of each gene. Currently, our knowledge in this...

Phylo-VISTA: An Interactive Visualization Tool for Multiple DNA Sequence Alignments (2004)

Shah, Nameeta, Couronne, Olivier, Pennacchio, Len A., Brudno, Michael, Batzoglou, Serafim, Bethel, E. Wes, ...

We have developed Phylo-VISTA (Shah et al., 2003), an interactive software tool for analyzing multiple alignments by visualizing a similarity measure for DNA sequences of multiple species. The...

Comparative genome sequencing of Drosophila pseudoobscura: Chromosomal, gene and cis-element evolution (2004)

Richards, Stephen, Liu, Yue, Bettencourt, Brian R., Hradecky, Pavel, Letovsky, Stan, Nielsen, Rasmus, ...

The genome sequence of a second fruit fly, D. pseudoobscura, presents an opportunity for comparative analysis of a primary model organism D. melanogaster. The vast majority of Drosophila genes have...

VISTA - computational tools for comparative genomics (2004)

Frazer, Kelly A., Pachter, Lior, Poliakov, Alexander, Rubin, Edward M., Dubchak, Inna

Comparison of DNA sequences from different species is a fundamental method for identifying functional elements in genomes. Here we describe the VISTA family of tools created to assist biologists in...

Multi-species sequence comparison: the next frontier in genome annotation (2003)

Dubchak, Inna, Frazer, Kelly

Multi-species comparisons of DNA sequences are more powerful for discovering functional sequences than pairwise DNA sequence comparisons. Most current computational tools have been designed...

Automatic Discovery of Sub-molecular Sequence Domains in Multi-aligned Sequences: A Dynamic Programming Algorithm for Multiple Alignment Segmentation (2003)

Eric Poe Xing, Denise M. Wolf, Inna Dubchak, Sylvia Spengler, Manfred Zorn, Ilya Muchnik

this paper was obtained from Ribosomal Database Project (RDP, release 7.0) (Maidak et al., 1999) by choosing a subset of 417 sequences out of the complete multi-alignment of 2055 eukaryotic small...

Phylo-VISTA: An interactive visualization tool for multiple DNA sequence alignments (2003)

Shah, Nameeta, Couronne, Olivier, Pennacchio, Len A., Brudno, Michael, Batzoglou, Serafim, Bethel, E. Wes, ...

Motivation. The power of multi-sequence comparison for biological discovery is well established and sequence data from a growing list of organisms is becoming available. Thus, a need exists for...

Strategies and tools for whole genome alignments (2002)

Couronne, Olivier, Poliakov, Alexander, Bray, Nicolas, Ishkhanov, Tigran, Ryaboy, Dmitriy, Rubin, Edward, ...

The availability of the assembled mouse genome makes possible, for the first time, an alignment and comparison of two large vertebrate genomes. We have investigated different strategies of alignment...

Analysis of ribosomal RNA sequences by combinatorial clustering (2002)

Poe Xing, Casimir Kulikowski, Ilya Muchnik, Inna Dubchak, Denise M Wolf, Manfred Zorn

We present an analysis of multi-aligned eukaryotic and procaryotic small subunit rRNA sequences using a novel segmentation and clustering procedure capable of extracting subsets of sequences that...

Bioinformatics (2001)

Inna Dubchak

Motivation: Protein fold recognition is an important approach to structure discovery without relying on sequence similarity. We study this approach with new multi-class classification methods and...

Analysis of ribosomal RNA sequences by combinatorial clustering (2001)

Poe Xing, Casimir Kulikowski, Ilya Muchnik, Inna Dubchak, Denise M Wolf, Manfred Zorn

We present an analysis of multi-aligned eukaryotic and procaryotic small subunit rRNA sequences using a novel segmentation and clustering procedure capable of extracting subsets of sequences that...

Relation between Protein Structure, Sequence Homology and Composition of Amino Acids (1995)

Eddy N. Mayoraz, Inna Dubchak, Ilya Muchnik

A method of quantitative comparison of two classification rules applied to the protein folding problem is presented. Classification of proteins based on sequence homology and based on amino acid...

Relation between Protein Structure, Sequence Homology and Composition of Amino Acids (1995)

Eddy Mayoraz, Inna Dubchak, Ilya Muchnik

A method of quantitative comparison of two classification rules applied to the protein folding problem is presented. Classification of proteins based on sequence homology and based on amino acid...

Computational analysis of candidate intron regulatory elements for tissue-specific alternative pre-mRNA splicing

Brudno, Michael, Gelfand, Mikhail S., Spengler, Sylvia, Zorn, Manfred, Dubchak, Inna, Conboy, John G.

Alternative pre-mRNA splicing is a major cellular process by which functionally diverse proteins can be generated from the primary transcript of a single gene, often in tissue-specific patterns. The...

A computational approach to identify genes for functional RNAs in genomic sequences

Carter, Richard J., Dubchak, Inna, Holbrook, Stephen R.

Currently there is no successful computational approach for identification of genes encoding novel functional RNAs (fRNAs) in genomic sequences. We have developed a machine learning approach using...

rVista for Comparative Sequence-Based Discovery of Functional Transcription Factor Binding Sites

Loots, Gabriela G., Ovcharenko, Ivan, Pachter, Lior, Dubchak, Inna, Rubin, Edward M.

Identifying transcriptional regulatory elements represents a significant challenge in annotating the genomes of higher vertebrates. We have developed a computational tool, rVISTA, for high-throughput...

Active Conservation of Noncoding Sequences Revealed by Three-Way Species Comparisons

Dubchak, Inna, Brudno, Michael, Loots, Gabriela G., Pachter, Lior, Mayor, Chris, Rubin, Edward M., ...

Human and mouse genomic sequence comparisons are being increasingly used to search for evolutionarily conserved gene regulatory elements. Large-scale human–mouse DNA comparison studies have...

Multi-species sequence comparison: the next frontier in genome annotation

Dubchak, Inna, Frazer, Kelly

Most current computational tools have been designed for pairwise comparisons of DNA sequences, and efficient extension of these tools to multiple species will require knowledge of the ideal...

Characterization of Evolutionary Rates and Constraints in Three Mammalian Genomes

Cooper, Gregory M., Brudno, Michael, Stone, Eric A., Dubchak, Inna, Batzoglou, Serafim, Sidow, Arend

We present an analysis of rates and patterns of microevolutionary phenomena that have shaped the human, mouse, and rat genomes since their last common ancestor. We find evidence for a shift in the...

Automated Whole-Genome Multiple Alignment of Rat, Mouse, and Human

Brudno, Michael, Poliakov, Alexander, Salamov, Asaf, Cooper, Gregory M., Sidow, Arend, Rubin, Edward M., ...

We have built a whole-genome multiple alignment of the three currently available mammalian genomes using a fully automated pipeline that combines the local/global approach of the Berkeley Genome...

Integrative Annotation of 21,037 Human Genes Validated by Full-Length cDNA Clones

Imanishi, Tadashi, Itoh, Takeshi, Suzuki, Yutaka, O'Donovan, Claire, Fukuchi, Satoshi, Koyanagi, Kanako O, ...

The human genome sequence defines our inherent biological potential; the realization of the biology encoded therein requires knowledge of the function of each gene. Currently, our knowledge in this...

Strategies and Tools for Whole-Genome Alignments

Couronne, Olivier, Poliakov, Alexander, Bray, Nicolas, Ishkhanov, Tigran, Ryaboy, Dmitriy, Rubin, Edward, ...

The availability of the assembled mouse genome makes possible, for the first time, an alignment and comparison of two large vertebrate genomes. We investigated different strategies of alignment for...

AVID: A Global Alignment Program

Bray, Nick, Dubchak, Inna, Pachter, Lior

In this paper we describe a new global alignment method called AVID. The method is designed to be fast, memory efficient, and practical for sequence alignments of large genomic regions up to...

Cross-Species Sequence Comparisons: A Review of Methods and Available Resources

Frazer, Kelly A., Elnitski, Laura, Church, Deanna M., Dubchak, Inna, Hardison, Ross C.

With the availability of whole-genome sequences for an increasing number of species, we are now faced with the challenge of decoding the information contained within these DNA sequences. Comparative...

VISTA: computational tools for comparative genomics

Frazer, Kelly A., Pachter, Lior, Poliakov, Alexander, Rubin, Edward M., Dubchak, Inna

Comparison of DNA sequences from different species is a fundamental method for identifying functional elements in genomes. Here, we describe the VISTA family of tools created to assist biologists in...

Reconstruction of regulatory and metabolic pathways in metal-reducing δ-proteobacteria

Rodionov, Dmitry A, Dubchak, Inna, Arkin, Adam, Alm, Eric, Gelfand, Mikhail S

A study of the genetic and regulatory factors in several biosynthesis, metal ion homeostasis, stress response, and energy metabolism pathways suggests that phylogenetically diverse δ-proteobacteria...

The splicing regulatory element, UGCAUG, is phylogenetically and spatially conserved in introns that flank tissue-specific alternative exons

Minovitsky, Simon, Gee, Sherry L., Schokrpur, Shiruyeh, Dubchak, Inna, Conboy, John G.

Previous studies have identified UGCAUG as an intron splicing enhancer that is frequently located adjacent to tissue-specific alternative exons in the human genome. Here, we show that UGCAUG is...

Gene expression patterns define key transcriptional events in cell-cycle regulation by cAMP and protein kinase A

Zambon, Alexander C., Zhang, Lingzhi, Minovitsky, Simon, Kanter, Joan R., Prabhakar, Shyam, Salomonis, Nathan, ...

Although a substantial number of hormones and drugs increase cellular cAMP levels, the global impact of cAMP and its major effector mechanism, protein kinase A (PKA), on gene expression is not known....

The integrated microbial genomes (IMG) system

Markowitz, Victor M., Korzeniewski, Frank, Palaniappan, Krishna, Szeto, Ernest, Werner, Greg, Padki, Anu, ...

The integrated microbial genomes (IMG) system is a new data management and analysis platform for microbial genomes provided by the Joint Genome Institute (JGI). IMG contains both draft and complete...

Comparative genome sequencing of Drosophila pseudoobscura: Chromosomal, gene, and cis-element evolution

Richards, Stephen, Liu, Yue, Bettencourt, Brian R., Hradecky, Pavel, Letovsky, Stan, Nielsen, Rasmus, ...

We have sequenced the genome of a second Drosophila species, Drosophila pseudoobscura, and compared this to the genome sequence of Drosophila melanogaster, a primary model organism. Throughout...
