David Haussler

Publication List Details

Period

1982 - 2007

Number

211

Comparative Genomics Search for Losses of Long-Established Genes on the Human Lineage (2007)

Jingchun Zhu, J. Zachary Sanborn, Mark Diekhans, Craig B. Lowe, Tom H. Pringle, David Haussler

Taking advantage of the complete genome sequences of several mammals, we developed a novel method to detect losses of well-established genes in the human genome through syntenic mapping of gene...

Detecting Coevolution in and among Protein Domains (2007)

Chen-Hsiang Yeang, David Haussler

Correlated changes of nucleic or amino acids have provided strong information about the structures and interactions of molecules. Despite the rich literature in coevolutionary sequence analysis,...

Comparative Genomics Search for Losses of Long-Established Genes on the Human Lineage (2007)

Jingchun Zhu, J. Zachary Sanborn, Mark Diekhans, Craig B Lowe, Tom Pringle, David Haussler

It is intuitive to think that changes leading to increased complexity, adaptation, and intelligence are achieved by the gain and improvement of genetic components such as genes and regulatory...

Detecting the Coevolution in and among Protein Domains (2007)

Chen-Hsiang Yeang, David Haussler

Correlated changes of nucleic or amino acids have provided strong information about the structures and interactions of molecules. Despite the rich literature in coevolutionary sequence analysis,...

Forces Shaping the Fastest Evolving Regions in the Human Genome (2006)

Katherine S. Pollard, Sofie R. Salama, Bryan King, Andrew D. Kern, Tim Dreszer, Sol Katzman, ...

Comparative genomics allows us to search the human genome for segments that were extensively changed in the last ~5 million years since divergence from our common ancestor with chimpanzee, but are...

Forces Shaping the Fastest Evolving Regions in the Human Genome (2006)

Katherine S. Pollard, Sofie R. Salama, Bryan King, Andrew Kern, Tim Dreszer, Sol Katzman, ...

Comparative genomics allows us to search the human genome for segments that were extensively changed in the last ~5 million years since divergence from our common ancestor with chimpanzee, but are...

Identification and classification of conserved RNA secondary structures in the human genome (2006)

Jakob Skou Pedersen, Gill Bejerano, Adam Siepel, Kate Rosenbloom, Kerstin Lindblad-Toh, Eric S. Lander, ...

The discoveries of microRNAs and riboswitches, among others, have shown functional RNAs to be biologically more important and genomically more prevalent than previously anticipated. We have developed...

Identification and Classification of Conserved RNA Secondary Structures in the Human Genome (2006)

Jakob Skou Pedersen, Gill Bejerano, Adam Siepel, Kate Rosenbloom, Kerstin Lindblad-Toh, Eric S. Lander, ...

The discoveries of microRNAs and riboswitches, among others, have shown functional RNAs to be biologically more important and genomically more prevalent than previously anticipated. We have developed...

Identification and Classification of Conserved RNA Secondary Structures in the Human Genome (2006)

Jakob Skou Pedersen, Gill Bejerano, Adam Siepel, Kate R. Rosenbloom, Kerstin Lindblad-Toh, Eric S Lander, ...

The discovery of, e.g., microRNAs and riboswitches has shown functional RNAs to be biologically more important and genomically more prevalent than previously anticipated. We have developed a general...

Detecting the dependent evolution of biosequences (2006)

Jeremy Darot, Chen-Hsiang H. Yeang, David Haussler

A probabilistic graphical model is developed in order to detect the dependent evolution between different sites in biological sequences. Given a multiple sequence alignment for each molecule of...

Unusual Intron Conservation near Tissue-Regulated Exons Found by Splicing Microarrays (2006)

Charles W. Sugnet, Karpagam Srinivasan, Tyson A. Clark, Georgeann O'Brien, Melissa S. Cline, Hui Wang, ...

Alternative splicing contributes to both gene regulation and protein diversity. To discover broad relationships between regulation of alternative splicing and sequence conservation, we applied a...

Unusual Intron Conservation Near Tissue-regulated Exons Found by Splicing Microarrays (2005)

Charles W Sugnet, Karpagam Srinivasan, Tyson A Clark, Georgeann O'Brien, Melissa S Cline, Hui Wang, ...

Alternative splicing contributes to both gene regulation and protein diversity. To discover broad relationships between regulation of alternative splicing and sequence conservation, we applied a...

An Enhancer Near ISL1 and an Ultraconserved Exon of PCBP2 are Derived from a Retroposon (2005)

Gill Bejerano, Craig Lowe, Nadav Ahituv, Bryan King, Adam Siepel, Sofie Salama, ...

Hundreds of highly conserved distal cis-regulatory elements have been characterized to date in vertebrate genomes1. Many thousands more are predicted based on comparative genomics2,3. Yet, in stark...

The structure of a rigorously conserved RNA element within the SARS virus genome (2005)

Michael P. Robertson, Haller Igel, Robert Baertsch, David Haussler, Manuel Ares, William G. Scott

We have solved the three-dimensional crystal structure of the stem-loop II motif (s2m) RNA element of the SARS virus genome to 2.7-Angstrom resolution. SARS and related coronaviruses and...

The Structure of a Rigorously Conserved RNA Element within the SARS Virus Genome (2005)

Michael P. Robertson, Haller Igel, Robert Baertsch, David Haussler, Manuel Ares, William G. Scott

The SARS RNA genome contains a unique structure that resembles a portion of ribosomal RNA; this may allow the virus to hijack its host's protein synthesis machinery.

The Structure of a Rigorously Conserved RNA Element within the SARS Virus Genome (2005)

Michael P. Robertson, Haller Igel, Robert Baertsch, David Haussler, Manuel Ares Jr., William G. Scott

We have solved the three-dimensional crystal structure of the stem-loop II motif (s2m) RNA element of the SARS virus genome to 2.7-Å resolution. SARS and related coronaviruses and astroviruses all...

Reconstructing large regions of an ancestral mammalian genome in silico (2004)

M. Blanchette, E. D. Green, W. Miller, David Haussler

It is believed that most modern mammalian lineages arose from a series of rapid speciation events near the Cretaceous-Tertiary boundary. It is shown that such a phylogeny makes the common ancestral...

Hotspots of mammalian chromosomal evolution (2004)

Jeffrey A. Bailey, Robert Baertsch, W. Kent, David Haussler, Evan E. Eichler

Background: Chromosomal evolution is thought to occur through a random process of breakage and rearrangement that leads to karyotype differences and disruption of gene order. With the...

Hotspots of mammalian chromosomal evolution (2004)

Jeffrey A. Bailey, Robert Baertsch, W. James Kent, David Haussler, Evan E. Eichler

BACKGROUND: Chromosomal evolution is thought to occur through a random process of breakage and rearrangement that leads to karyotype differences and disruption of gene order. With the availability of...

Classifying G-protein coupled receptors with support vector machines (2001)

Rachel Karchin, Kevin Karplus, David Haussler

Motivation: The enormous amount of protein sequence data uncovered by genome research has increased the demand for computer software that can automate the recognition of new proteins. We discuss the...

Selective Sampling Using the Query by Committee Algorithm (2001)

Yoav Freund, Eli Shamir, David Haussler

We analyze the "query by committee" algorithm, a method for filtering informative queries from a random stream of inputs. We show that if the two-member committee algorithm achieves information...

Scale-sensitive Dimensions, Uniform Convergence, and Learnability (2001)

Noga Alon, Shai Ben-David, David Haussler

Learnability in Valiant's PAC learning model has been shown to be strongly related to the existence of uniform laws of large numbers. These laws define a distribution-free convergence property of...

Characterizations of Learnability for Classes of {0, . . . , n}-valued Functions (2001)

Shai Ben-David, David Haussler, Philip M. Long

We investigate the PAC learnability of classes of {0, ..., n}-valued functions (n < ∞). For n = 1 it is known that the finiteness of the Vapnik-Chervonenkis dimension is necessary and sufficient...

Using the Fisher kernel method to detect remote protein homologies (2001)

Tommi Jaakkola, Mark Diekhans, David Haussler

A new method, called the Fisher kernel method, for detecting remote protein homologies is introduced and shown to perform well in classifying protein domains by SCOP superfamily. The method is a...

A Discriminative Framework for Detecting Remote Protein Homologies (2001)

Tommi Jaakkola, Mark Diekhans, David Haussler

A new method for detecting remote protein homologies is introduced and shown to perform well in classifying protein domains by SCOP superfamily. The method is a variant of support vector machines...

Convolution Kernels on Discrete Structures UCSC-CRL-99-10 (2001)

David Haussler

We introduce a new method of constructing kernels on sets whose elements are discrete structures like strings, trees and graphs. The method can be applied iteratively to build a kernel on an infinite...

A General Lower Bound on the Number (2001)

Andrzej Ehrenfeucht, David Haussler, Michael Kearns

We prove a lower bound of Ω((1/ε) ln(1/δ) + VCdim(C)/ε) on the number of random examples required for distribution-free learning of a concept class C, where VCdim(C) is the Vapnik-Chervonenkis dimension and ε...

Support Vector Machine Classification and Validation of Cancer Tissue Samples Using Microarray Expression Data (2001)

Terrence S. Furey, Nello Cristianini, David W. Bednarski, David Haussler

Motivation: DNA microarray experiments generating thousands of gene expression measurements are being used to gather information from tissue and cell samples about gene expression differences that will...

Support Vector Machine Classification of Microarray Gene Expression Data (2000)

William Noble Grundy, David Lin, Nello Cristianini, Charles Sugnet, David Haussler

We introduce a new method of functionally classifying genes using gene expression data from DNA microarray hybridization experiments. The method is based on the theory of support vector machines...

Dirichlet Mixtures: A Method for Improving Detection of Weak but Significant Protein Sequence Homology (2000)

Kimmen Sjolander, Kevin Karplus, Michael Brown, Richard Hughey, Anders Krogh, I. Saira Mian, ...

This paper presents the mathematical foundations of Dirichlet mixtures, which have been used to improve database search results for homologous sequences, when a variable number of sequences from a...

Support Vector Machine Classification and Validation of Cancer Tissue Samples Using Microarray Expression Data (2000)

Terrence S. Furey, Nello Cristianini, David W. Bednarski, David Haussler

DNA microarray experiments generating thousands of gene expression measurements are being used to gather information from tissue and cell samples about gene expression differences that will be useful...

Knowledge-based Analysis of Microarray Gene Expression Data By Using Support Vector Machines (2000)

William Noble Grundy, David Lin, Nello Cristianini, Charles Walsh Sugnet, Terrence S. Furey, ...

We introduce a method of functionally classifying genes by using gene expression data from DNA microarray hybridization experiments. The method is based on the theory of support vector machines...

KDD for Science Data Analysis: Issues and Examples (2000)

Usama Fayyad, David Haussler, Paul Stolorz

The analysis of the massive data sets collected by scientific instruments demands automation as a pre-requisite to analysis. There is an urgent need to create an intermediate level at which...

A Discriminative Framework for Detecting Remote Protein Homologies (2000)

Tommi Jaakkola, Mark Diekhans, David Haussler

A new method for detecting remote protein homologies is introduced and shown to perform well in classifying protein domains by SCOP superfamily. The method is a variant of support vector machines...

A Hidden Markov Model that finds genes in (2000)

Anders Krogh, I. Saira Mian, David Haussler

A hidden Markov model (HMM) has been developed to find protein coding genes in E. coli DNA using E. coli genome DNA sequence from the EcoSeq6 database maintained by Kenn Rudd. This HMM includes...

Convolution Kernels on Discrete Structures (1999)

David Haussler

We introduce a new method of constructing kernels on sets whose elements are discrete structures like strings, trees and graphs. The method can be applied iteratively to build a kernel on an infinite...

Support Vector Machine Classification of Microarray Gene Expression Data (1999)

William Noble Grundy, David Lin, Nello Cristianini, Charles Sugnet, David Haussler

We introduce a new method of functionally classifying genes using gene expression data from DNA microarray hybridization experiments. The method is based on the theory of support vector machines...

Support Vector Machine Classification of Microarray Gene Expression Data UCSC-CRL-99-09 (1999)

William Noble Grundy, David Lin, Nello Cristianini, Charles Sugnet, David Haussler

We introduce a new method of functionally classifying genes using gene expression data from DNA microarray hybridization experiments. The method is based on the theory of support vector machines...

How to Use Expert Advice (1999)

Yoav Freund, David Haussler, David P. Helmbold, Robert E. Schapire, Manfred K. Warmuth

We analyze algorithms that predict a binary value by combining the predictions of several prediction strategies, called experts. Our analysis is for worst-case situations, i.e., we make no...

Support Vector Machine Classification of Microarray Gene Expression Data UCSC-CRL-99-09 (1999)

William Noble Grundy, David Lin, Nello Cristianini, Charles Sugnet, David Haussler

We introduce a new method of functionally classifying genes using gene expression data from DNA microarray hybridization experiments. The method is based on the theory of support vector machines...

A Discriminative Framework for Detecting Remote Protein Homologies (1999)

Tommi Jaakkola, Mark Diekhans, David Haussler

A new method for detecting remote protein homologies is introduced and shown to perform well in classifying protein domains by SCOP superfamily. The method is a variant of support vector machines...

Support Vector Machine Classification of Microarray Gene Expression Data (1999)

William Noble Grundy, David Lin, Nello Cristianini, Charles Sugnet, David Haussler

We introduce a new method of functionally classifying genes using gene expression data from DNA microarray hybridization experiments. The method is based on the theory of support vector machines...

Unsupervised Learning of Distributions on Binary Vectors Using Two Layer Networks (1999)

Yoav Freund, David Haussler

this paper is related to both of these lines of work and has some advantages over each of them. If we find a good model of the distribution, we can tackle other interesting learning problems, such as...

Dirichlet Mixtures: A Method for Improving Detection of Weak but Significant Protein Sequence Homology (1999)

Kimmen Sjolander, Kevin Karplus, Michael Brown, Richard Hughey, Anders Krogh, I. Saira Mian, ...

This paper presents the mathematical foundations of Dirichlet mixtures, which have been used to improve database search results for homologous sequences, when a variable number of sequences from a...

Part 1: Overview of the Probably Approximately Correct (PAC) Learning Framework (1999)

David Haussler

this paper we will assume that L is bounded and nonnegative, i.e. 0 ≤ L ≤ M for some real M. When Y and A are finite it is always possible to enforce this condition by simply adding a constant to L,...

Dirichlet Mixtures: A Method for Improving Detection of Weak but Significant Protein Sequence Homology (1999)

Kimmen Sjolander, Kevin Karplus, Michael Brown, Richard Hughey, Anders Krogh, I. Saira Mian, ...

This paper presents the mathematical foundations of Dirichlet mixtures, which have been used to improve database search results for homologous sequences, when a variable number of sequences from a...

Exploiting Generative Models in Discriminative Classifiers (1999)

Tommi S. Jaakkola, David Haussler

Generative probability models such as hidden Markov models provide a principled way of treating missing information and dealing with variable length sequences. On the other hand, discriminative...

Studies in Probabilistic Sequence Alignment and Evolution (1998)

Ian Holmes, Ewan Birney, Bill Bruno, Richard Durbin, Sean Eddy, David Haussler, ...

The complete sequencing of whole genomes presents opportunities for detailed study of molecular evolution. This thesis combines theoretical developments of Bayesian approaches in bioinformatics with...

A Discriminative Framework for Detecting Remote Protein Homologies (1998)

Tommi Jaakkola, Mark Diekhans, David Haussler

A new method for detecting remote protein homologies is introduced and shown to perform well in classifying protein domains by SCOP superfamily. The method is a variant of support vector machines...

A Discriminative Framework for Detecting Remote Protein Homologies (1998)

Tommi Jaakkola, Mark Diekhans, David Haussler

A new method for detecting remote protein homologies is introduced and shown to perform well in classifying protein domains by SCOP superfamily. The method is a variant of support vector machines...

Exploiting Generative Models in Discriminative Classifiers (1998)

Tommi S. Jaakkola, David Haussler

Generative probability models such as hidden Markov models provide a principled way of treating missing information and dealing with variable length sequences. On the other hand, discriminative...

Exploiting Generative Models in Discriminative Classifiers (1998)

Tommi S. Jaakkola, David Haussler

Generative probability models such as hidden Markov models provide a principled way of treating missing information and dealing with variable length sequences. On the other hand, discriminative...

Martin G. Reese (1998)

Martin G. Reese, Frank H. Eeckman, David Kulp, David Haussler

We present an improved splice site predictor for the genefinding program Genie. Genie is based on a generalized Hidden Markov Model (GHMM) that describes the grammar of a legal parse of a multi-exon...

David Kulp, David Haussler (1998)

David Kulp, David Haussler, Martin G. Reese, Frank H. Eeckman

This paper expands on this work and introduces three new additions: an improved probability estimation scheme that combines evidence from multiple sources, a general sensor class to interpret...

A Generalized Hidden Markov Model for the Recognition of Human Genes in DNA (1998)

David Kulp, David Haussler, Martin G. Reese, Frank H. Eeckman

We present a statistical model of genes in DNA. A Generalized Hidden Markov Model (GHMM) provides the framework for describing the grammar of a legal parse of a DNA sequence (Stormo & Haussler 1994)....

Probabilistic Kernel Regression Models (1998)

Tommi S. Jaakkola, David Haussler

We introduce a class of flexible conditional probability models and techniques for classification/regression problems. Many existing methods such as generalized linear models and support vector...

Probabilistic Kernel Regression Models (1998)

Tommi S. Jaakkola, David Haussler

We introduce a class of flexible conditional probability models and techniques for classification/regression problems. Many existing methods such as generalized linear models and support vector...

Probabilistic Kernel Regression Models (1998)

Tommi S. Jaakkola, David Haussler

We introduce a class of flexible conditional probability models and techniques for classification/regression problems. Many existing methods such as generalized linear models and support vector...

Exploiting Generative Models in Discriminative Classifiers (1998)

Tommi S. Jaakkola, David Haussler

Generative probability models such as hidden Markov models provide a principled way of treating missing information and dealing with variable length sequences. On the other hand, discriminative...

Exploiting Generative Models in Discriminative Classifiers (1998)

Tommi S. Jaakkola, David Haussler

Generative probability models such as hidden Markov models provide a principled way of treating missing information and dealing with variable length sequences. On the other hand, discriminative...

Efficient Learning Algorithms (1998)

David Haussler, Manfred K. Warmuth

We have worked on refining and generalizing the PAC learning model introduced by Valiant. Measures of performance for learning algorithms that we have examined include computational complexity,...

Rigorous Learning Curve Bounds from Statistical Mechanics (1998)

P Rba, David Haussler, Michael Kearns

In this paper we introduce and investigate a mathematically rigorous theory of learning curves that is based on ideas from statistical mechanics. The advantage of our theory over the...

Computational Genefinding (1998)

David Haussler

Introduction: Computational methodology for finding genes and other functional sites in genomic DNA has evolved significantly over the last 20 years. Excellent recent surveys have been given by...

Computational Genefinding (1998)

David Haussler

Introduction: Computational methodology for finding genes and other functional sites in genomic DNA has evolved significantly over the last 20 years. Excellent recent surveys have been given by Guigó...

Dirichlet Mixtures: A Method for Improving Detection of Weak but Significant Protein Sequence Homology (1998)

Kimmen Sjolander, Kevin Karplus, Michael Brown, Richard Hughey, Anders Krogh, I. Saira Mian, ...

This paper presents the mathematical foundations of Dirichlet mixtures, which have been used to improve database search results for homologous sequences, when a variable number of sequences from a...

Exploiting Generative Models in Discriminative Classifiers (1998)

Tommi S. Jaakkola, David Haussler

Generative probability models such as hidden Markov models provide a principled way of treating missing information and dealing with variable length sequences. On the other hand, discriminative...

Predicting protein structure using hidden Markov models (1998)

Kevin Karplus, Christian Barrett, Melissa Cline, David Haussler, Richard Hughey, Liisa Holm, ...

We discuss how methods based on hidden Markov models performed in the fold-recognition section of the CASP2 experiment. Hidden Markov models were built for a representative set of just over one...

Sequential Prediction of Individual Sequences Under General Loss Functions (1998)

David Haussler, Jyrki Kivinen, Manfred K. Warmuth

We consider adaptive sequential prediction of arbitrary binary sequences when the performance is evaluated using a general loss function. The goal is to predict on each individual sequence nearly as...

Sequential Prediction of Individual Sequences Under General Loss Functions (1998)

David Haussler, Jyrki Kivinen, Manfred K. Warmuth

We consider adaptive sequential prediction of arbitrary binary sequences when the performance is evaluated using a general loss function. The goal is to predict on each individual sequence nearly as...

Bounds on the Sample Complexity of Bayesian Learning Using Information Theory and the VC Dimension (1998)

David Haussler, Michael Kearns, Robert Schapire

In this paper we study a Bayesian or average-case model of concept learning with a twofold goal: to provide more precise characterizations of learning curve (sample complexity) behavior that depend...

How to Use Expert Advice (1998)

Yoav Freund, David Haussler, David P. Helmbold, Robert E. Schapire, Manfred K. Warmuth

We analyze algorithms that predict a binary value by combining the predictions of several prediction strategies, called experts. Our analysis is for worst-case situations, i.e., we make no...