Estimating divergence functionals and the likelihood ratio by convex risk minimization (2008)
Nguyen, XuanLong, Wainwright, Martin J., Jordan, Michael I.
We develop and analyze $M$-estimation methods for divergence functionals and the likelihood ratios of two probability distributions. Our method is based on a non-asymptotic variational...
Union support recovery in high-dimensional multivariate regression (2008)
Obozinski, Guillaume, Wainwright, Martin J., Jordan, Michael I.
In the problem of multivariate regression, a K-dimensional response vector is regressed upon a common set of p covariates, with a p by K matrix B* of regression coefficients. We study the behavior of...
Consistent probabilistic outputs for protein function prediction (2008)
Obozinski, Guillaume, Lanckriet, Gert, Grant, Charles, Jordan, Michael I, Noble, William
In predicting hierarchical protein function annotations, such as terms in the Gene Ontology (GO), the simplest approach makes predictions for each term independently. However, this approach...
Peña-Castillo, Lourdes, Tasan, Murat, Myers, Chad L, Lee, Hyunju, Joshi, Trupti, Zhang, Chao, ...
Background: Several years after sequencing the human genome and the mouse genome, much remains to be discovered about the functions of most human and mouse genes. Computational prediction of...
The nested Chinese restaurant process and Bayesian inference of topic hierarchies (2007)
Blei, David M., Griffiths, Thomas L., Jordan, Michael I.
We present the nested Chinese restaurant process (nCRP), a stochastic process which assigns probability distributions to infinitely-deep, infinitely-branching trees. We show how this stochastic...
Comment on "Support Vector Machines with Applications" (2006)
Bartlett, Peter L., Jordan, Michael I., McAuliffe, Jon D.
Comment on "Support Vector Machines with Applications" [math.ST/0612817]
On optimal quantization rules for some problems in sequential decentralized detection (2006)
Nguyen, XuanLong, Wainwright, Martin J., Jordan, Michael I.
We consider the design of systems for sequential decentralized detection, a problem that entails several interdependent choices: the choice of a stopping rule (specifying the sample size), a global...
Convergence Results for the EM Approach to Mixtures of Experts Architectures (2006)
The Expectation-Maximization (EM) algorithm is an iterative approach to maximum likelihood parameter estimation. Jordan and Jacobs (1993) recently proposed an EM algorithm for the mixture of experts...
On the Convergence of Stochastic Iterative Dynamic Programming Algorithms (2006)
Jaakkola, Tommi, Jordan, Michael I., Singh, Satinder P.
Recent developments in the area of reinforcement learning have yielded a number of new algorithms for the prediction and control of Markovian environments. These algorithms, including the TD(lambda)...
Strategic and Tactical Decision-Making Under Uncertainty (2006)
Jordan, Michael I., Anantharam, Venkat, El Ghaoui, Laurent, Russell, Stuart, Sastry, Shankar, Koller, Daphne, ...
This report presents the final conclusions of the research on decision-making under uncertainty conducted by the investigators at the University of California at Berkeley, Stanford University, and...
On divergences, surrogate loss functions, and decentralized detection (2005)
Nguyen, XuanLong, Wainwright, Martin J., Jordan, Michael I.
We develop a general correspondence between a family of loss functions that act as surrogates to 0-1 loss, and the class of Ali-Silvey or $f$-divergence functionals. This correspondence provides the...
Protein Molecular Function Prediction by Bayesian Phylogenomics (2005)
Barbara E. Engelhardt, Michael I. Jordan, Kathryn E. Muratore, Steven E. Brenner
We present a statistical graphical model to infer specific molecular function for unannotated protein sequences using homology. Based on phylogenomic principles, SIFTER (Statistical Inference of...
Genome-Wide Requirements for Resistance to Functionally Distinct DNA-Damaging Agents (2005)
William Lee, Michael Proctor, Patrick Flaherty, Michael I. Jordan, Adam P. Arkin, ...
The mechanistic and therapeutic differences in the cellular response to DNA-damaging compounds are not completely understood, despite intense study. To expand our knowledge of DNA damage, we assayed...
The DLR Hierarchy of Approximate Inference (2005)
Rosen-Zvi, Michal, Jordan, Michael I., Yuille, Alan L
We propose a hierarchy for approximate inference based on the Dobrushin, Lanford, Ruelle (DLR) equations. This hierarchy includes existing algorithms, such as belief propagation, and also motivates...
Subtree power analysis finds optimal species for comparative genomics (2004)
McAuliffe, Jon D., Jordan, Michael I., Pachter, Lior
Sequence comparison across multiple organisms aids in the detection of regions under selection. However, resource limitations require a prioritization of genomes to be sequenced. This prioritization...
Probabilistic Independence Networks for Hidden Markov Probability Models (2004)
Smyth, Padhraic, Heckerman, David, Jordan, Michael I
In this paper we explore hidden Markov models (HMMs) and related structures within the general framework of probabilistic independence networks (PINs). The paper contains a self-contained review of...
Bayesian Haplotype Inference via the Dirichlet Process (2004)
Eric Xing, Roded Sharan, Michael I. Jordan
The problem of inferring haplotypes from genotypes of single nucleotide polymorphisms (SNPs) is essential for the understanding of genetic variation within and among populations, with important...
A direct formulation for sparse PCA using (2004)
Laurent El Ghaoui, Michael I. Jordan
We examine the problem of approximating, in the Frobenius-norm sense, a positive, semidefinite symmetric matrix by a rank-one matrix, with an upper bound on the cardinality of its eigenvector. The...
A direct formulation for sparse PCA using semidefinite programming (2004)
D'Aspremont, Alexandre, Ghaoui, Laurent El, Jordan, Michael I., Lanckriet, Gert R. G.
We examine the problem of approximating, in the Frobenius-norm sense, a positive, semidefinite symmetric matrix by a rank-one matrix, with an upper bound on the cardinality of its eigenvector. The...
Logos: A Modular Bayesian Model For De Novo Motif Detection (2004)
Eric P. Xing, Wei Wu, Michael I. Jordan, Richard M. Karp
In this paper, we present LOGOS, an integrated LOcal and GlObal motif Sequence model for biopolymer sequences, which provides a principled framework for developing, modularizing, extending and computing...
Graph Partition Strategies for Generalized Mean Field Inference (2004)
Eric P. Xing, Michael I. Jordan, Stuart Russell
An autonomous variational inference algorithm for arbitrary graphical models requires the ability to optimize variational approximations over the space of model parameters as well as over the choice...
Distance Metric Learning, With Application (2004)
Eric P. Xing, Andrew Y. Ng, Michael I. Jordan, Stuart Russell
Many algorithms rely critically on being given a good metric over their inputs. For instance, data can often be clustered in many "plausible" ways, and if a clustering algorithm such as K-means...
Hierarchical Topic Models and the Nested Chinese Restaurant Process (2004)
David M. Blei, Thomas L. Griffiths, Michael I. Jordan, Joshua B. Tenenbaum
We address the problem of learning topic hierarchies from data. The model selection problem in this domain is daunting---which of the large collection of possible trees to use? We take a Bayesian...
Public Deployment of Cooperative Bug Isolation (2004)
Ben Liblit, Mayur Naik, Alice X. Zheng, Alex Aiken, Michael I. Jordan
As part of our work on Cooperative Bug Isolation (CBI) we have undertaken to instrument and distribute binaries for a number of large open source projects. This public deployment is an important step...
Statistical Debugging in the Presence of Multiple Errors (2004)
Ben Liblit, Mayur Naik, Alice X. Zheng, Alex Aiken, Michael I. Jordan
We present a statistical debugging algorithm that operates on very sparsely sampled data drawn from large numbers of user runs. By identifying program behaviors that significantly increase the...
On the Concentration of Expectation and (2004)
We present an analysis of concentration-of-expectation phenomena in layered Bayesian networks that use generalized linear models as the local conditional probabilities. This framework encompasses a...
Failure Diagnosis Using Decision Trees (2004)
Mike Chen, Alice X. Zheng, Jim Lloyd, Michael I. Jordan, Eric Brewer
We present a decision tree learning approach to diagnosing failures in large Internet sites. We record runtime properties of each request and apply automated machine learning and data mining...
Hierarchical Dirichlet Processes (2004)
Yee Whye Teh, Matthew J. Beal, Michael I. Jordan, David M. Blei
We consider problems involving groups of data, where each observation within a group is a draw from a mixture model, and where it is desirable to share mixture components both within and between...
Learning Spectral Clustering (2004)
Francis R. Bach, Michael I. Jordan
Spectral clustering refers to a class of techniques which rely on the eigenstructure of a similarity matrix to partition points into disjoint clusters with points in the same cluster having high...
Kernel Dimensionality Reduction for Supervised (2004)
Kenji Fukumizu, Francis R. Bach, Michael I. Jordan
We propose a novel method of dimensionality reduction for supervised learning. Given a regression or classification problem in which we wish to predict a variable Y from an explanatory vector X, we...
Hierarchical Topic Models and (2004)
David M. Blei, Thomas L. Griffiths, Michael I. Jordan, Joshua B. Tenenbaum
We address the problem of learning topic hierarchies from data. The model selection problem in this domain is daunting -- which of the large collection of possible trees to use? We take a Bayesian...
Statistical Debugging of Sampled Programs (2004)
Alice X. Zheng, Michael I. Jordan, Ben Liblit, Alex Aiken
We present a novel strategy for automatically debugging programs given sampled data from thousands of actual user runs. Our goal is to pinpoint those features that are most correlated with crashes....
Autonomous Helicopter Flight (2004)
Andrew Y. Ng, H. Jin Kim, Michael I. Jordan, Shankar Sastry
Autonomous helicopter flight represents a challenging control problem, with complex, noisy, dynamics. In this paper, we describe a successful application of reinforcement learning to autonomous...
Semidefinite Relaxations for Approximate (2004)
Martin J. Wainwright, Michael I. Jordan
We present a new method for calculating approximate marginals for probability distributions defined by graphs with cycles, based on a Gaussian entropy bound combined with a semidefinite outer bound...
Statistical applications in fields such as bioinformatics, information retrieval, speech processing, image processing and communications often involve large-scale models in which thousands or...
Bayesian Haplotype Inference via the Dirichlet (2003)
Eric P. Xing, Roded Sharan, Michael I. Jordan
The problem of inferring haplotypes from genotypes of single nucleotide polymorphisms (SNPs) is essential for the understanding of genetic variation within and among populations, with important...
Kernel Dimensionality Reduction for Supervised (2003)
Kenji Fukumizu, Francis R. Bach, Michael I. Jordan
We propose a novel method of dimensionality reduction for supervised learning. Given a regression or classification problem in which we wish to predict a variable Y from an explanatory vector X, we...
Kalman Filtering with Intermittent (2003)
Bruno Sinopoli, Luca Schenato, Massimo Franceschetti, Kameshwar Poolla, Michael I. Jordan, Shankar S. Sastry
Motivated by navigation and tracking applications within sensor networks, we consider the problem of performing Kalman filtering with intermittent observations. When data travel along unreliable...
Convexity, Classification, and Risk Bounds (2003)
Peter L. Bartlett, Michael I. Jordan, Jon D. McAuliffe
Many of the classification algorithms developed in the machine learning literature, including the support vector machine and boosting, can be viewed as minimum contrast methods that minimize a convex...
Kernel-based Integration of Genomic Data using Semidefinite Programming (2003)
Nello Cristianini, Michael I. Jordan, William Stafford Noble
teins. The kernel representation is both flexible and efficient, and provides a principled framework in which many types of data can be represented, including vectors, strings, trees and graphs....
Graph Partition Strategies for Generalized Mean Field Inference (2003)
Eric P. Xing, Michael I. Jordan
An autonomous variational inference algorithm for arbitrary graphical models requires the ability to optimize variational approximations over the space of model parameters as well as over the choice...
Modeling Annotated Data (2003)
David M. Blei, Michael I. Jordan
We consider the problem of modeling annotated data---data with multiple types where the instance of one type (such as a caption) serves as a description of the other type (such as an image). We...
LOGOS: a modular Bayesian model for de novo motif detection (2003)
Eric P. Xing, Wei Wu, Michael I. Jordan, Richard M. Karp
The complexity of the global organization and internal structures of motifs in higher eukaryotic organisms raises significant challenges for motif detection techniques. To achieve successful de novo...
Eric P. Xing, Michael I. Jordan, Stuart Russell
We present a class of generalized mean field (GMF) algorithms for approximate inference in exponential family graphical models which is analogous to the generalized belief propagation (GBP) or...
Learning Graphical Models (2003)
Francis R. Bach, Michael I. Jordan
We present a class of algorithms for learning the structure of graphical models from data. The algorithms are based on a measure known as the kernel generalized variance (KGV), which essentially...
Distance Metric Learning, with Application to Clustering with Side-Information (2003)
Eric P. Xing, Andrew Y. Ng, Michael I. Jordan, Stuart Russell
Many algorithms rely critically on being given a good metric over their inputs. For instance, data can often be clustered in many "plausible" ways, and if a clustering algorithm such as K-means...
Finding Clusters In Independent Component Analysis (2003)
Francis R. Bach, Michael I. Jordan
We present a class of algorithms that find clusters in independent component analysis: the data are linearly transformed so that the resulting components can be grouped into clusters, such that...
Kernel Independent Component Analysis (2003)
Francis R. Bach, Michael I. Jordan
We present a class of algorithms for independent component analysis (ICA) which use contrast functions based on canonical correlations in a reproducing kernel Hilbert space. On the one hand, we show...
Bug Isolation via Remote Program Sampling (2003)
Ben Liblit, Alex Aiken, Alice X. Zheng, Michael I. Jordan
We propose a low-overhead sampling infrastructure for gathering information from the executions experienced by a program's user community. Several example applications illustrate ways to use sampled...
Latent Dirichlet Allocation (2003)
David M. Blei, Andrew Y. Ng, Michael I. Jordan
We describe latent Dirichlet allocation (LDA), a generative probabilistic model for collections of discrete data such as text corpora. LDA is a three-level hierarchical Bayesian model, in which each...
Sampling User Executions for Bug Isolation (2003)
Ben Liblit, Alex Aiken, Alice X. Zheng, Michael I. Jordan
Many computer scientists think of a program as either correct (i.e. it meets some specification) or incorrect (i.e. it does not meet some specification). But industrial software...
To appear: Statistical Science, Special Issue on Bayesian Statistics. (2003)
In this article our principal focus is on the presentation of graphical models that have proved useful in applied domains, and on ways in which the formalism encourages the exploration of extensions of...
A Minimal Intervention Principle for Coordinated Movement (2003)
Emanuel Todorov, Michael I. Jordan
Behavioral goals are achieved reliably and repeatedly with movements rarely reproducible in their detail. Here we offer an explanation: we show that not only are variability and goal achievement...
Semidefinite Relaxations for Approximate Inference on Graphs With Cycles (2003)
Martin J. Wainwright, Michael I. Jordan
We present a new method for calculating approximate marginals for probability distributions defined by graphs with cycles, based on a Gaussian entropy bound combined with a semidefinite outer bound...
Distance Metric Learning, With Application (2003)
Eric P. Xing, Andrew Y. Ng, Michael I. Jordan, Stuart Russell
Many algorithms rely critically on being given a good metric over their inputs. For instance, data can often be clustered in many "plausible" ways, and if a clustering algorithm such as K-means...
A Hierarchical Bayesian Markovian Model for Motifs in Biopolymer Sequences (2003)
Eric P. Xing, Michael I. Jordan, Richard M. Karp, Stuart Russell
We propose a dynamic Bayesian model for motifs in biopolymer sequences which captures rich biological prior knowledge and positional dependencies in motif structure in a principled way. Our model...
Computational Structure of Coordinate (2003)
Zoubin Ghahramani, Daniel M. Wolpert, Michael I. Jordan
One of the fundamental properties that both neural networks and the central nervous system share is the ability to learn and generalize from examples. While this property has been studied extensively...
Hierarchical Bayesian Models for Applications in Information retrieval (2003)
J. M. Bernardo, M. J. Bayarri, J. O. Berger, A. P. Dawid, D. Heckerman, ...
this article, we find that most of the factors are very close to # while four of the factors achieve significant expected counts. Looking at the distribution over words, z), for those four factors,...
A Hierarchical Bayesian Markovian Model for (2003)
Eric P. Xing, Michael I. Jordan, Richard M. Karp, Stuart Russell
We propose a dynamic Bayesian model for motifs in biopolymer sequences which captures rich biological prior knowledge and positional dependencies in motif structure in a principled way. Our model...
Robust Novelty Detection with (2003)
Laurent El Ghaoui, Michael I. Jordan
In this paper we consider the problem of novelty detection, presenting an algorithm that aims to find a minimal region in input space containing a fraction of the probability mass underlying a data...
Analyse en composantes indépendantes et réseaux bayésiens (2003)
BACH, Francis R., JORDAN, Michael I.
A generalization of independent component analysis (ICA) is introduced: instead of determining a linear map that makes the components independent, we seek a...
A Robust Minimax Approach to Classification (2002)
Laurent El Ghaoui, Chiranjib Bhattacharyya, Michael I. Jordan
When constructing a classifier, the probability of correct classification of future data points should be maximized. We consider a binary classification problem where the mean and covariance matrix...
Tree-dependent Component Analysis (2002)
Francis R. Bach, Michael I. Jordan
We present a generalization of independent component analysis (ICA), where instead of looking for a linear transform that makes the data components independent, we look for a transform that makes the...
Learning the Kernel Matrix (2002)
Gert Lanckriet, Nello Cristianini, Peter Bartlett, Laurent El Ghaoui, Michael I. Jordan
Kernel-based learning algorithms work by embedding the data into a Euclidean space, and then searching for linear relations among the embedded data points. The embedding is performed implicitly, by...