Bayesian biclustering of gene expression data (2008)
Background: Biclustering of gene expression data searches for local patterns of gene expression. A bicluster (or a two-way cluster) is defined as a set of genes whose expression profiles are...
Extracting Sequence Features to Predict Protein-DNA Interactions: A Comparative Study (2008)
Predicting how and where proteins, especially transcription factors (TFs), interact with DNA is an important problem in biology. We present here a systematic study of predictive modeling approaches...
Genomic Sequence Is Highly Predictive of Local Nucleosome Depletion (2008)
The regulation of DNA accessibility through nucleosome positioning is important for transcription control. Computational models have been developed to predict genome-wide nucleosome positions from...
Bayesian Functional Data Clustering for Temporal Microarray Data (2008)
Ping Ma, Wenxuan Zhong, Yang Feng, Jun S. Liu
We propose a Bayesian procedure to cluster temporal gene expression microarray profiles, based on a mixed-effect smoothing-spline model, and design a Gibbs sampler to sample from the desired...
Genomic sequence is highly predictive of local nucleosome depletion (2007)
The regulation of DNA accessibility through nucleosome positioning is important for transcription control. Computational models have been developed to predict genome-wide nucleosome positions from DNA...
Predicting Gene Expression from Sequence: A Reexamination (2007)
Yuan Yuan, Lei Guo, Lei Shen, Jun S. Liu
Although much of the information regarding genes' expressions is encoded in the genome, deciphering such information has been very challenging. We reexamined Beer and Tavazoie's (BT) approach to...
Statistical power of phylo-HMM for evolutionarily conserved element detection (2007)
Fan, Xiaodan, Zhu, Jun, Schadt, Eric E, Liu, Jun S
Background: An important goal of comparative genomics is the identification of functional elements through conservation analysis. Phylo-HMM was recently introduced to detect conserved...
Model-based analysis of two-color arrays (MA2C) (2007)
Song, Jun S, Johnson, W Evan, Zhu, Xiaopeng, Zhang, Xinmin, Li, Wei, Manrai, Arjun K, ...
A novel normalization method based on the GC content of probes is developed for two-color tiling arrays. The proposed method, together with robust estimates of the model parameters, is shown...
On Side-Chain Conformational Entropy of Proteins (2006)
The role of side-chain entropy (SCE) in protein folding has long been speculated about but is still not fully understood. Utilizing a newly developed Monte Carlo method, we conducted a systematic...
Discussion of "Equi-energy sampler" by Kou, Zhou and Wong (2006)
We congratulate Samuel Kou, Qing Zhou and Wing Wong [math.ST/0507080] (referred to subsequently as KZW) for this beautifully written paper, which opens a new direction in Monte Carlo computation....
Bayesian Clustering of Transcription Factor Binding Motifs (2006)
Genes are often regulated in living cells by proteins called transcription factors (TFs) that bind directly to short segments of DNA in close proximity to specific genes. These binding sites have a...
Yuan, Guo-Cheng, Ma, Ping, Zhong, Wenxuan, Liu, Jun S
Background: Histone acetylation plays important but incompletely understood roles in gene regulation. A comprehensive understanding of the regulatory role of histone acetylation is difficult...
Discussion of “Equi-energy sampler” by Kou, Zhou and Wong (2006)
We congratulate Samuel Kou, Qing Zhou and Wing Wong (referred to subsequently as KZW) for this beautifully written paper, which opens a new direction in Monte Carlo computation. This discussion has...
Bayesian models for pooling microarray studies with multiple sources of replications (2006)
Conlon, Erin M, Song, Joon J, Liu, Jun S
Background: Biologists often conduct multiple but different cDNA microarray studies that all target the same biological system or pathway. Within each study, replicate slides within repeated...
Zhang, Xuegong, Lu, Xin, Shi, Qian, Xu, Xiu-qin, Leung, Hon-chiu E, Harris, Lyndsay N, ...
Background: Like microarray-based investigations, high-throughput proteomics techniques require machine learning algorithms to identify biomarkers that are informative for biological...
Background: Certain protein families are highly conserved across distantly related organisms and belong to large and functionally diverse superfamilies. The patterns of conservation present...
Patrick Eichenberger, Masaya Fujita, Shane T. Jensen, Erin M. Conlon, David Z. Rudner, Stephanie T. Wang, ...
A comprehensive genomic analysis of sporulation in Bacillus subtilis reveals a coordinated program of gene activation and repression, which involves 383 genes.
Patrick Eichenberger, Masaya Fujita, Shane T. Jensen, Erin M. Conlon, David Z. Rudner, Stephanie T. Wang, ...
Asymmetric division during sporulation by Bacillus subtilis generates a mother cell that undergoes a 5-h program of differentiation. The program is governed by a hierarchical cascade consisting of...
Sequential Monte Carlo Methods for Statistical Analysis of (2004)
Yuguo Chen, Persi Diaconis, Susan P. Holmes, Jun S. Liu
We describe a sequential importance sampling (SIS) procedure for analyzing two-way zero-one or contingency tables with fixed marginal sums. An essential feature of the new method is that it samples...
Clustering analysis of SAGE data using a Poisson approach (2004)
Cai, Li, Huang, Haiyan, Blackshaw, Seth, Liu, Jun S, Cepko, Connie, Wong, Wing H
Serial analysis of gene expression (SAGE) data have been poorly exploited by clustering analysis owing to the lack of appropriate statistical methods that consider their specific properties....
Computational Discovery of Gene Regulatory Binding Motifs: A Bayesian Perspective (2004)
Jensen, Shane T., Liu, X. Shirley, Zhou, Qing, Liu, Jun S.
The Bayesian approach together with Markov chain Monte Carlo techniques has provided an attractive solution to many important bioinformatics problems such as multiple sequence alignment, microarray...
Modeling within-motif dependence for transcription factor binding site predictions (2004)
Motivation: The position-specific weight matrix (PWM) model, which assumes that each position in the DNA site contributes independently to the overall protein-DNA interaction, has been the primary...
Sequential Importance Sampling with Resampling in Molecular Population Genetics (2003)
Motivated by the statistical inference problem in population genetics, we present a general sequential importance sampling (SIS) and resampling strategy for solving linear and integral equations and...
Exploring Hybrid Monte Carlo in Bayesian Computation (2001)
Hybrid Monte Carlo (HMC) has been successfully applied to molecular simulation problems since its introduction in the late 1980s. Its use in Bayesian computation, however, is relatively recent and...
Bayesian Protein Structure Prediction (2001)
Scott C. Schmidler, Jun S. Liu, Douglas L. Brutlag
An important role for statisticians in the age of the Human Genome Project has developed in the emerging area of "structural bioinformatics". Sequence analysis and structure prediction for...
Bayesian Analysis of Haplotypes for Linkage Disequilibrium Mapping (2000)
Jun S. Liu, Chiara Sabatti, Jun Teng
Haplotype analysis of disease chromosomes can help identify probable historical recombination events and, consequently, localize a disease gene. Most available analyses use only marginal and pairwise...
Adaptive Joint Detection and Decoding in Flat-Fading Channels via Mixture Kalman Filtering (2000)
Rong Chen, Xiaodong Wang, Jun S. Liu
A novel adaptive Bayesian receiver for signal detection and decoding in fading channels with known channel statistics is developed; it is based on the sequential Monte Carlo methodology that recently...
A Theory for Dynamic Weighting in Monte Carlo Computation (2000)
Jun S. Liu, Faming Liang, Wing Hung Wong
This article provides a first theoretical analysis on a new Monte Carlo approach, the dynamic weighting algorithm, proposed recently by Wong and Liang. In dynamic weighting Monte Carlo, one augments...
Monte Carlo Bayesian Signal Processing for Wireless Communications (2000)
Xiaodong Wang, Rong Chen, Jun S. Liu
Many statistical signal processing problems found in wireless communications involve making inference about the transmitted information data based on the received signals, in the presence of various...
In this paper, a special Metropolis-Hastings type algorithm, Metropolized independent sampling, first proposed in Hastings (1970), is studied in full detail. The eigenvalues and eigenvectors of the...
A Theoretical Framework for Sequential Importance Sampling and Resampling (1999)
Jun S. Liu, Rong Chen, Tanya Logvinenko
Sequential importance sampling (SIS) was first developed in the 1950s for molecular simulation. Although half a century has passed, the SIS methodology remains one of the most versatile and powerful...
Dynamic Weighting In Markov Chain Monte Carlo (1999)
Jun S. Liu, Faming Liang, Wing Hung Wong
This article provides a first theoretical analysis on a new Monte Carlo approach, the dynamic weighting, proposed recently by Wong and Liang. In dynamic weighting, one augments the original state...
Parameter Expansion for Data Augmentation (1999)
Viewing the observed data of a statistical model as incomplete and augmenting its missing parts are useful for clarifying concepts and central to the invention of two well-known statistical...
Monte Carlo EM with importance reweighting and its applications in random effects models (1999)
Fernando A. Quintana, Jun S. Liu
In this paper we propose a new Monte Carlo EM algorithm to compute maximum likelihood estimates in the context of random effects models. The algorithm involves the construction of efficient sampling...
Markov Chain Monte Carlo and Related Topics (1999)
This article provides a brief review of recent developments in Markov chain Monte Carlo methodology. The methods discussed include the standard Metropolis-Hastings algorithm, the Gibbs sampler, and...
In treating dynamic systems, sequential Monte Carlo methods use discrete samples to represent a complicated probability distribution and use rejection sampling, importance sampling, and weighted...
Relaxed Simulated Tempering for VLSI Floorplan Designs (1999)
Jason Cong, Tianming Kong, Dongmin Xu, Faming Liang, Jun S. Liu, Wing Hung Wong
In the past two decades, the simulated annealing technique has been considered as a powerful approach to handle many NP-hard optimization problems in VLSI designs. Recently, a new Monte Carlo and...
Relaxed Simulated Tempering for VLSI Floorplan Designs (1998)
Jason Cong, Tianming Kong, Dongmin Xu, Faming Liang, Jun S. Liu, Wing Hung Wong
In the past two decades, the simulated annealing technique has been considered as a powerful approach to handle many NP-hard optimization problems in VLSI designs. Recently, a new Monte Carlo and...
Markovian Structures in Biological Sequence Alignments (1998)
Jun S. Liu, Andrew F. Neuwald, Charles E. Lawrence
In this article, we provide a coherent view of the two recent models used for multiple sequence alignment, the hidden Markov model (HMM) and the block-based motif model, in order to develop a set...
Parameter Expansion for Data Augmentation (1998)
Viewing the observed data of a statistical model as incomplete and augmenting its missing parts is useful for clarifying concepts (such as in causal inference) and is central to the invention of two...
Simulated Sintering: Markov Chain Monte Carlo With Spaces of Varying Dimensions (1998)
In this article, we explore possible generalizations of the tempering procedure along the line proposed by Wong (1995). More precisely, we consider the construction of the distribution family Pi with...
Sequential Monte Carlo Methods for Dynamic Systems (1998)
A general framework for using Monte Carlo methods in dynamic systems is provided and its wide applications indicated. Under this framework, several currently available techniques are studied and...
The properties of the cross-match estimate and split sampling (1997)
Kong, Augustine, Liu, Jun S., Wong, Wing Hung
By noting the connection with k-sample U-statistics, we find a simple decomposition of the variance of the cross-match estimate, which can be regarded as a generalization of Efron and Stein. We apply...
Sequential Importance Sampling for Nonparametric Bayes Models: The Next Generation (1997)
Steven N. Maceachern, Merlise Clyde, Jun S. Liu
In this paper, we exploit the similarities between the Gibbs sampler and the SIS, bringing over the improvements for Gibbs sampling algorithms to the SIS setting for nonparametric Bayes problems. These...
Monte Carlo EM with Importance Reweighting and Its Applications in Random Effects Models (1997)
Fernando A. Quintana, Jun S. Liu
In this paper we propose a new Monte Carlo EM algorithm to compute maximum likelihood estimates in the context of random effects models. The algorithm involves the construction of efficient sampling...
Sequential Importance Sampling for Nonparametric Bayes Models: The Next Generation (1996)
Steven N. Maceachern, Merlise Clyde, Jun S. Liu
In this paper, we exploit the similarities between the Gibbs sampler and the SIS, bringing over the improvements for Gibbs sampling algorithms to the SIS setting for nonparametric Bayes problems. These...
Nonparametric hierarchical Bayes via sequential imputations (1996)
We consider the empirical Bayes estimation of a distribution using binary data via the Dirichlet process. Let $\mathscr{D}(\alpha)$ denote a Dirichlet process with $\alpha$ being a finite measure on...
Phylogenetic footprinting of transcription factor binding sites in proteobacterial genomes
McCue, Lee Ann, Thompson, William, Carmack, C. Steven, Ryan, Michael P., Liu, Jun S., Derbyshire, Victoria, ...
Toward the goal of identifying complete sets of transcription factor (TF)-binding sites in the genomes of several gamma proteobacteria, and hence describing their transcription regulatory networks,...
BALSA: Bayesian algorithm for local sequence alignment
Webb, Bobbie-Jo M., Liu, Jun S., Lawrence, Charles E.
The Smith–Waterman algorithm yields a single alignment, which, albeit optimal, can be strongly affected by the choice of the scoring matrix and the gap penalties. Additionally, the scores obtained...
Methylation of histone H3 Lys 4 in coding regions of active genes
Bernstein, Bradley E., Humphrey, Emily L., Erlich, Rachel L., Schneider, Robert, Bouman, Peter, Liu, Jun S., ...
Posttranslational modifications of histone tails regulate chromatin structure and transcription. Here we present global analyses of histone acetylation and histone H3 Lys 4 methylation patterns in...
Integrating regulatory motif discovery and genome-wide expression analysis
Conlon, Erin M., Liu, X. Shirley, Lieb, Jason D., Liu, Jun S.
We propose motif regressor for discovering sequence motifs upstream of genes that undergo expression changes in a given condition. The method combines the advantages of matrix-based motif finding and...
Bayesian Analysis of Haplotypes for Linkage Disequilibrium Mapping
Liu, Jun S., Sabatti, Chiara, Teng, Jun, Keats, Bronya J.B., Risch, Neil
Haplotype analysis of disease chromosomes can help identify probable historical recombination events and localize disease mutations. Most available analyses use only marginal and pairwise allele...
Statistical resynchronization and Bayesian detection of periodically expressed genes
Lu, Xin, Zhang, Wen, Qin, Zhaohui S., Kwast, Kurt E., Liu, Jun S.
We propose a periodic–normal mixture (PNM) model to fit transcription profiles of periodically expressed (PE) genes in cell cycle microarray experiments. The model leads to a principled statistical...
Haplotype Information and Linkage Disequilibrium Mapping for Single Nucleotide Polymorphisms
Lu, Xin, Niu, Tianhua, Liu, Jun S.
Single nucleotide polymorphisms in the human genome have become an increasingly popular topic in that their analyses promise to be a key step toward personalized medicine. We investigate two related...
Neuwald, Andrew F., Kannan, Natarajan, Poleksic, Aleksandar, Hata, Naoya, Liu, Jun S.
Proteins comprising the core of the eukaryotic cellular machinery are often highly conserved, presumably due to selective constraints maintaining important structural features. We have developed...
A suite of web-based programs to search for transcriptional regulatory motifs
Liu, Yueyi, Wei, Liping, Batzoglou, Serafim, Brutlag, Douglas L., Liu, Jun S., Liu, X. Shirley
The identification of regulatory motifs is important for the study of gene expression. Here we present a suite of programs that we have developed to search for regulatory sequence motifs: (i)...
Clustering analysis of SAGE data using a Poisson approach
Cai, Li, Huang, Haiyan, Blackshaw, Seth, Liu, Jun S, Cepko, Connie, Wong, Wing H
Two Poisson-based distances were developed for SAGE data; their application to simulated and experimental mouse retina data shows that they are more appropriate and reliable for analyzing SAGE data...
Zhang, Kui, Qin, Zhaohui S., Liu, Jun S., Chen, Ting, Waterman, Michael S., Sun, Fengzhu
Recent studies have revealed that linkage disequilibrium (LD) patterns vary across the human genome with some regions of high LD interspersed by regions of low LD. A small fraction of SNPs (tag SNPs)...
Eichenberger, Patrick, Fujita, Masaya, Jensen, Shane T, Conlon, Erin M, Rudner, David Z, Wang, Stephanie T, ...
Asymmetric division during sporulation by Bacillus subtilis generates a mother cell that undergoes a 5-h program of differentiation. The program is governed by a hierarchical cascade consisting of...
Decoding Human Regulatory Circuits
Thompson, William, Palumbo, Michael J., Wasserman, Wyeth W., Liu, Jun S., Lawrence, Charles E.
Clusters of transcription factor binding sites (TFBSs) which direct gene expression constitute cis-regulatory modules (CRMs). We present a novel algorithm, based on Gibbs sampling, which locates, de...
De novo cis-regulatory module elicitation for eukaryotic genomes
Transcription regulation is controlled by coordinated binding of one or more transcription factors in the promoter regions of genes. In many species, especially higher eukaryotes, transcription...
A data-driven clustering method for time course gene expression data
Ma, Ping, Castillo-Davis, Cristian I., Zhong, Wenxuan, Liu, Jun S.
Gene expression over time is, biologically, a continuous process and can thus be represented by a continuous function, i.e. a curve. Individual genes often share similar expression patterns...
Recursive SVM feature selection and sample classification for mass-spectrometry and microarray data
Zhang, Xuegong, Lu, Xin, Shi, Qian, Xu, Xiu-qin, Leung, Hon-chiu E, Harris, Lyndsay N, ...