A Spectral Algorithm for Learning Hidden Markov Models (2008)
Hsu, Daniel, Kakade, Sham M., Zhang, Tong
Hidden Markov Models (HMMs) are one of the most fundamental and widely used statistical tools for modeling discrete time series. Typically, they are learned using search heuristics (such as the...
Sparse Online Learning via Truncated Gradient (2008)
Langford, John, Li, Lihong, Zhang, Tong
We propose a general method called truncated gradient to induce sparsity in the weights of online learning algorithms with convex loss functions. This method has several essential properties: The...
Controlled Elevator Safety Mechanism (2008)
Moss, James, Rayle, Michael, Shek, Marco, Zhang, Tong
ME450 Capstone Design and Manufacturing Experience: Winter 2008
An experiment study of lung ischemia-reperfusion injury of pulmonary surgery in rabbit model (2008)
Jian CHEN, Lin XU, Feng JIANG, Zhonghai DING, Min YANG, Tong ZHANG
Background and objective: The blocking of pulmonary vessels, including the blocking of pulmonary artery and pulmonary circulation, is always applied in the surgical treatment of locally advanced...
Zhang, Tong, Zhuang, Shunhui, Casteel, Darren E, Pilz, Renate B
No abstract available.
We consider an extension of $\epsilon$-entropy to a KL-divergence based complexity measure for randomized density estimation methods. Based on this extension, we develop a general...
HowtogetaChineseName(Entity): Segmentation and Combination Issues (2007)
Jing, Hongyan, Florian, Radu, Luo, Xiaoqiang, Zhang, Tong, Ittycheriah, Abraham
When building a Chinese named entity recognition system, one must deal with certain language-specific issues such as whether the model should be based on characters or words. While there is no unique...
From ɛ-entropy to KL-entropy: Analysis of minimum information complexity density estimation (2006)
We consider an extension of ɛ-entropy to a KL-divergence based complexity measure for randomized density estimation methods. Based on this extension, we develop a general information-theoretical...
Boosting with early stopping: Convergence and consistency (2005)
Boosting is one of the most significant advances in machine learning for classification and regression. In its original and computationally flexible version, boosting seeks to minimize empirically a...
Effects of country-of-origin labeling (COOL) in the United States meat industry (2005)
The study investigates the effects of Country-of-Origin Labeling (COOL) in the U.S. meat industry. Theoretical analysis is provided about how domestic producers' derived demand would change due to...
Solving Large Scale Linear Prediction Problems Using Stochastic (2004)
Linear prediction methods, such as least squares for regression, logistic regression and support vector machines for classification, have been extensively used in statistics and machine learning. In...
Greedy Algorithms for Classification - Consistency, Convergence Rates, and Adaptivity (2004)
Shie Mannor, Ron Meir, Tong Zhang, Yoram Singer
Many regression and classification algorithms proposed over the years can be described as greedy procedures for the stagewise minimization of an appropriate cost function. Some examples include...
Boosting with Early Stopping: Convergence and Consistency (2004)
Boosting is one of the most significant advances in machine learning for classification and regression. In its original and computationally flexible version, boosting seeks to minimize empirically a...
An Infinity-sample Theory for Multi-category (2004)
The purpose of this paper is to investigate infinity-sample properties of risk minimization based multi-category classification methods. These methods can be considered as natural extensions to...
Learning Bounds for a Generalized Family of Bayesian Posterior Distributions (2004)
In this paper we obtain convergence bounds for the concentration of Bayesian posterior distributions (around the true distribution) using a novel method that simplifies and enhances previous results....
We study how closely the optimal Bayes error rate can be approximately reached using a classification algorithm that computes a classifier by minimizing a convex upper bound of the classification...
Discussions of boosting papers, and rejoinders (2004)
Bartlett, Peter L., Bickel, Peter J., Bühlmann, Peter, Freund, Yoav, Friedman, Jerome, Hastie, Trevor, ...
Discussions of: "Process consistency for AdaBoost" [Ann. Statist. 32 (2004), no. 1, 13-29] by W. Jiang; "On the Bayes-risk consistency of regularized boosting methods" [ibid., 30-55] by G. Lugosi and...
A Robust Risk Minimization based Named Entity Recognition System (2004)
This paper describes a robust linear classification system for Named Entity Recognition. A similar system has been applied to the CoNLL text chunking shared task with state of the art performance. By...
Named Entity Recognition through Classifier Combination (2004)
Radu Florian, Abe Ittycheriah, Hongyan Jing, Tong Zhang
This paper presents a classifier-combination experimental framework for named entity recognition in which four diverse classifiers (robust linear classifier, maximum entropy, transformation-based...
Updating an NLP System to Fit New Domains: an empirical study on the (2004)
Tong Zhang, Fred Damerau, David Johnson
Statistical machine learning algorithms have been successfully applied to many natural language processing (NLP) problems. Compared to manually constructed systems, statistical NLP systems are often...
Thesis (doctoral)--Chemnitz, 2004.
Ron Meir, Tong Zhang, Thore Graepel, Ralf Herbrich
Bayesian approaches to learning and estimation have played a significant role in the Statistics literature over many years. While they are often provably optimal in a frequentist setting, and lead to...
Boosting with Early Stopping: Convergence and Consistency (2003)
Boosting is one of the most significant advances in machine learning for classification and regression. In its original and computationally flexible version, boosting seeks to minimize empirically a...
A Robust Risk Minimization based Named Entity Recognition System (2003)
This paper describes a robust linear classification system for Named Entity Recognition. A similar system has been applied to the CoNLL text chunking shared task with state of the art performance. By...
Named Entity Recognition through Classifier Combination (2003)
Radu Florian, Abe Ittycheriah, Hongyan Jing, Tong Zhang
This paper presents a classifier-combination experimental framework for named entity recognition in which four diverse classifiers (robust linear classifier, maximum entropy, transformation-based...
HowtogetaChineseName(Entity): Segmentation and Combination Issues (2003)
Hongyan Jing, Radu Florian, Xiaoqiang Luo, Tong Zhang, Abraham Ittycheriah
When building a Chinese named entity recognition system, one must deal with certain language-specific issues such as whether the model should be based on characters or words. While there is no unique...
Effective Dimension and Generalization of (2003)
We investigate the generalization performance of some learning problems in Hilbert function spaces. We introduce a concept of scale-sensitive effective data dimension, and show that it characterizes...
Data-Dependent Bounds for Bayesian (2003)
We consider Bayesian mixture approaches, where a predictor is constructed by forming a weighted average of hypotheses from some space of functions. While such procedures are known to lead to optimal...
Journal of Machine Learning Research X (2002) X-X Submitted 11/02; Published X/XX (2003)
Shie Mannor, Ron Meir, Tong Zhang
Many regression and classification algorithms proposed over the years can be described as greedy procedures for the stagewise minimization of an appropriate cost function. Some examples include additive...
Mental State Detection Of Dialogue System Users (2003)
Tong Zhang, Mark Hasegawa-Johnson, Stephen E. Levinson
This paper presents an approach to simulate the mental activities of children during their interaction with computers through their spoken language. The mental activities are categorized into three...
Bayesian approaches to learning and estimation have played a significant role in the Statistics literature over many years. While they are often provably optimal in a frequentist setting, and lead to...
An FPGA Implementation of (3,6)-Regular Low-Density Parity-Check Code Decoder (2003)
Because of their excellent error-correcting performance, low-density parity-check (LDPC) codes have recently attracted a lot of attention. In this paper, we are interested in the practical LDPC code...
Experiments in High-Dimensional Text Categorization (2002)
Fred J. Damerau, Tong Zhang, Sholom M. Weiss, Nitin Indurkhya
We present results for automated text categorization of the Reuters-810000 collection of news stories. Our experiments use the entire one-year collection of 810,000 stories and the entire subject...
We study how close the optimal Bayes error rate can be approximately reached using a classification algorithm that computes a classifier by minimizing a convex upper bound of the classification error...
Text Chunking using Regularized Winnow (2002)
Many machine learning methods have recently been applied to natural language processing tasks. Among them, the Winnow algorithm has been argued to be particularly suitable for NLP problems, due to...
The Consistency of Greedy Algorithms for Classification (2002)
Shie Mannor, Ron Meir, Tong Zhang
We consider a class of algorithms for classification, which are based on sequential greedy minimization of a convex upper bound on the 0-1 loss function. A large class of recently popular...
Data-Dependent Bounds for Bayesian Mixture Methods (2002)
We consider Bayesian mixture approaches, where a predictor is constructed by forming a weighted average of hypotheses from some space of functions. While such procedures are known to lead to optimal...
Tong Zhang, Fred Damerau, David Johnson, James Hammerton, Miles Osborne, Susan Armstrong, ...
This paper describes a text chunking system based on a generalization of the Winnow algorithm. We propose a general statistical model for text chunking which we then convert into a classification...
Recently, sample complexity bounds have been derived for problems involving linear functions such as neural networks and support vector machines. In many of these theoretical studies, the concept of...
A General Greedy Approximation Algorithm with Applications (2002)
Greedy approximation algorithms have been frequently used to obtain sparse solutions to learning problems. In this paper, we present a general greedy algorithm for solving a class of convex...
Generalization Performance of Some Learning Problems in Hilbert Functional Spaces (2002)
We investigate the generalization performance of some learning problems in Hilbert functional spaces. We introduce a notion of convergence of the estimated functional predictor to the best underlying...
Tong Zhang, Vijay S. Iyengar, Pack Kaelbling
Recommender systems use historical data on user preferences and other available data on users (for example, demographics) and items (for example, taxonomy) to predict items a new user might like....
We study how close the optimal Bayes error rate can be approximately reached using a classification algorithm that computes a classifier by minimizing a convex upper bound of the classification error...
On the Dual Formulation of Regularized Linear Systems With Convex Risks (2002)
Many learning problems can be posed as regularized linear systems. In this paper, we study a family of dual algorithms for solving these systems. Applications of these dual algorithms in statistical...
A Decision-Tree-Based Symbolic Rule Induction System for Text Categorization (2002)
David E. Johnson, Frank J. Oles, Tong Zhang, Thilo Goetz
We present a decision-tree-based symbolic rule induction system whose purpose is to categorize text documents automatically. Our method for rule induction involves the novel combination of (1) a fast...
Text Chunking based on a Generalization of Winnow (2002)
Tong Zhang, Fred Damerau, David Johnson
This paper describes a text chunking system based on a generalization of the Winnow algorithm.
Text Categorization Based on Regularized Linear Classification Methods (2002)
A number of linear classification methods such as the linear least squares fit (LLSF), logistic regression, and support vector machines (SVM's) have been applied to text categorization problems....
The Value of Unlabeled Data for Classification Problems (2002)
Recently, there has been increasing interest in using unlabeled data for classification.
A Leave-one-out Cross Validation Bound for Kernel Methods with Applications in Learning (2002)
In this paper, we prove a general leave-one-out style cross-validation bound for kernel methods. We apply this bound to some classification and regression problems, and compare the results with...
Dissertation, Universität Bonn, 2003 (not for exchange).
Thesis (Ph. D.)--Cornell University, Jan., 2002.
We study how close the optimal Bayes error rate can be approximately reached using a classification algorithm that computes a classifier by minimizing a convex upper bound of the classification error...
Regularized Winnow Methods (2001)
In theory, the Winnow multiplicative update has certain advantages over the Perceptron additive update when there are many irrelevant attributes.
Convergence of Large Margin Separable Linear Classification (2001)
Large margin linear classification methods have been successfully applied to many applications. For a linearly separable problem, it is known that under appropriate assumptions, the expected...
A Class Of Efficient-Encoding Generalized (2001)
In this paper, we investigate an efficient encoding approach for generalized low-density (GLD) parity check codes, a generalization of Gallager's low-density parity check (LDPC) codes. We propose a...
Thesis (Ph. D.)--University of Leeds (Department of Physics and Astronomy), 2001.
Large Margin Winnow Methods for Text Categorization (2000)
The SNoW (Sparse Network of Winnows) architecture has recently been successfully applied to a number of natural language processing (NLP) problems. In this paper, we propose large margin versions of...
Active Learning using Adaptive Resampling (2000)
Vijay Iyengar, Chid Apte, Tong Zhang
Classification modeling (a.k.a. supervised learning) is an extremely useful analytical technique for developing predictive and forecasting applications. The explosive growth in data warehousing and...
A Progressive Ziv-Lempel Algorithm for (2000)
Daniel Greene, Mohan Vishwanath, Frances Yao, Tong Zhang
We describe an algorithm that gives a progression of compressed versions of a single image. Each stage of the progression is a lossy compression of the image, with the distortion decreasing in each...
Characterization of Axin2 and studies on the dominant defects of Axin(Fu) (2000)
Axin was shown to be a negative regulator of axis formation in mouse and Xenopus embryos. Previous studies suggest that Axin carries out this function by forming a complex with GSK-3β, APC and...
Classification and Retrieval of Sound Effects in Audiovisual Data Management (1999)
We present a method for the classification of sound effects which exploits time-frequency analysis of audio signals and uses the hidden Markov model as the classifier. The proposed approach can be...
Analysis of Regularized Linear Functions for Classification Problems (1999)
In this paper, we extend some theoretical results in this area by providing convergence analysis for regularized linear functions with an emphasis on classification problems. The class of methods we study...
Improving the performance of a traffic data management system. (1999)
Thesis (M.S.)--Ohio University, June, 1999.
Thesis (Ph. D.)--University of Southern California, 1999.
Improving the performance of a traffic data management system (1999)
The Traffic Data Management System (TDMS) is a product of Lucent Technologies designed to satisfy the growing need for a single traffic data collection and analysis system for an entire...
Content-Based Classification and Retrieval of Audio (1998)
An online audio classification and segmentation system is presented in this research, where audio recordings are classified and segmented into speech, music, several types of environmental sounds and...
Hierarchical System for Content-based Audio Classification and Retrieval (1998)
A hierarchical system for audio classification and retrieval based on audio content analysis is presented in this paper. The system consists of three stages. The audio recordings are first classified...
Experiments with Data Flow and Mutation Testing (1998)
A. Jefferson Offutt, Jie Pan, Tong Zhang, Kanupriya Tewary
This paper presents two experimental comparisons of data flow and mutation testing. These two techniques are widely considered to be effective for unit-level software testing, but can only be...
Axisymmetric Solutions of the Euler Equations for Sub-Square Polytropic Gases (1998)
We establish rigorously the existence of a three-parameter family of self-similar, globally bounded, and continuous weak solutions in two space dimensions to the compressible Euler equations with...
Densities of Self-Similar Measures on the Line (1996)
Robert S. Strichartz, Arthur Taylor, Tong Zhang
this article are presented in schematic form. Of course, they were actually coded in computer programs: see the section on Electronic Availability at the end. The programs were written in C and...
Typescript (photocopy).
Densities of self-similar measures on the line (1995)
Strichartz, Robert S., Taylor, Arthur, Zhang, Tong
We describe algorithms to compute self-similar measures associated to iterated function systems (i.f.s.) on an interval, and more general self-replicating measures that include Hausdorff measure on...
Thesis (M.A.)--Simon Fraser University, 1993.
Thesis (Ph. D.)--North Carolina State University.
Thesis (Ph. D.)--North Carolina State University.
Includes abstract.
Höherfrequente Übertragungseigenschaften der Kraftfahrzeugfahrwerksysteme (1991)
Thesis (doctoral)--Universität Berlin, 1991.
Roberson, Mark S., Ban, Makiko, Zhang, Tong, Mulvaney, Jennifer M.
The aim of these studies was to elucidate a role for epidermal growth factor (EGF) signaling in the transcriptional regulation of the glycoprotein hormone α subunit gene, a subunit of chorionic...
Zhang, Tong, Kee, Wei Hua, Seow, Kah Tong, Fung, Winnie, Cao, Xinmin
STAT proteins are a family of latent transcription factors that mediate the response to various cytokines and growth factors. Upon stimulation by cytokines, STAT proteins are recruited to the...
Jho, Eek-hoon, Zhang, Tong, Domon, Claire, Joo, Choun-Ki, Freund, Jean-Noel, Costantini, Frank
Axin2/Conductin/Axil and its ortholog Axin are negative regulators of the Wnt signaling pathway, which promote the phosphorylation and degradation of β-catenin. While Axin is expressed ubiquitously,...