Dynamic Web Log Session Identification with Statistical Language Models (2004)
Xiangji Huang, Fuchun Peng, Aijun An, Dale Schuurmans
We present a novel session identification method based on statistical language modeling.
Applying Machine Learning to Text (2004)
Xiangji Huang, Fuchun Peng, Dale Schuurmans, Nick Cercone, Stephen E. Robertson
We propose a self-supervised word segmentation technique for text segmentation in Chinese information retrieval. This method combines the advantages of traditional dictionary based, character based...
Combining Statistical Language Models via the Latent Maximum Entropy Principle (2004)
Shaojun Wang, Dale Schuurmans, Fuchun Peng, Yunxin Zhao
In this paper, we present a unified probabilistic framework for statistical language modeling which can simultaneously incorporate various aspects of natural language, such as local word interaction,...
Using Self-Supervised Word Segmentation in (2004)
Fuchun Peng, Xiangji Huang, Dale Schuurmans, Nick Cercone, Stephen Robertson
We propose a self-supervised word-segmentation technique for Chinese information retrieval. This method combines the advantages of traditional dictionary based approaches with character based...
Regret-based Utility Elicitation in Constraint-based Decision Problems (2004)
Craig Boutilier, Relu Patrascu, Pascal Poupart, Dale Schuurmans
Constraint-based optimization requires the formulation of a precise objective function. However, in many circumstances, the objective is to maximize the utility of a specific user among the space of...
Boosting in the limit: Maximizing the margin of learned ensembles (2004)
Adam J. Grove, Dale Schuurmans
The "minimum margin" of an ensemble classifier on a given training set is, roughly speaking, the smallest vote it gives to any correct training label. Recent work has shown that the Adaboost...
Investigating the Relationship between Word Segmentation (2004)
Fuchun Peng, Xiangji Huang, Dale Schuurmans, Nick Cercone
It is commonly believed that word segmentation accuracy is monotonically related to retrieval performance in Chinese information retrieval. In this paper we show that, for Chinese, the relationship...
Language and Task Independent Text Categorization (2004)
Fuchun Peng, Dale Schuurmans, Shaojun Wang
We present a simple method for language independent and task independent text categorization learning, based on character-level n-gram language models. Our approach uses simple information theoretic...
Regularized Greedy Importance Sampling (2003)
Finnegan Southey, Dale Schuurmans, Ali Ghodsi
Greedy importance sampling is an unbiased estimation technique that reduces the variance of standard importance sampling by explicitly searching for modes in the estimation objective. Previous work...
Automatic Complexity Control for (2003)
As a prerequisite for system identification based on c-mean clustering (FCM), it is necessary to assign the number of underlying partitions to be used for a given data set. However, for the FCM...
Automatic basis selection for RBF networks using Stein's unbiased risk estimator (2003)
The problem of selecting the appropriate number of basis functions is a critical issue for radial basis function neural networks. An RBF network with an overly restricted basis gives poor predictions...
Automatic Basis Selection Techniques for RBF Networks (2003)
This paper proposes a generic criterion that defines the optimum number of basis functions for radial basis function neural networks. The generalization performance of an RBF network relates to its...
Constraint-based Optimization with the Minimax (2003)
Craig Boutilier, Relu Patrascu, Pascal Poupart, Dale Schuurmans
In many situations, a set of hard constraints encodes the feasible configurations of some system or product over which users have preferences. We consider the problem of computing a best feasible...
Boltzmann Machine Learning with (2003)
Shaojun Wang, Dale Schuurmans, Fuchun Peng, Yunxin Zhao
We present a new statistical learning paradigm for Boltzmann machines based on a new inference principle we have proposed: the latent maximum entropy principle (LME). LME is different both from Jaynes'...
Learning Mixture Models with the Latent Maximum Entropy Principle (2003)
Shaojun Wang, Dale Schuurmans, Fuchun Peng, Yunxin Zhao
We present a new approach to estimating mixture models based on a new inference principle we have proposed: the latent maximum entropy principle (LME). LME is different both from Jaynes' maximum...
Data Perturbation for Escaping Local Maxima in Learning (2003)
Gal Elidan, Matan Ninio, Nir Friedman, Dale Schuurmans
Almost all machine learning algorithms -- be they for regression, classification or density estimation -- seek hypotheses that optimize a score on training data. In most interesting cases, however,...
Text Classification in Asian Languages without Word Segmentation (2003)
Fuchun Peng, Xiangji Huang, Dale Schuurmans, Shaojun Wang
We present a simple approach for Asian language text classification without word segmentation, based on statistical n-gram language modeling. In particular, we examine Chinese and Japanese text...
Text Classification in Asian Languages without Word Segmentation (2003)
Fuchun Peng, Xiangji Huang, Dale Schuurmans, Shaojun Wang
We present a simple approach for Asian language text classification without word segmentation, based on statistical language modeling. In particular, we examine Chinese and Japanese text...
Boltzmann Machine Learning with the Latent Maximum Entropy Principle (2003)
Shaojun Wang, Dale Schuurmans, Fuchun Peng, Yunxin Zhao
We present a new statistical learning paradigm for Boltzmann machines based on a new inference principle we have proposed: the latent maximum entropy principle (LME). LME is different both from Jaynes'...
Language Independent Authorship Attribution using Character Level Language Models (2003)
Fuchun Peng, Dale Schuurmans, Vlado Keselj, Shaojun Wang
We present a method for computer-assisted authorship attribution based on character-level n-gram language models.
Language and Task Independent Text Categorization with Simple Language Models (2003)
Fuchun Peng, Dale Schuurmans, Shaojun Wang
We present a simple method for language independent and task independent text categorization learning, based on character-level n-gram language models. Our approach uses simple information theoretic...
Language and Task Independent Text Categorization (2003)
Fuchun Peng, Dale Schuurmans, Shaojun Wang
We present a simple method for language independent and task independent text categorization learning, based on character-level n-gram language models. Our approach uses simple information theoretic...
Language Independent Authorship Attribution using Character Level Language Models (2003)
Fuchun Peng, Dale Schuurmans, Vlado Keselj, Shaojun Wang
We present a method for computer-assisted authorship attribution based on character-level n-gram language models.
Waterloo at NTCIR-3: Using Self-supervised Word Segmentation (2003)
Xiangji Huang, Fuchun Peng, Dale Schuurmans, Nick Cercone
In this paper, we describe the system we use in the NTCIR-3 CLIR (cross language IR) task. We participate in the SLIR (single language IR) track. In our system, we use a self-supervised...
Combining Naive Bayes and n-Gram Language Models for Text Classification (2003)
We augment the naive Bayes model with an n-gram language model to address two shortcomings of naive Bayes text classifiers.
Combining Naive Bayes and n-Gram Language (2003)
We augment the naive Bayes model with an n-gram language model to address two shortcomings of naive Bayes text classifiers.
Session Boundary Detection for Association (2002)
Xiangji Huang, Fuchun Peng, Aijun An, Dale Schuurmans, Nick Cercone
We present a statistical method using n-gram language models to identify session boundaries in a large collection of Livelink log data.
Waterloo at NTCIR-3: Using Self-supervised Word Segmentation (2002)
Xiangji Huang, Fuchun Peng, Dale Schuurmans, Nick Cercone
In this paper, we describe the system we use in the NTCIR-3 CLIR (cross language IR) task. We participate in the SLIR (single language IR) track. In our system, we use a self-supervised...
Session Boundary Detection for Association Rule Learning Using n-Gram Language Models (2002)
Xiangji Huang, Fuchun Peng, Aijun An, Dale Schuurmans, Nick Cercone
We present a statistical method using n-gram language models to identify session boundaries in a large collection of Livelink log data.
The Latent Maximum Entropy Principle (2002)
Shaojun Wang, Dale Schuurmans, Yunxin Zhao
We present an extension to Jaynes' maximum entropy principle that handles latent variables. The principle of latent maximum entropy we propose is different from both Jaynes' maximum entropy principle...
Using Self-Supervised Word Segmentation in Chinese Information Retrieval (2002)
Fuchun Peng, Xiangji Huang, Dale Schuurmans, Nick Cercone, Stephen Robertson
We propose a self-supervised word-segmentation technique for Chinese information retrieval. This method combines the advantages of traditional dictionary based approaches with character based...
Fuchun Peng, Xiangji Huang, Dale Schuurmans, Nick Cercone
It is commonly believed that word segmentation accuracy is monotonically related to retrieval performance in Chinese information retrieval. In this paper we show that, for Chinese, the relationship...
Latent Maximum Entropy Approach for Semantic N-gram Language Modeling (2002)
Shaojun Wang, Dale Schuurmans, Fuchun Peng
In this paper, we describe a unified probabilistic framework for statistical language modeling -- the latent maximum entropy principle -- which can effectively incorporate various aspects of natural...
Semantic N-Gram Language Modeling With The Latent Maximum Entropy Principle (2002)
Shaojun Wang, Dale Schuurmans, Fuchun Peng, Yunxin Zhao
In this paper, we describe a unified probabilistic framework for statistical language modeling -- the latent maximum entropy principle -- which can effectively incorporate various aspects of natural...
Combining Naive Bayes and n-Gram Language Models for Text Classification (2002)
We augment the naive Bayes model with an n-gram language model to address two shortcomings of naive Bayes text classifiers.
Latent Maximum Entropy Approach for (2002)
Shaojun Wang, Dale Schuurmans, Fuchun Peng
In this paper, we describe a unified probabilistic framework for statistical language modeling -- the latent maximum entropy principle -- which can effectively incorporate various aspects of natural language,...
Probabilistic Hill-Climbing (2002)
William W. Cohen, Russell Greiner, Dale Schuurmans
Many learning tasks involve searching through a discrete space of performance elements, seeking an element whose future utility is expected to be high. As the task of finding the global optimum is...
On the Existence of Linear Weak Learners and Applications to Boosting (2002)
Ron Meir, Yoshua Bengio, Dale Schuurmans
We consider the existence of a linear weak learner for boosting algorithms. A weak learner for binary classification problems is required to achieve a weighted empirical error on the training set...
Greedy linear value-approximation for factored Markov decision processes (2002)
Relu Patrascu, Pascal Poupart, Dale Schuurmans, Craig Boutilier, Carlos Guestrin
Significant recent work has focused on using linear representations to approximate value functions for factored Markov decision processes (MDPs). Current research has adopted linear programming as an...
Piecewise Linear Value Function Approximation for Factored MDPs (2002)
Pascal Poupart, Craig Boutilier, Relu Patrascu, Dale Schuurmans
A number of proposals have been put forth in recent years for the solution of Markov decision processes (MDPs) whose state (and sometimes action) spaces are factored.
Algorithm-Directed Exploration for Model-Based Reinforcement Learning in Factored MDPs (2002)
Carlos Guestrin, Relu Patrascu, Dale Schuurmans
One of the central challenges in reinforcement learning is to balance the exploration/exploitation tradeoff while scaling up to large problems. Although model-based reinforcement learning has been...
Investigating the Maximum Likelihood Alternative to TD(lambda) (2002)
Fletcher Lu, Relu Patrascu, Dale Schuurmans
The study of value estimation in Markov reward processes has been dominated by research on temporal difference methods since the introduction of TD(0) in 1988. Temporal difference methods are often...
Data Perturbation for Escaping Local Maxima in Learning (2002)
Gal Elidan, Matan Ninio, Nir Friedman, Dale Schuurmans
Almost all machine learning algorithms -- be they for regression, classification or density estimation -- seek hypotheses that optimize a score on training data. In most interesting cases, however,...
Direct value-approximation for factored MDPs (2002)
Dale Schuurmans, Relu Patrascu
We present a simple approach for computing reasonable policies for factored Markov decision processes (MDPs), when the optimal value function can be approximated by a compact linear form.
Investigating the Maximum Likelihood Alternative to ... (2002)
Fletcher Lu, Relu Patrascu, Dale Schuurmans
The study of value estimation in Markov reward processes has been dominated by research on temporal difference methods since the introduction of TD(0) in 1988. Temporal difference methods are often...
A Hierarchical EM Approach to Word Segmentation (2002)
We propose a simple two-level hierarchical probability model for unsupervised word segmentation. By treating words as strings composed of morphemes/phonemes which are themselves composed of...
A Simple Closed-Class/Open-Class Factorization for Improved (2002)
We describe a simple improvement to n-gram language models where we estimate the distribution over closed-class (function) words separately from the conditional distribution of open-class words...
The Sparse Data Problem in Statistical Language Modeling and Unsupervised Word Segmentation (2001)
Fuchun Peng (Supervisor: Prof. Dale Schuurmans, Prof. Frank Tompa)
The sparse data problem is one of the most important problems in natural language processing. In this thesis, we are focusing on the sparse data problem in statistical language modeling and...
The Latent Maximum Entropy Principle (2001)
Shaojun Wang, Ronald Rosenfeld, Yunxin Zhao, Dale Schuurmans
In this paper, we present an extension of Jaynes' maximum entropy principle to handle latent variables. We use an EM algorithm that incorporates nested iterative scaling to calculate maximum entropy...
A Simple Closed-Class/Open-Class Factorization for Improved Language Modeling (2001)
We describe a simple improvement to n-gram language models where we estimate the distribution over closed-class (function) words separately from the conditional distribution of open-class words given...
A Hierarchical EM Approach to Word Segmentation (2001)
We propose a simple two-level hierarchical probability model for unsupervised word segmentation. By treating words as strings composed of morphemes/phonemes which are themselves composed of...
The Exponentiated Subgradient Algorithm for Heuristic Boolean Programming (2001)
Dale Schuurmans, Finnegan Southey, Robert C. Holte
Boolean linear programs (BLPs) are ubiquitous in AI. Satisfiability testing, planning with resource constraints, and winner determination in combinatorial auctions are all examples of this type of...
Direct value-approximation for factored MDPs (2001)
Dale Schuurmans, Relu Patrascu
We present a simple approach for computing near-optimal policies in factored Markov decision processes (MDPs), when the optimal value function can be approximated by a compact linear form. Our method...
Local search characteristics of incomplete SAT procedures (2001)
Dale Schuurmans, Finnegan Southey
Effective local search methods for finding satisfying assignments of CNF formulae exhibit several...
Local search characteristics of incomplete SAT procedures (2001)
Dale Schuurmans, Finnegan Southey
Effective local search methods for finding satisfying assignments of CNF formulae exhibit several systematic characteristics in their search. We identify a series of measurable characteristics of local...
Direct value-approximation for factored MDPs (2001)
Dale Schuurmans, Relu Patrascu
We present a simple approach for computing near-optimal policies
Self-supervised Chinese Word Segmentation (2001)
We propose a new unsupervised training method for acquiring...
Metric-Based Methods for Adaptive Model Selection and Regularization (2001)
Dale Schuurmans, Finnegan Southey
We present a general approach to model selection and regularization that exploits unlabeled data to adaptively control hypothesis complexity in supervised learning tasks. The idea is to impose a...
Efficient Exploration for Optimizing Immediate Reward (2001)
Dale Schuurmans, Lloyd Greenwald
We consider the problem of learning an effective behavior strategy from reward. Although much studied, the issue of how to use prior knowledge to scale optimal behavior learning up to real-world...
Metric-Based Methods for Adaptive Model Selection and Regularization (2000)
Dale Schuurmans, Finnegan Southey
We present a general approach to model selection and regularization that exploits unlabeled data to adaptively control hypothesis complexity in supervised learning tasks. The idea is to impose a...
Monte Carlo inference via greedy importance sampling (2000)
Dale Schuurmans, Finnegan Southey
We present a new method for conducting Monte Carlo inference in graphical models which combines explicit search with generalized importance sampling. The idea is to reduce the variance of importance...
General Convergence Results for Linear Discriminant Updates (2000)
Adam J. Grove, Nick Littlestone, Dale Schuurmans
The problem of learning linear-discriminant concepts can be solved by various mistake-driven update procedures, including the Winnow family of algorithms and the well-known Perceptron algorithm. In...
An Adaptive Regularization Criterion for Supervised Learning (2000)
We introduce a new regularization criterion that exploits unlabeled data to adaptively control hypothesis-complexity in general supervised learning tasks. The technique is based on an abstract...
Local search characteristics of incomplete SAT procedures (2000)
Dale Schuurmans, Finnegan Southey
Effective local search methods for finding satisfying assignments of CNF formulae exhibit several systematic characteristics in their search. We identify a series of measurable characteristics of...
Advances in Large Margin Classifiers (2000)
Alexander J. Smola, Peter Bartlett, Bernhard Schölkopf, Dale Schuurmans
Contents: Preface; 1. Introduction to Large Margin Classifiers (Alex J. Smola, Peter Bartlett, Bernhard Schölkopf, and Dale Schuurmans); 2. Large Margin Rank Boundaries for Ordinal Regression (Ralf...
An Adaptive Regularization Criterion for Supervised Learning (2000)
We introduce a new regularization criterion that exploits unlabeled training data to adaptively control hypothesis-complexity in general supervised learning tasks. The technique is based on an...
Greedy Importance Sampling (1999)
I present a simple variation of importance sampling that explicitly searches for important regions in the target distribution. I prove that the technique yields unbiased estimates, and show...
General Convergence Results for Linear Discriminant Updates (1999)
Adam J. Grove, Nick Littlestone, Dale Schuurmans
The problem of learning linear-discriminant concepts can be solved by various mistake-driven update procedures, including the Winnow family of algorithms and the well-known Perceptron algorithm. In...
Efficient Exploration for Optimizing Immediate Reward (1999)
Dale Schuurmans, Lloyd Greenwald
We consider the problem of learning an effective behavior strategy from reward. Although much studied, the issue of how to use prior knowledge to scale optimal behavior learning up to real-world...
Advances in Large Margin Classifiers (1999)
Alexander J. Smola, Peter Bartlett, Bernhard Schölkopf, Dale Schuurmans
this article also provide a website to obtain the data
On Learning Hierarchical Classifications (1999)
Russell Greiner, Adam Grove, Dale Schuurmans
Many significant real-world classification tasks involve a large number of categories which are arranged in a hierarchical structure; for example, classifying documents into subject categories under...