Dale Schuurmans

Dynamic Web Log Session Identification with Statistical Language Models (2004)

Xiangji Huang, Fuchun Peng, Aijun An, Dale Schuurmans

We present a novel session identification method based on statistical language modeling.

Applying Machine Learning to Text (2004)

Xiangji Huang, Fuchun Peng, Dale Schuurmans, Nick Cercone, Stephen E. Robertson

We propose a self-supervised word segmentation technique for text segmentation in Chinese information retrieval. This method combines the advantages of traditional dictionary based, character based...

Combining Statistical Language Models via the Latent Maximum Entropy Principle (2004)

Shaojun Wang, Dale Schuurmans, Fuchun Peng, Yunxin Zhao

In this paper, we present a unified probabilistic framework for statistical language modeling which can simultaneously incorporate various aspects of natural language, such as local word interaction,...

Using Self-Supervised Word Segmentation in (2004)

Fuchun Peng, Xiangji Huang, Dale Schuurmans, Nick Cercone, Stephen Robertson

We propose a self-supervised word-segmentation technique for Chinese information retrieval. This method combines the advantages of traditional dictionary based approaches with character based...

Regret-based Utility Elicitation in Constraint-based Decision Problems (2004)

Craig Boutilier, Relu Patrascu, Pascal Poupart, Dale Schuurmans

Constraint-based optimization requires the formulation of a precise objective function. However, in many circumstances, the objective is to maximize the utility of a specific user among the space of...

Boosting in the limit: Maximizing the margin of learned ensembles (2004)

Adam J. Grove, Dale Schuurmans

The "minimum margin" of an ensemble classifier on a given training set is, roughly speaking, the smallest vote it gives to any correct training label. Recent work has shown that the Adaboost...

Investigating the Relationship between Word Segmentation (2004)

Fuchun Peng, Xiangji Huang, Dale Schuurmans, Nick Cercone

It is commonly believed that word segmentation accuracy is monotonically related to retrieval performance in Chinese information retrieval. In this paper we show that, for Chinese, the relationship...

Language and Task Independent Text Categorization (2004)

Fuchun Peng, Dale Schuurmans, Shaojun Wang

We present a simple method for language independent and task independent text categorization learning, based on character-level n-gram language models. Our approach uses simple information theoretic...

Regularized Greedy Importance Sampling (2003)

Finnegan Southey, Dale Schuurmans, Ali Ghodsi

Greedy importance sampling is an unbiased estimation technique that reduces the variance of standard importance sampling by explicitly searching for modes in the estimation objective. Previous work...

Automatic Complexity Control for (2003)

Ali Ghodsi, Dale Schuurmans

As a prerequisite for system identification based on c-mean clustering (FCM), it is necessary to assign the number of underlying partitions to be used for a given data set. However, for the FCM...

Automatic basis selection for RBF networks using Stein's unbiased risk estimator (2003)

Ali Ghodsi, Dale Schuurmans

The problem of selecting the appropriate number of basis functions is a critical issue for radial basis function neural networks. An RBF network with an overly restricted basis gives poor predictions...

Automatic Basis Selection Techniques for RBF Networks (2003)

Ali Ghodsi, Dale Schuurmans

This paper proposes a generic criterion that defines the optimum number of basis functions for radial basis function neural networks. The generalization performance of an RBF network relates to its...

Constraint-based Optimization with the Minimax (2003)

Craig Boutilier, Relu Patrascu, Pascal Poupart, Dale Schuurmans

In many situations, a set of hard constraints encodes the feasible configurations of some system or product over which users have preferences. We consider the problem of computing a best feasible...

Boltzmann Machine Learning with (2003)

Shaojun Wang, Dale Schuurmans, Fuchun Peng, Yunxin Zhao

We present a new statistical learning paradigm for Boltzmann machines based on a new inference principle we have proposed: the latent maximum entropy principle (LME). LME is different both from Jaynes'...

Learning Mixture Models with the Latent Maximum Entropy Principle (2003)

Shaojun Wang, Dale Schuurmans, Fuchun Peng, Yunxin Zhao

We present a new approach to estimating mixture models based on a new inference principle we have proposed: the latent maximum entropy principle (LME). LME is different both from Jaynes' maximum...

Data Perturbation for Escaping Local Maxima in Learning (2003)

Gal Elidan, Matan Ninio, Nir Friedman, Dale Schuurmans

Almost all machine learning algorithms -- be they for regression, classification or density estimation -- seek hypotheses that optimize a score on training data. In most interesting cases, however,...

Text Classification in Asian Languages without Word Segmentation (2003)

Fuchun Peng, Xiangji Huang, Dale Schuurmans, Shaojun Wang

We present a simple approach for Asian language text classification without word segmentation, based on statistical n-gram language modeling. In particular, we examine Chinese and Japanese text...

Text Classification in Asian Languages without Word Segmentation (2003)

Fuchun Peng, Xiangji Huang, Dale Schuurmans, Shaojun Wang

We present a simple approach for Asian language text classification without word segmentation, based on statistical language modeling. In particular, we examine Chinese and Japanese text...

Boltzmann Machine Learning with the Latent Maximum Entropy Principle (2003)

Shaojun Wang, Dale Schuurmans, Fuchun Peng, Yunxin Zhao

We present a new statistical learning paradigm for Boltzmann machines based on a new inference principle we have proposed: the latent maximum entropy principle (LME). LME is different both from Jaynes'...

Language and Task Independent Text Categorization with Simple Language Models (2003)

Fuchun Peng, Dale Schuurmans, Shaojun Wang

We present a simple method for language independent and task independent text categorization learning, based on character-level n-gram language models. Our approach uses simple information theoretic...

Language and Task Independent Text Categorization (2003)

Fuchun Peng, Dale Schuurmans, Shaojun Wang

We present a simple method for language independent and task independent text categorization learning, based on character-level n-gram language models. Our approach uses simple information theoretic...

Language Independent Authorship Attribution using Character Level Language Models (2003)

Fuchun Peng, Dale Schuurmans, Vlado Keselj, Shaojun Wang

We present a method for computer-assisted authorship attribution based on character-level n-gram language models.

Waterloo at NTCIR-3: Using Self-supervised Word Segmentation (2003)

Xiangji Huang, Fuchun Peng, Dale Schuurmans, Nick Cercone

In this paper, we describe the system we use in the NTCIR-3 CLIR (cross language IR) task. We participate in the SLIR (single language IR) track. In our system, we use a self-supervised...

Combining Naive Bayes and n-Gram Language Models for Text Classification (2003)

Fuchun Peng, Dale Schuurmans

We augment the naive Bayes model with an n-gram language model to address two shortcomings of naive Bayes text classifiers.

Combining Naive Bayes and n-Gram Language (2003)

Fuchun Peng, Dale Schuurmans

We augment the naive Bayes model with an n-gram language model to address two shortcomings of naive Bayes text classifiers.

Session Boundary Detection for Association (2002)

Xiangji Huang, Fuchun Peng, Aijun An, Dale Schuurmans, Nick Cercone

We present a statistical method using n-gram language models to identify session boundaries in a large collection of Livelink log data.

Waterloo at NTCIR-3: Using Self-supervised Word Segmentation (2002)

Xiangji Huang, Fuchun Peng, Dale Schuurmans, Nick Cercone

In this paper, we describe the system we use in the NTCIR-3 CLIR (cross language IR) task. We participate in the SLIR (single language IR) track. In our system, we use a self-supervised...

Session Boundary Detection for Association Rule Learning Using n-Gram Language Models (2002)

Xiangji Huang, Fuchun Peng, Aijun An, Dale Schuurmans, Nick Cercone

We present a statistical method using n-gram language models to identify session boundaries in a large collection of Livelink log data.

The Latent Maximum Entropy Principle (2002)

Shaojun Wang, Dale Schuurmans, Yunxin Zhao

We present an extension to Jaynes' maximum entropy principle that handles latent variables. The principle of latent maximum entropy we propose is different from both Jaynes' maximum entropy principle...

Using Self-Supervised Word Segmentation in Chinese Information Retrieval (2002)

Fuchun Peng, Xiangji Huang, Dale Schuurmans, Nick Cercone, Stephen Robertson

We propose a self-supervised word-segmentation technique for Chinese information retrieval. This method combines the advantages of traditional dictionary based approaches with character based...

Investigating the Relationship between Word Segmentation Performance and Retrieval Performance in Chinese IR (2002)

Fuchun Peng, Xiangji Huang, Dale Schuurmans, Nick Cercone

It is commonly believed that word segmentation accuracy is monotonically related to retrieval performance in Chinese information retrieval. In this paper we show that, for Chinese, the relationship...

Latent Maximum Entropy Approach for Semantic N-gram Language Modeling (2002)

Shaojun Wang, Dale Schuurmans, Fuchun Peng

In this paper, we describe a unified probabilistic framework for statistical language modeling -- the latent maximum entropy principle -- which can effectively incorporate various aspects of natural...

Semantic N-Gram Language Modeling With The Latent Maximum Entropy Principle (2002)

Shaojun Wang, Dale Schuurmans, Fuchun Peng, Yunxin Zhao

In this paper, we describe a unified probabilistic framework for statistical language modeling -- the latent maximum entropy principle -- which can effectively incorporate various aspects of natural...

Combining Naive Bayes and n-Gram Language Models for Text Classification (2002)

Fuchun Peng, Dale Schuurmans

We augment the naive Bayes model with an n-gram language model to address two shortcomings of naive Bayes text classifiers.

Latent Maximum Entropy Approach for (2002)

Shaojun Wang, Dale Schuurmans, Fuchun Peng

In this paper, we describe a unified probabilistic framework for statistical language modeling -- the latent maximum entropy principle -- which can effectively incorporate various aspects of natural language,...

Probabilistic Hill-Climbing (2002)

William W. Cohen, Russell Greiner, Dale Schuurmans

Many learning tasks involve searching through a discrete space of performance elements, seeking an element whose future utility is expected to be high. As the task of finding the global optimum is...

On the Existence of Linear Weak Learners and Applications to Boosting (2002)

Ron Meir, Yoshua Bengio, Dale Schuurmans

We consider the existence of a linear weak learner for boosting algorithms. A weak learner for binary classification problems is required to achieve a weighted empirical error on the training set...

Greedy linear value-approximation for factored Markov decision processes (2002)

Relu Patrascu, Pascal Poupart, Dale Schuurmans, Craig Boutilier, Carlos Guestrin

Significant recent work has focused on using linear representations to approximate value functions for factored Markov decision processes (MDPs). Current research has adopted linear programming as an...

Piecewise Linear Value Function Approximation for Factored MDPs (2002)

Pascal Poupart, Craig Boutilier, Relu Patrascu, Dale Schuurmans

A number of proposals have been put forth in recent years for the solution of Markov decision processes (MDPs) whose state (and sometimes action) spaces are factored.

Algorithm-Directed Exploration for Model-Based Reinforcement Learning in Factored MDPs (2002)

Carlos Guestrin, Relu Patrascu, Dale Schuurmans

One of the central challenges in reinforcement learning is to balance the exploration/exploitation tradeoff while scaling up to large problems. Although model-based reinforcement learning has been...

Investigating the Maximum Likelihood Alternative to TD(lambda) (2002)

Fletcher Lu, Relu Patrascu, Dale Schuurmans

The study of value estimation in Markov reward processes has been dominated by research on temporal difference methods since the introduction of TD(0) in 1988. Temporal difference methods are often...

Data Perturbation for Escaping Local Maxima in Learning (2002)

Gal Elidan, Matan Ninio, Nir Friedman, Dale Schuurmans

Almost all machine learning algorithms -- be they for regression, classification or density estimation -- seek hypotheses that optimize a score on training data. In most interesting cases, however,...

Direct value-approximation for factored MDPs (2002)

Dale Schuurmans, Relu Patrascu

We present a simple approach for computing reasonable policies for factored Markov decision processes (MDPs), when the optimal value function can be approximated by a compact linear form.

Investigating the Maximum Likelihood Alternative to ... (2002)

Fletcher Lu, Relu Patrascu, Dale Schuurmans

The study of value estimation in Markov reward processes has been dominated by research on temporal difference methods since the introduction of TD(0) in 1988. Temporal difference methods are often...

A Hierarchical EM Approach to Word Segmentation (2002)

Fuchun Peng, Dale Schuurmans

We propose a simple two-level hierarchical probability model for unsupervised word segmentation. By treating words as strings composed of morphemes/phonemes which are themselves composed of...

A Simple Closed-Class/Open-Class Factorization for Improved (2002)

Fuchun Peng, Dale Schuurmans

We describe a simple improvement to n-gram language models where we estimate the distribution over closed-class (function) words separately from the conditional distribution of open-class words...

The Sparse Data Problem in Statistical Language Modeling and Unsupervised Word Segmentation (2001)

Fuchun Peng (supervisors: Prof. Dale Schuurmans, Prof. Frank Tompa)

The sparse data problem is one of the most important problems in natural language processing. In this thesis, we are focusing on the sparse data problem in statistical language modeling and...

The Latent Maximum Entropy Principle (2001)

Shaojun Wang, Ronald Rosenfeld, Yunxin Zhao, Dale Schuurmans

In this paper, we present an extension of Jaynes' maximum entropy principle to handle latent variables. We use an EM algorithm that incorporates nested iterative scaling to calculate maximum entropy...

A Simple Closed-Class/Open-Class Factorization for Improved Language Modeling (2001)

Fuchun Peng, Dale Schuurmans

We describe a simple improvement to n-gram language models where we estimate the distribution over closed-class (function) words separately from the conditional distribution of open-class words given...

A Hierarchical EM Approach to Word Segmentation (2001)

Fuchun Peng, Dale Schuurmans

We propose a simple two-level hierarchical probability model for unsupervised word segmentation. By treating words as strings composed of morphemes/phonemes which are themselves composed of...

The Exponentiated Subgradient Algorithm for Heuristic Boolean Programming (2001)

Dale Schuurmans, Finnegan Southey, Robert C. Holte

Boolean linear programs (BLPs) are ubiquitous in AI. Satisfiability testing, planning with resource constraints, and winner determination in combinatorial auctions are all examples of this type of...

Direct value-approximation for factored MDPs (2001)

Dale Schuurmans, Relu Patrascu

We present a simple approach for computing near-optimal policies in factored Markov decision processes (MDPs), when the optimal value function can be approximated by a compact linear form. Our method...

Local search characteristics of incomplete SAT procedures (2001)

Dale Schuurmans, Finnegan Southey

Effective local search methods for finding satisfying assignments of CNF formulae exhibit several...

Local search characteristics of incomplete SAT procedures (2001)

Dale Schuurmans, Finnegan Southey

Effective local search methods for finding satisfying assignments of CNF formulae exhibit several systematic characteristics in their search. We identify a series of measurable characteristics of local...

Self-supervised Chinese Word Segmentation (2001)

Fuchun Peng, Dale Schuurmans

We propose a new unsupervised training method for acquiring...

Metric-Based Methods for Adaptive Model Selection and Regularization (2001)

Dale Schuurmans, Finnegan Southey

We present a general approach to model selection and regularization that exploits unlabeled data to adaptively control hypothesis complexity in supervised learning tasks. The idea is to impose a...

Efficient Exploration for Optimizing Immediate Reward (2001)

Dale Schuurmans, Lloyd Greenwald

We consider the problem of learning an effective behavior strategy from reward. Although much studied, the issue of how to use prior knowledge to scale optimal behavior learning up to real-world...

Metric-Based Methods for Adaptive Model Selection and Regularization (2000)

Dale Schuurmans, Finnegan Southey

We present a general approach to model selection and regularization that exploits unlabeled data to adaptively control hypothesis complexity in supervised learning tasks. The idea is to impose a...

Monte Carlo inference via greedy importance sampling (2000)

Dale Schuurmans, Finnegan Southey

We present a new method for conducting Monte Carlo inference in graphical models which combines explicit search with generalized importance sampling. The idea is to reduce the variance of importance...

General Convergence Results for Linear Discriminant Updates (2000)

Adam J. Grove, Nick Littlestone, Dale Schuurmans

The problem of learning linear-discriminant concepts can be solved by various mistake-driven update procedures, including the Winnow family of algorithms and the well-known Perceptron algorithm. In...

An Adaptive Regularization Criterion for Supervised Learning (2000)

Dale Schuurmans

We introduce a new regularization criterion that exploits unlabeled data to adaptively control hypothesis-complexity in general supervised learning tasks. The technique is based on an abstract...

Local search characteristics of incomplete SAT procedures (2000)

Dale Schuurmans, Finnegan Southey

Effective local search methods for finding satisfying assignments of CNF formulae exhibit several systematic characteristics in their search. We identify a series of measurable characteristics of...

Advances in Large Margin Classifiers (2000)

Alexander J. Smola, Peter Bartlett, Bernhard Scholkopf, Dale Schuurmans

Contents: Preface. 1. Introduction to Large Margin Classifiers, by Alex J. Smola, Peter Bartlett, Bernhard Scholkopf, and Dale Schuurmans. 2. Large Margin Rank Boundaries for Ordinal Regression, by Ralf...

An Adaptive Regularization Criterion for Supervised Learning (2000)

Dale Schuurmans

We introduce a new regularization criterion that exploits unlabeled training data to adaptively control hypothesis-complexity in general supervised learning tasks. The technique is based on an...

Greedy Importance Sampling (1999)

Dale Schuurmans

I present a simple variation of importance sampling that explicitly searches for important regions in the target distribution. I prove that the technique yields unbiased estimates, and show...

General Convergence Results for Linear Discriminant Updates (1999)

Adam J. Grove, Nick Littlestone, Dale Schuurmans

The problem of learning linear-discriminant concepts can be solved by various mistake-driven update procedures, including the Winnow family of algorithms and the well-known Perceptron algorithm. In...

Efficient Exploration for Optimizing Immediate Reward (1999)

Dale Schuurmans, Lloyd Greenwald

We consider the problem of learning an effective behavior strategy from reward. Although much studied, the issue of how to use prior knowledge to scale optimal behavior learning up to real-world...

On Learning Hierarchical Classifications (1999)

Russell Greiner, Adam Grove, Dale Schuurmans

Many significant real-world classification tasks involve a large number of categories which are arranged in a hierarchical structure; for example, classifying documents into subject categories under...