Peter Bartlett

Margin adaptive model selection in statistical learning (2008)

Sylvain Arlot, Peter Bartlett

A classical condition for fast learning rates is the margin condition, first introduced by Mammen and Tsybakov. We tackle in this paper the problem of adaptivity to this condition in the context of...

Learning the Kernel Matrix with Semi-Definite Programming (2002)

Peter Bartlett

Kernel-based learning algorithms work by embedding the data into a Euclidean space, and then searching for linear relations among the embedded data points. The embedding is performed implicitly, by...

Learning the Kernel Matrix (2002)

Gert Lanckriet, Nello Cristianini, Peter Bartlett, Laurent El Ghaoui, Michael I. Jordan

Kernel-based learning algorithms work by embedding the data into a Euclidean space, and then searching for linear relations among the embedded data points. The embedding is performed implicitly, by...

Generalization in Threshold Networks, Combined Decision Trees and Combined Mask Perceptrons (2002)

Llew Mason, Peter Bartlett, Mostefa Golea

We derive an upper bound on the generalization error of classifiers from a certain class of threshold networks. The bound depends on the margin of the classifier and the average complexity of the...

Boosting Algorithms as Gradient Descent (2002)

Llew Mason, Jonathan Baxter, Peter Bartlett, Marcus Frean

Much recent attention, both experimental and theoretical, has been focussed on classification algorithms which produce voted combinations of classifiers. Recent theoretical work has shown that the...

Direct Optimization of Margins Improves Generalization in Combined Classifiers (2002)

Llew Mason, Peter Bartlett, Jonathan Baxter

Cumulative training margin distributions for AdaBoost versus our "Direct Optimization Of Margins" (DOOM) algorithm on the Sonar data set.

Learning the Kernel Matrix with Semi-Definite Programming (2002)

Gert Lanckriet, Nello Cristianini, Peter Bartlett, Laurent El Ghaoui

Kernel-based learning algorithms work by embedding the data into a Euclidean space, and then searching for linear relations among the embedded data points. The embedding is performed implicitly, by...

Localized Rademacher Complexities (2002)

Peter Bartlett, Olivier Bousquet, Shahar Mendelson

In this article we investigate the behaviour of the global and local Rademacher averages. We present new error bounds which are based on the localized averages and indicate how data-dependent...

Direct Optimization of Margins Improves Generalization in Combined Classifiers (2001)

Llew Mason, Peter Bartlett, Jonathan Baxter

Cumulative training margin distributions for AdaBoost versus our "Direct Optimization Of Margins" (DOOM) algorithm. The dark curve is AdaBoost, the light curve is DOOM. DOOM sacrifices...

Sparse Greedy Gaussian Process Regression (2001)

Alex J. Smola, Peter Bartlett

We present a simple sparse greedy technique to approximate the maximum a posteriori estimate of Gaussian Processes with much improved scaling behaviour in the sample size m. In particular,...

Boosting the Margin: A New Explanation for the Effectiveness of Voting Methods (2001)

Robert E. Schapire, Yoav Freund, Peter Bartlett, Wee Sun Lee

One of the surprising recurring phenomena observed in experiments with boosting is that the test error of the generated classifier usually does not increase as its size becomes very large, and...

Boosting the Margin: A New Explanation for the Effectiveness of Voting Methods (2000)

Robert E. Schapire, Yoav Freund, Peter Bartlett, Wee Sun Lee

One of the surprising recurring phenomena observed in experiments with boosting is that the test error of the generated classifier usually does not increase as its size becomes very large, and...

Sparse Greedy Gaussian Process Regression (2000)

Alex J. Smola, Peter Bartlett

We present a simple sparse greedy technique to approximate the maximum a posteriori estimate of Gaussian Processes with much improved scaling behaviour in the sample size m. In particular,...

Model Selection and Error Estimation (2000)

Peter Bartlett, Stéphane Boucheron, Gábor Lugosi

We study model selection strategies based on penalized empirical loss minimization. We point out a tight relationship between error estimation and data-based complexity penalization: any good error...

Direct Gradient-Based Reinforcement Learning: II. Gradient Ascent Algorithms and Experiments (2000)

Jonathan Baxter, Lex Weaver, Peter Bartlett

In [2] we introduced GPOMDP, an algorithm for computing arbitrarily accurate approximations to the performance gradient of parameterized partially observable Markov decision processes (POMDPs). The...

Valid Generalisation from Approximate Interpolation (2000)

Martin Anthony, Peter Bartlett, Yuval Ishai, John Shawe-Taylor

Let $H$ and $C$ be sets of functions from domain $X$ to $\mathbb{R}$. We say that $H$ validly generalises $C$ from approximate interpolation if and only if for each $\eta > 0$ and $\epsilon, \delta \in (0, 1)$ there is $m_0(\eta, \epsilon, \delta)$...

Boosting Algorithms as Gradient Descent (2000)

Llew Mason, Jonathan Baxter, Peter Bartlett, Marcus Frean

We provide an abstract characterization of boosting algorithms as gradient descent on cost-functionals in an inner-product function space. We prove convergence of these functional-gradient-descent...

Advances in Large Margin Classifiers (2000)

Alexander J. Smola, Peter Bartlett, Bernhard Schölkopf, Dale Schuurmans

Contents: Preface. 1. Introduction to Large Margin Classifiers (Alex J. Smola, Peter Bartlett, Bernhard Schölkopf, and Dale Schuurmans). 2. Large Margin Rank Boundaries for Ordinal Regression (Ralf...

Direct Gradient-Based Reinforcement Learning: II. Gradient Ascent Algorithms and Experiments (1999)

Jonathan Baxter, Lex Weaver, Peter Bartlett

In [2] we introduced GPOMDP, an algorithm for computing arbitrarily accurate approximations to the performance gradient of parameterized partially observable Markov decision processes (POMDPs).

Classification on Proximity Data with LP-Machines (1999)

Thore Graepel, Ralf Herbrich, Bernhard Schölkopf, Alex Smola, Peter Bartlett

We provide a new linear program to deal with classification of data in the case of functions written in terms of pairwise proximities. This allows to avoid the problems inherent in using feature...

Shrinking the Tube: A New Support Vector Regression Algorithm (1999)

Bernhard Schölkopf, Peter Bartlett, Alex Smola

A new algorithm for Support Vector regression is described. For a priori chosen $\nu$, it automatically adjusts a flexible tube of minimal radius to the data such that at most a fraction $\nu$ of the data...

The Minimax Distortion Redundancy in Empirical Quantizer Design (1999)

Peter Bartlett

We obtain minimax lower and upper bounds for the expected distortion redundancy of empirically designed vector quantizers. We show that the mean squared distortion of a vector quantizer designed from...

Direct Gradient-Based Reinforcement Learning: II. Gradient Ascent Algorithms and Experiments (1999)

Jonathan Baxter, Lex Weaver, Peter Bartlett

In [2] we introduced GPOMDP, an algorithm for computing arbitrarily accurate approximations to the performance gradient of parameterized partially observable Markov decision processes (POMDPs). The...

Classification on Proximity Data with LP-Machines (1999)

Thore Graepel, Ralf Herbrich, Bernhard Schölkopf, Alex Smola, Peter Bartlett

We provide a new linear program to deal with classification of data in the case of data given in terms of pairwise proximities. This allows to avoid the problems inherent in using feature spaces with...

Boosting Algorithms as Gradient Descent (1999)

Llew Mason, Jonathan Baxter, Peter Bartlett, Marcus Frean

Much recent attention, both experimental and theoretical, has been focussed on classification algorithms which produce voted combinations of classifiers. Recent theoretical work has shown that the...

Boosting Algorithms as Gradient Descent in Function Space (1999)

Llew Mason, Jonathan Baxter, Peter Bartlett, Marcus Frean

Much recent attention, both experimental and theoretical, has been focussed on classification algorithms which produce voted combinations of classifiers. Recent theoretical work has shown that the...

Improved Generalization through Explicit Optimization of Margins (1999)

Peter Bartlett, Jonathan Baxter

Recent theoretical results have shown that the generalization performance of thresholded convex combinations of base classifiers is greatly improved if the underlying convex combination has large...

Advances in Large Margin Classifiers (1999)

Alexander J. Smola, Peter Bartlett, Bernhard Schölkopf, Dale Schuurmans

...this paper are taken from (Herbrich et al., 1999)

Boosting the Margin: A New Explanation for the Effectiveness of Voting Methods (1999)

Robert E. Schapire, Yoav Freund, Peter Bartlett, Wee Sun Lee

One of the surprising recurring phenomena observed in experiments with boosting is that the test error of the generated classifier usually does not increase as its size becomes very large, and...

Direct Optimization of Margins Improves Generalization in Combined Classifiers (1999)

Llew Mason, Peter Bartlett, Jonathan Baxter

Cumulative training margin distributions for AdaBoost versus our "Direct Optimization Of Margins" (DOOM) algorithm. The dark curve is AdaBoost, the light curve is DOOM. DOOM sacrifices...

Generalization in Threshold Networks, Combined Decision Trees and Combined Mask Perceptrons (1999)

Llew Mason, Peter Bartlett, Mostefa Golea

We derive an upper bound on the generalization error of classifiers from a certain class of threshold networks. The bound depends on the margin of the classifier and the average complexity of the...

Hardness Results for Neural Network Approximation Problems (1999)

Peter Bartlett

In this paper, we show that this approximation problem is hard for several neural network classes.

Boosting the margin: a new explanation for the effectiveness of voting methods (1998)

Peter Bartlett, Yoav Freund, Wee Sun Lee, Robert E. Schapire

One of the surprising recurring phenomena observed in experiments with boosting is that the test error of the generated classifier usually does not increase as its size becomes very large, and often...

Boosting the Margin: A New Explanation for the Effectiveness of Voting Methods (1998)

Robert E. Schapire, Yoav Freund, Peter Bartlett, Wee Sun Lee

One of the surprising recurring phenomena observed in experiments with boosting is that the test error of the generated hypothesis usually does not increase as its size becomes very large, and...

Boosting the Margin: A New Explanation for the Effectiveness of Voting Methods (1998)

Robert E. Schapire, Yoav Freund, Peter Bartlett, Wee Sun Lee

One of the surprising recurring phenomena observed in experiments with boosting is that the test error of the generated classifier usually does not increase as its size becomes very large, and...

Direct Optimization of Margins Improves Generalization in Combined Classifiers (1998)

Llew Mason, Peter Bartlett, Jonathan Baxter

Cumulative training margin distributions for AdaBoost versus our "Direct Optimization Of Margins" (DOOM) algorithm. The dark curve is AdaBoost, the light curve is DOOM. DOOM sacrifices...

Generalization Performance of Support Vector Machines and Other Pattern Classifiers (1998)

Peter Bartlett, John Shawe-Taylor

...this paper has been twofold. Firstly, we have stated the known results for high confidence bounds on the generalization error of SVMs in terms of the margin and number of support vectors. Secondly,...

The Minimax Distortion Redundancy in Empirical Quantizer Design (1998)

Peter Bartlett

We obtain minimax lower and upper bounds for the expected distortion redundancy of empirically designed vector quantizers. We show that the mean squared distortion of a vector quantizer designed from...

An Inequality for Uniform Deviations of Sample Averages from their Means (1998)

Peter Bartlett, Gábor Lugosi

We derive a new inequality for uniform deviations of averages from their means. The inequality is a common generalization of previous results of Vapnik and Chervonenkis (1974) and Pollard (1986)....

Boosting the Margin: A New Explanation for the Effectiveness of Voting Methods (1997)

Robert E. Schapire, Yoav Freund, Peter Bartlett, Wee Sun Lee

One of the surprising recurring phenomena observed in experiments with boosting is that the test error of the generated hypothesis usually does not increase as its size becomes very large, and often...

A Result Relating Convex N-Widths to Covering Numbers With Some Applications to Neural Networks (1997)

Jonathan Baxter, Peter Bartlett

In general, approximating classes of functions defined over high-dimensional input spaces by linear combinations of a fixed set of basis functions or "features" is known to be hard. Typically, the...

The Minimax Distortion Redundancy in Empirical Quantizer Design (1997)

Peter Bartlett

We obtain minimax lower and upper bounds for the expected distortion redundancy of empirically designed vector quantizers. We show that the mean squared distortion of a vector quantizer designed from...

A Result Relating Convex N-Widths to Covering Numbers With Some Applications to Neural Networks (1997)

Jonathan Baxter, Peter Bartlett

In general, approximating classes of functions defined over high-dimensional input spaces by linear combinations of a fixed set of basis functions or "features" is known to be hard. Typically, the...

Error and Variance Bounds on Sigmoidal Neurons with Weight and Input Errors (1997)

David Lovell, Peter Bartlett, Tom Downs

We derive bounds on the expectation and variance of errors at the output of a multi-layer feedforward neural network with perturbed weights and inputs. It is assumed that errors in weights and inputs...

The Minimax Distortion Redundancy in Empirical Quantizer Design (1997)

Peter Bartlett, Tamás Linder, Gábor Lugosi

We obtain minimax lower and upper bounds for the expected distortion redundancy of empirically designed vector quantizers. We show that the mean squared distortion of a vector quantizer designed from...

A Minimax Lower Bound for Empirical Quantizer Design (Extended Abstract) (1996)

Peter Bartlett

We obtain a minimax lower bound for the expected distortion of empirically designed vector quantizers. We show that the mean squared distortion of any empirically designed vector quantizer is at...

Exponential Convergence of a Gradient Descent Algorithm for a Class of Recurrent Neural Networks (1996)

Peter Bartlett, Soura Dasgupta

We investigate the convergence properties of a gradient descent learning algorithm for a class of recurrent neural networks. The networks compute an affine combination of nonlinear (sigmoidal)...

A Framework for Structural Risk Minimisation (1996)

John Shawe-Taylor, Peter Bartlett, Robert Williamson, Martin Anthony

The paper introduces a framework for studying structural risk minimisation.

Function Learning from Interpolation (1994)

Martin Anthony, Peter Bartlett

In this paper, we study a statistical property of classes of real-valued functions that we call approximation from interpolated examples. We derive a characterization of function classes that have...

Valid Generalisation from Approximate Interpolation (1994)

Martin Anthony, Peter Bartlett, Yuval Ishai, John Shawe-Taylor

Let $H$ and $C$ be sets of functions from domain $X$ to $\mathbb{R}$. We say that $H$ validly generalises $C$ from approximate interpolation if and only if for each $\eta > 0$ and $\epsilon, \delta \in (0, 1)$ there is $m_0(\eta, \epsilon, \delta)$...

Error and Variance Bounds on Sigmoidal Neurons with Weight and Input Errors (1992)

David Lovell, Peter Bartlett, Tom Downs

We derive bounds on the expectation and variance of errors at the output of a multi-layer feedforward neural network with perturbed weights and inputs. It is assumed that errors in weights and inputs...