Near-optimal Regret Bounds for Reinforcement Learning (2007)
Auer, Peter, Jaksch, Thomas, Ortner, Ronald
For undiscounted reinforcement learning we consider the total regret of a learning algorithm in respect to an optimal policy. We present a reinforcement learning algorithm with total regret O( DS...
Improved Rates for the Stochastic Continuum-Armed Bandit Problem (2007)
Auer, Peter, Ortner, Ronald, Szepesvari, Csaba
Considering one-dimensional continuum-armed bandit problems, we propose an improvement of an algorithm of Kleinberg and a new set of conditions which give rise to improved rates. In particular, we...
Learning with Malicious Noise (2007)
This problem is concerned with PAC learning of concept classes when training examples are effected by malicious errors. The malicious noise model extends the PAC model by allowing example errors of...
In the on-line learning model the learner needs to make predictions sequentially, one after the other, and receives a reward or loss after each prediction. Typically, the learner receives some input...
Retos en materia de empleo y respuestas de política en Argentina, Brasil y México (2007)
Ernst, Christoph, Berg, Janine, Auer, Peter
En este trabajo se argumenta que para proporcionar trabajo decente -entendido como trabajo formal con cobertura de seguridad social- en Argentina, Brasil y México las políticas económicas y...
Hannan consistency in on-line learning in case of unbounded losses under partial monitoring (2006)
Allenberg, Chamy, Auer, Peter, Györfi, Laszlo, Ottucsak, György
In this paper the sequential prediction problem with expert advice is considered when the loss is unbounded under partial monitoring scenarios. We deal with a wide class of the partial monitoring...
Logarithmic Online Regret Bound for Undiscounted Reinforcement Learning (2006)
We present a learning algorithm for undiscounted reinforcement learning. Our interest lies in bounds for the algorithm's online performance after some finite number of steps. In the spirit of similar...
Logarithmic Online Regret Bounds for Undiscounted Reinforcement Learning (2006)
We present a learning algorithm for undiscounted reinforcement learning. Our interest lies in bounds for the algorithm's online performance after some finite number of steps. In the spirit of similar...
A new PAC bound for intersection-closed concept classes (2006)
For hyper-rectangles in R^d Auer (1997) proved a PAC bound of O((d + log1/delta)/epsilon), where epsilon and delta are the accuracy and confidence parameters. It is still an open question whether one...
A reasoning system to track movements of totally occluded objects (2006)
Antenreiter, Martin, Auer, Peter
We present a system which is able to track objects under partial and total occlusion. The occlusion and movement reasoning uses abstract object descriptions for inference and is based on an...
Visual Classification of Images by Learning Geometric Appearances through Boosting (2006)
Antenreiter, Martin, Savu-Krohn, Christian, Auer, Peter
We present a multiclass classification system for gray value images through boosting. The feature selection is done using the LPBoost algorithm which selects suitable features of adequate type. In...
Visual classification of images by learning geometric appearances through boosting (2006)
Antenreiter, Martin, Savu-Krohn, Christian, Auer, Peter
We present a multiclass classification system for gray value images through boosting. The feature selection is done using the LPBoost algorithm which selects suitable features of adequate type. In...
A distributed voting scheme to maximize preferences (2006)
Auer, Peter, Cesa-Bianchi, Nicolò
We study the problem of designing a distributed voting scheme for electing a candidate that maximizes the preferences of a set of agents. We assume the preference of agent i for candidate j is a real...
Competitive Reinforcement Learning (2005)
We present a new algorithm for undiscounted reinforcement learning. In contrast to the usual convergence analysis we bound the regret of our algorithm: we compare the total reward received by our...
A boosting approach to multiple instance learning (2005)
In this paper we present a boosting approach to multiple-instance learning. As weak hypotheses we use balls (with respect to various metrics) centered at instances of positive bags. For the metric...
Models for Trading Exploration and Exploitation using Upper Confidence Bounds (2005)
Lecture on the use of upper confidence bounds in online learning
A Simple Feature Extraction for High Dimensional Image Representations (2005)
Savu-Krohn, Christian, Auer, Peter
We investigate a method to find local clusters in low dimensional subspaces of high dimensional data, e.g. in high dimensional image descriptions. Using cluster centers instead of the full set of...
Delayed self-repairs as a structuring device for complex turns in conversation (2005)
This paper looks into the interactional basis of the grammatical format of parentheticals. It will be argued that such a basis can be found in abandoned or broken off units of talk in conversation...
In diesem Beitrag werden grundlegende Eigenschaften der mündlichen Interaktion und ihre Konsequenzen für die Theorie der Syntax der Gesprochenen Sprache diskutiert.Mündliche Interaktion hat zum...
Generic object recognition with boosting (2005)
Opelt, Andreas, Fussenegger, Michael, Pinz, Axel, Auer, Peter
This paper presents a powerful framework for generic object recognition. Boosting is used as an underlying learning technique. We use a combination of various weak classifiers of different types of...
On the loss version of the adversarial multi-armed bandit problem (2005)
The Loss Bandit game is the loss variant of the adversarial multi-armed bandit problem. It is carried out in T iterations. At the beginning of any iteration an adversary assigns losses from [0,1] to...
Auer, Peter, Burgsteiner, Harald, Maass, Wolfgang
One may argue that the simplest type of neural networks beyond a single perceptron is an array of several perceptrons in parallel. We refer to such circuits as parallel perceptrons. In spite of their...
Online Regret Bounds for a New Reinforcement Learning Algorithm (2005)
We present a new learning algorithm for undiscounted finite-step reinforcement learning with restarts. Our algorithm is based on upper confidence bounds and achieves logarithmic online regret in...
A Simple Feature Extraction for High Dimensional Image Representations (2005)
Savu-Krohn, Christian, Auer, Peter
We investigate a method to find local clusters in low dimensional subspaces of high dimensional data, e.g. in high dimensional image descriptions. Using cluster centers instead of the full set of...
Object recognition using geometric properties and a variant of boosting (2005)
Antenreiter, Martin, Auer, Peter
This paper describes an approach for learning object descriptions as combinations of simple features using labeled still images. The contribution of this paper is a new method for constructing...
Online Regret Bounds for a New Reinforcement Learning Algorithm (2005)
We present a new learning algorithm for undiscounted finite-step reinforcement learning with restarts. Our algorithm is based on upper confidence bounds and achieves logarithmic online regret in...
Online Regret Bounds for a New Reinforcement Learning Algorithm (2005)
We present a new learning algorithm for undiscounted finite-step reinforcement learning with restarts. Unlike many other analyses in reinforcement learning we are not only interested in convergence...
A boosting approach to multiple instance learning (2004)
In this paper we present a boosting approach to multiple instance learning. As weak hypotheses we use balls (with respect to various metrics) centered at instances of positive bags. For the...
Generic object recognition with boosting (2004)
Opelt, Andreas, Fussenegger, Michael, Pinz, Axel, Auer, Peter
This paper presents a powerful framework for generic object recognition. Boosting is used as an underlying learning technique. For the first time a combination of various weak classifiers of...
Object recognition using segmentation for feature detection (ICPR) (2004)
Fussenegger, Michael, Opelt, Andreas, Pinz, Axel, Auer, Peter
A new method is presented to learn object categories from unlabeled and unsegmented images for generic object recognition. We assume that each object can be characterized by a set of typical regions,...
A new PAC bound for intersection-closed concept classes (2004)
For hyper-rectangles in R^d Auer et al. proved a PAC bound of O((d+log 1/delta)/epsilon), where epsilon and delta are the accuracy and confidence parameters. It is still an open question whether one...
Object recognition using segmentation for feature detection (OeAGM) (2004)
Fussenegger, Michael, Opelt, Andreas, Pinz, Axel, Auer, Peter
A new method is presented to learn object categories from unlabeled and unsegmented images for generic object recognition. We assume that each object can be characterized by a set of typical regions,...
Weak hypotheses and boosting for generic object detection and recognition (2004)
Opelt, Andreas, Fussenegger, Michael, Pinz, Axel, Auer, Peter
In this paper we describe the first stage of a new learning system for object detection and recognition. For our system we propose Boosting [5] as the underlying learning technique. This allows the...
Projection in interaction and projection in grammar (2003)
One of the central theoretical concerns of present-day linguistics is the question ofwhether grammatical knowledge represents an autonomous component of thehuman mind and is therefore independent of...
Die Entstehung von Diskursmarkern im Deutschen : ein Fall von Grammatikalisierung? (2003)
Auer, Peter, Günthner, Susanne
Das Deutsche verfügt - wie viele andere Sprachen auch - über eine Gruppe von sprachlichenZeichen, die optional sind, hauptsächlich oder ausschließlich in der gesprochenen Sprachevorkommen und...
We show how a standard tool from statistics --- namely confidence bounds --- can be used to elegantly deal with situations which exhibit an exploitation-exploration trade-o#. Our technique for...
Schweitzer,G. E., Long,F. A., Hughes,Robert, Auer,Peter, Smith,Julian
This report presents the highlights of recently completed studies of the contributions to industrial development of science and technology institutions in three developing nations. Common problems...
The Non-Stochastic Multi-Armed Bandit Problem (2001)
Peter Auer, Yoav Freund, Robert E. Schapire
In the multi-armed bandit problem, a gambler must decide which arm of K non-identical slot machines to play in a sequence of trials so as to maximize his reward. This classical problem has received...
Hoch ansetzende' Intonationskonturen in der Hamburger Regionalvarietät (2001)
In diesem Beitrag wird eine Familie von Intonationskonturen beschrieben, die als typisch für die norddeutsche Umgangssprache im Raum Hamburg gelten können. Sie ist charakterisiert durch ihren -...
Adaptive and Self-Confident On-Line Learning Algorithms (2000)
We study on-line learning in the linear regression framework. Most of the performance bounds for on-line algorithms in this framework assume a constant learning rate. To achieve these bounds the...
Finite-time Analysis of the (2000)
Peter Auer, Paul Fischer, Lehrstuhl Informatik
Reinforcement learning policies face the exploration versus exploitation dilemma, i.e. the search for a balance between exploring the environment to nd pro table actions while taking the empirically...
A Learning Rule for Universal Approximators with a Single Non-Linearity (2000)
Harald M. Burgsteiner, Peter Auer, Wolfgang Maass
A learning algorithm is presented for circuits consisting of a single soft winner-take-all or k-winner-take-all gate applied to linear sums. We show that for these circuits gradient descent with...
An Improved On-line Algorithm for Learning Linear Evaluation Functions (2000)
We improve and extend results on a learning model where an algorithm has to make a sequence of choices based on an evaluation function [Lon97]. This evaluation function has to be learned on-line from...
Adaptive and Self-Confident On-Line Learning Algorithms (2000)
Most of the performance bounds for on-line learning algorithms are proven assuming a constant learning rate. To optimize these bounds, the learning rate must be tuned based on quantities that are...
Pre- and post-positioning of wenn-clauses in spoken and written German (2000)
This paper is concerned with the distinction between pre- and postpositioned (initial and final) wenn-clauses in German, and with the distinction between written and spoken language. A simple...
Peter Auer, Mark Herbster, Manfred K. Warmuth, Surrey Tw Ex
We show that for a single neuron with the logistic function as the transfer function the number of local minima of the error function based on the square loss can grow exponentially in the dimension....
Kurt Hornik, Maxwell Stinchcombe, Halbert White, Peter Auer
Recently Barron (1993) has given rates for hidden layer feedforward networks with sigmoid activation functions approximating a class of functions satisfying a certain smoothness condition. These...
J. Kivinen, M. K. Warmuth, Peter Auer
We give an adversary strategy that forces the Perceptron algorithm to make OmegaGamma kN ) mistakes in learning monotone disjunctions over N variables with at most k literals. In contrast,...
Worst-Case Analysis of the Bandit Problem (2000)
The multi-armed bandit is a classical problem in the area of sequential decision theory and has been studied under a variety of statistical assumptions. In this work we investigate the bandit problem...
Peter Auer, Robert C. Holte, Wolfgang Maass
We exhibit a theoretically founded algorithm T2 for agnostic PAClearning of decision trees of at most 2 levels, whose computation time is almost linear in the size of the training set. We evaluate...
Gambling in a Rigged Casino: (2000)
Peter Auer, Yoav Freund, Robert E. Schapire
In the multi-armed bandit problem, a gambler must decide which arm of K non-identical slot machines to play in a sequence of trials so as to maximize his reward. This classical problem has received...
Gambling in a rigged casino: The adversarial multi-armed bandit problem (1999)
Peter Auer, Nicolo Cesa-bianchi, Yoav Freund, Robert E. Schapire
In the multi-armed bandit problem, a gambler must decide which arm of K non-identical slot machines to play in a sequence of trials so as to maximize his reward. This classical problem has received...
Approximating Hyper-Rectangles: Learning and Pseudo-random Sets (1998)
Peter Auer, Philip M. Long, Aravind Srinivasan
The PAC learning of rectangles has been studied because they have been found experimentally to yield excellent hypotheses for several applied learning problems. Also, pseudorandom sets for rectangles...
Unification in the Combination of Disjoint Theories (1998)
We consider unifaction modulo some equational theory E: Given are terms s; t 2 T (E) built from the signature Sigma(E) of E and from variables x in V. A substitution unifies s; t if oe(s) jE oe(t),...
Peter Auer, Stephen Kwek, Wolfgang Maass, Manfred K. Warmuth
We present algorithms for learning depth two neural networks where the hidden nodes are threshold gates with constant fan-in. The transfer function of the output node might be more general: we have...
On the Complexity of Function Learning (1998)
Peter Auer, Philip M. Long, Wolfgang Maass, Gerhard J. Woeginger
The majority of results in computational learning theory are concerned with concept learning, i.e. with the special case of function learning for classes of functions with range f0; 1g. Much less is...
Gambling in a rigged casino: The adversarial multi-armed bandit problem (1998)
Peter Auer, Nicolo Cesa-bianchi, Yoav Freund, Robert E. Schapire
In the multi-armed bandit problem, a gambler must decide which arm of K non-identical slot machines to play in a sequence of trials so as to maximize his reward. This classical problem has received...
On-line Learning of Rectangles in Noisy Environments (1998)
We investigate the implications of noise in the equivalence query model. Besides some results for general target and hypotheses classes, we prove bounds on the learning complexity of d- dimensional...
Exponentially Many Local Minima for Single Neurons (1998)
Peter Auer, Mark Herbster, Manfred K. Warmuth
We show that for a single neuron with the logistic function as the transfer function the number of local minima of the error function based on the square loss can grow exponentially in the dimension....
Solving String Equations with Constant Restrictions (1998)
We present a variant and extension of Makanin's [5] decision algorithm for string equations: Given are sets C = fc 1 ; : : : ; c m g of constants, V = fx 1 ; : : : ; x n g of variables and a string...
Simulating Access to Hidden Information while Learning (1998)
We introduce a new technique which enables a learner without access to hidden information to learn nearly as well as a learner with access to hidden information. We apply our technique to solve an...
Theory and Applications of Agnostic PAC-Learning with Small Decision Trees (1998)
Peter Auer, Robert C. Holte, Wolfgang Maass
We exhibit a theoretically founded algorithm T2 for agnostic PAC-learning of decision trees of at most 2 levels, whose computation time is almost linear in the size of the training set. We evaluate...
On-line Learning with Malicious Noise and the Closure Algorithm (1998)
. We investigate a variant of the on-line learning model for classes of f0; 1g-valued functions (concepts) in which the labels of a certain amount of the input instances are corrupted by adversarial...
On Learning from Multi-Instance Examples: Empirical Evaluation of a Theoretical Approach (1998)
We describe a practical algorithm for learning axis-parallel high-dimensional boxes from multi-instance examples. The first solution to this practical learning problem arising in drug design was...
Tracking the Best Disjunction (1998)
Peter Auer, Manfred K. Warmuth
Littlestone developed a simple deterministic on-line learning algorithm for learning k-literal disjunctions. This algorithm (called Winnow)keeps one weight for each of the n variables and does...
Approximating Hyper-Rectangles: Learning and Pseudo-random Sets (1998)
Peter Auer, Philip M. Long, Aravind Srinivasan
The PAC learning of rectangles has been studied because they have been found experimentally to yield excellent hypotheses for several applied learning problems. Also, pseudorandom sets for rectangles...
Learning Nested Differences in the Presence of Malicious Noise (1998)
. We investigate the learnability of nested differences of intersection-closed classes in the presence of malicious noise. Examples of intersection-closed classes include axis-parallel rectangles,...
Gambling in a rigged casino: The adversarial multi-armed bandit problem (1998)
Peter Auer, Nicolo Cesa-bianchi, Yoav Freund, Robert E. Schapire
In the multi-armed bandit problem, a gambler must decide which arm of K non-identical slot machines to play in a sequence of trials so as to maximize his reward. This classical problem has received...
Gambling in a rigged casino: The adversarial multi-armed bandit problem (1998)
Peter Auer, Yoav Freund, Robert E. Schapire
In the multi-armed bandit problem, a gambler must decide which arm of K non-identical slot machines to play in a sequence of trials so as to maximize his reward. This classical problem has received...
Exponentially Many Local Minima for Single Neurons (1998)
Peter Auer, Mark Herbster, Manfred K. Warmuth
We show that for a single neuron with the logistic function as the transfer function the number of local minima of the error function based on the square loss can grow exponentially in the dimension....
Tracking the Best Disjunction (1998)
Peter Auer, Manfred K. Warmuth
. Littlestone developed a simple deterministic on-line learning algorithm for learning k-literal disjunctions. This algorithm (called Winnow) keeps one weight for each of the n variables and does...
Relative Loss Bounds for Entropy Based Updates (Preliminary Bibliography) (1998)
References [1] N. Littlestone. Learning quickly when irrelevant attributes abound: A new linear-threshold algorithm. Machine Learning, 2:285--318, 1988. [2] N. Littlestone. Mistake Bounds and...
Tight Bounds on the Cumulative Profit of Distributed Voters (1997)
Peter Auer, Pasquale Caianiello, Nicol Cesa-bianchi
In this paper we study the problem of making distributed decisions with the goal of maxim izing a given utility function. We show that a set of voters can make, via an election, a nearly optimal...
Learning of Depth Two Neural Networks with Constant Fan-in at the Hidden Nodes (extended ) (1997)
Peter Auer, Stephen Kwek, Wolfgang Maass, Manfred K. Warmuth
We present algorithms for learning depth two neural networks where the hidden nodes are threshold gates with constant fan-in. The transfer function of the output node might be more general: we have...
Peter Auer, Stephen Kwek, Wolfgang Maass, Manfred K. Warmuth
We present algorithms for learning depth two neural networks where the hidden nodes are threshold gates with constant fan-in. The transfer function of the output node might be more general: we have...
Learning of Depth Two Neural Networks with Constant Fan-in at the Hidden Nodes (extended ) (1997)
Peter Auer, Stephen Kwek, Wolfgang Maass, Manfred K. Warmuth
We present algorithms for learning depth two neural networks where the hidden nodes are threshold gates with constant fan-in. The transfer function of the output node might be more general: we have...
Theory and Applications of Agnostic PAC-learning with Small Decision Trees (1997)
Peter Auer, Robert C. Holte, Wolfgang Maass
We exhibit a new algorithm T2 for agnostic PAC-learning with decision trees of at most 2-levels, whose computation time is almost linear in the size of the training set. We evaluate the performance...
Gambling in a rigged casino: The adversarial multi-armed bandit problem (1996)
Peter Auer, Nicolo Cesa-bianchi, Yoav Freund, Robert E. Schapire
In the multi-armed bandit problem, a gambler must decide which arm of K non-identical slot machines to play in a sequence of trials so as to maximize his reward. This classical problem has received...
Theory and Applications of Agnostic (1996)
Peter Auer, Robert C. Holte, Wolfgang Maass
We exhibit a theoretically founded algorithm T2 for agnostic PAClearning of decision trees of at most 2 levels, whose computation time is almost linear in the size of the training set. We evaluate...
Exponentially Many Local Minima for Single Neurons (1996)
Peter Auer, Mark Herbster, Manfred K. Warmuth
We show that for a single neuron with the logistic function as the transfer function the number of local minima of the error function based on the square loss can grow exponentially in the dimension....
Tracking the Best Disjunction (1995)
Peter Auer, Manfred K. Warmuth
Littlestone developed a simple deterministic on-line learning algorithm for learning k-literal disjunctions. This algorithm (called Winnow)keeps one weight for each of the n variables and does...