Regret to the Best vs. Regret to the average. (2007)
Even-Dar, Eyal, Kearns, Michael, Mansour, Yishay, Wortman, Jenn
We study online regret minimization algorithms in a bicriteria setting, examining not only the standard notion of regret to the best expert, but also the regret to the average of all experts, the...
Economic Properties of Social Networks (2004)
Sham M. Kakade, Michael Kearns, Luis E. Ortiz, Robin Pemantle, Siddharth Suri
We examine the marriage of recent probabilistic generative models for social networks with classical frameworks from mathematical economics.
Sham M. Kakade, Michael Kearns, Luis E. Ortiz
We introduce a graph-theoretic generalization of classical Arrow-Debreu economics, in which an undirected graph specifies which consumers or economies are permitted to engage in direct trade, and the...
Competitive Algorithms for VWAP (2004)
Sham M. Kakade, Michael Kearns, Yishay Mansour, Luis E. Ortiz
We introduce new online models for two important aspects of modern financial markets: Volume Weighted Average Price trading and limit order books. We provide an extensive study of competitive...
IEEE IEEE INTELLIGENT SYSTEMS Published by the IEEE Computer Society (2003)
this article, available at www.cis. upenn.edu/~mkearns/papers/plat.pdf.) PXS simulations run in one of two modes. Live mode updates the Island data in real time from BookViewer approximately every...
The Penn-Lehman Automated Trading Project (2003)
We describe the Penn-LehmanAutomated Trading project, a broad investigation of algorithms and strategies for automated trading in financial markets that centers around the Penn Exchange Simulator...
Algorithms for Interdependent Security Games (2003)
this paper we examine such computational issues
The Penn-Lehman Automated Trading Project (2003)
We describe the Penn-LehmanAutomated Trading project, a broad investigation of algorithms and strategies for automated trading in financial markets that centers around the Penn Exchange Simulator...
Nash Propagation for Loopy Graphical Games (2003)
We introduce NashProp, an iterative and local message-passing algorithm for computing Nash equilibria in multi-player games represented by arbitrary undirected graphs. We provide a formal analysis...
Correlated Equilibria in Graphical Games (2003)
Michael Kearns, John Langford, Luis Ortiz
We examine correlated equilibria in the recently introduced formalism of graphical games, a succinct representation for multiplayer games. We establish a natural and powerful relationship between the...
Correlated Equilibria in Graphical Games (2003)
Michael Kearns, John Langford, Luis Ortiz
We examine correlated equilibria in the recently introduced formalism of graphical games, a succinct representation for multiplayer games. We establish a natural and powerful relationship between the...
Eric Allender, Sanjeev Arora, Michael Kearns, Cristopher Moore, Alexander Russell
We establish a new hardness result that shows that the difficulty of planning in factored Markov decision processes is representational rather than just computational. More precisely, we give a fixed...
Nash Propagation for Loopy Graphical Games (2003)
We introduce NashProp, an iterative and local message-passing algorithm for computing Nash equilibria in multi-player games represented by arbitrary undirected graphs. We provide a formal analysis...
A Note on the Representational Incompatibility (2002)
Eric Allender, Sanjeev Arora, Michael Kearns, Cristopher Moore, Alexander Russell
We establish a new hardness result that shows that the difficulty of planning in factored Markov decision processes is representational rather than just computational. More precisely, we give a fixed...
NJFun: A Reinforcement Learning Spoken Dialogue System (2002)
Diane Litman, Satinder Singh, Michael Kearns, Marilyn Walker
This paper describes NJFun, a real-time spoken dia- logue system'tkatprovides users with information about things to d in New Jersey. NJFun auto- rustically optimizes its dialogue strategy over time,...
Eric Allender, Sanjeev Arora, Michael Kearns, Cristopher Moore, Alexander Russell
We establish a new hardness result that shows that the difficulty of planning in factored Markov decision processes is representational rather than just computational. More precisely, we give a fixed...
Machine Learning, 27, 7--50 (1997) c (2002)
Michael Kearns, Dana Ron, M. Long
We investigate the problem of model selection in the setting of supervised learning of boolean functions from independent random examples. More precisely, we compare methods for finding a balance...
Efficient Nash Computation in Large Population Games (2002)
Michael Kearns, Yishay Mansour
We introduce a general representation of largepopulation games in which each player's influence on the others is centralized and limited, but may otherwise be arbitrary. This representation...
Cobot: A Social Reinforcement Learning Agent (2002)
Charles Lee Isbell, Jr. Christian, R. Shelton, Michael Kearns, Satinder Singh, Peter Stone
We report on the use of reinforcement learning with Cobot, a software agent residing in the well-known online community LambdaMOO. Our initial work on Cobot (Isbell et al.2000) provided him with the...
An Efficient, Exact Algorithm for Solving Tree-Structured Graphical Games (2002)
Michael L. Littman, Michael Kearns, Satinder Singh
We describe a new algorithm for computing a Nash equilibrium in graphical games, a compact representation for multi-agent systems that we introduced in previous work. The algorithm is the first to...
Optimizing Dialogue Management with Reinforcement Learning: Experiments with the NJFun System (2002)
Satinder Singh, Diane Litman, Michael Kearns
Designing the dialogue policy of a spoken dialogue system involves many nontrivial choices. This paper presents a reinforcement learning approach for automatically optimizing a dialogue policy, which...
A Social Reinforcement Learning Agent (2001)
Charles Lee Isbell, Jr. Christian, R. Shelton, Michael Kearns, Satinder Singh, Peter Stone
We report on our reinforcement learning work on Cobot, a software agent that resides in the well-known online chat community LambdaMOO. Our initial work on Cobot (Isbell et al., 2000) provided him...
Near-Optimal Reinforcement Learning in Polynomial Time (2001)
Michael Kearns, Satinder Singh
We present new algorithms for reinforcement learning and prove that they have polynomial bounds on the resources required to achieve near-optimal return in general Markov decision processes. After...
Weakly Learning DNF and Characterizing Statistical Query Learning Using Fourier Analysis (2001)
Avrim Blum, Merrick Furst, Michael Kearns, Yishay Mansour, Tel-aviv U, Steven Rudich, ...
We present new results on the well-studied problem of learning DNF expressions. We prove that an algorithm due to Kushilevitz and Mansour [13] can be used to weakly learn DNF formulas with membership...
Peter Stone, Michael L. Littman, Satinder Singh, Michael Kearns
The First Trading Agent Competition (TAC) was held from June 22nd to July 8th, 2000. TAC was designed to create a benchmark problem in the complex domain of emarketplaces and to motivate researchers...
Optimizing Dialogue Management with Reinforcement Learning: Experiments with the NJFun System (2001)
Satinder Singh, Diane Litman, Michael Kearns, Marilyn Walker
Designing the dialogue policy of a spoken dialogue system involves many nontrivial choices. This paper presents a reinforcement learning approach for automatically optimizing a dialogue policy, which...
Cobot: A Social Reinforcement Learning Agent (2001)
Michael Kearns, Satinder Singh, Peter Stone
We report on the use of reinforcement learning with Cobot, a software agent residing in the wellknown online community LambdaMOO. Our initial work on Cobot (Isbell et al.2000) provided him with the...
Optimizing Dialogue Management with Reinforcement Learning: Experiments with the NJFun System (2001)
Satinder Singh, Diane Litman, Michael Kearns, Marilyn Walker
Designing the dialogue policy of a spoken dialogue system involves many nontrivial choices. This paper presents a reinforcement learning approach for automatically optimizing a dialogue policy, which...
An Efficient Exact Algorithm for Singly Connected Graphical Games (2001)
Michael L. Littman, Michael Kearns, Satinder Singh
We describe a new algorithm for computing a Nash equilibrium in graphical games , a compact representation for multi-agent systems that we introduced in previous work. The algorithm is the rst to...
ATTac-2000: An Adaptive Autonomous Bidding Agent (2001)
Peter Stone, Michael L. Littman, Satinder Singh, Michael Kearns
The First Trading Agent Competition (TAC) was held from June 22nd to July 8th, 2000. TAC was designed to create a benchmark problem in the complex domain of emarketplaces and to motivate researchers...
Graphical Models for Game Theory (2001)
Michael Kearns, Michael L. Littman, Satinder Singh
We introduce a compact graph-theoretic representation for multi-party game theory. Our main result is a provably correct and efficient algorithm for computing approximate Nash equilibria in one-stage...
A Sparse Sampling Algorithm for Near-Optimal (2001)
Michael Kearns, Yishay Mansour, Andrew Y. Ng
A critical issue for the application of Markov decision processes (MDPs) to realistic problems is how the complexity of planning scales with the size of the MDP. In stochastic environments with very...
ATTac-2000: An Adaptive Autonomous Bidding Agent (2001)
Peter Stone, Michael L. Littman, Satinder Singh, Michael Kearns
The First Trading Agent Competition (TAC) was held from June 22nd to July 8th, 2000. TAC was designed to create a benchmark problem in the complex domain of emarketplaces and to motivate researchers...
Graphical Models for Game Theory (2001)
Michael Kearns, Michael L. Littman, Satinder Singh
We introduce a compact graph-theoretic representation for multi-party game theory. Our main result is a provably correct and efficient algorithm for computing approximate Nash equilibria in one-stage...
A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes (2001)
Michael Kearns, Yishay Mansour, Andrew Y. Ng
An issue that is critical for the application of Markov decision processes (MDPs) to realistic problems is how the complexity of planning scales with the size of the MDP. In stochastic environments...
Graphical Models for Game Theory (2001)
Michael Kearns, Michael L. Littman, Satinder Singh
We introduce a compact graph-theoretic representation for multi-party game theory. Our main result is a provably correct and efficient algorithm for computing approximate Nash equilibria in...
Graphical Models for Game Theory (2001)
Michael Kearns, Michael L. Littman, Satinder Singh
We introduce a compact graph-theoretic representation for multi-party game theory. Our main result is a provably correct and efficient algorithm for computing approximate Nash equilibria in...
Efficient Noise-Tolerant Learning From Statistical Queries (2001)
this paper, we study the extension of Valiant's learning model [32] in which the positive or negative classification label provided with each random example may be corrupted by random noise. This...
Yoav Freund, Michael Kearns, Yishay Mansour, Dana Ron, Ronitt Rubinfeld, Robert E. Schapire
this paper, games are played by two players. One will be called the adversary and the other the strategy learning algorithm. We think of the adversary as a fixed resource-bounded computational...
On the Boosting Ability of Top-Down (2001)
Michael Kearns, Yishay Mansour
We analyze the performance of top-down algorithms for decision tree learning, such as those employed by the widely used C4.5 and CART software packages. Our main result is a proof that such...
Boosting Theory Towards Practice: (2001)
or missing attributes in the data, the existence of a small "target" tree, and so on), computational learning theory has so far not provided Here we are in the PAC model, where there is no noise in...
Cryptographic Limitations on Learning (2001)
Michael Kearns, Leslie Valiant
In this paper we prove the intractability of learning several classes of Boolean functions in the distribution-free model (also called the Probably Approximately Correct or PAC model) of learning...
Exact Inference of Hidden Structure (2001)
Michael Kearns, Yishay Mansour
this paper, we will restrict our attention to distributions obtained by setting all of the p i to some common value p, and we shall denote the resulting joint distribution on the outputs Y j by N p
A Bound on the Error of Cross Validation Using the Approximation (2001)
We give an analysis of the generalization error of cross validation in terms of two natural measures of the difficulty of the problem under consideration: the approximation rate (the accuracy to...
A General Lower Bound on the Number (2001)
Andrzej Ehrenfeucht, David Haussler, Michael Kearns
We prove a lower bound of OmegaGamma ffl ) on the number of random examples required for distribution-free learning of a concept class C, where VCdim(C) is the Vapnik-Chervonenkis dimension and ffl...
On the Learnability of Boolean Formulae (2001)
Michael Kearns, Ming Li, Leonard Pitt
this paper fall into three categories: closure properties of learnable classes, negative results, and distribution-specific positive results
Efficient Reinforcement Learning in Factored MDPs (2001)
We present a provably efficient and near-optimal algorithm for reinforcement learning in Markov decision processes (MDPs) whose transition model can be factored as a dynamic Bayesian network (DBN).
An Experimental and Theoretical Comparison (2001)
Michael Kearns, Yishay Mansour, Andrew Y. Ng, Dana Ron
this paper is to provide such a comparison, and more importantly, to describe the general conclusions to which it has led. Relying on evidence that is divided between controlled experimental results...
Recent Results on Boolean Concept Learning (2001)
Michael Kearns, Ming Li, Leonard Pitt
Recently, a new formal model of learnability was introduced [23]. The model is applicable to practical learning systems because it requires the learning algorithm to be feasibly computable, yet at...
Learning From a Population of Hypotheses (2001)
Michael Kearns, H. Sebastian Seung
We introduce a new formal model in which a learning algorithm must combine a collection of potentially poor but statistically independent hypothesis functions in order to approximate an unknown...
Weakly Learning DNF and Characterizing Statistical (2001)
Avrim Blum, Merrick Furst, Jeffrey Jackson, Michael Kearns, Yishay Mansour, Steven Rudich
well-studied problem of learning disjunctive normal form (DNF) expressions.
Testing Problems with Sub-Learning Sample Complexity (2001)
We study the problem of determining, for a class of functions H, whether an unknown target function f is contained in H or is "far" from any function in H. Thus, in contrast to problems of learning,...
Michael Kearns, Yishay Mansour, Dana Ron, Ronitt Rubinfeld, Robert E. Schapire, Linda Sellie
this paper, we concentrate on discrete
Algorithmic Stability and Sanity-Check Bounds (2001)
In this paper we prove sanity-check bounds for the error of the leave-one-out crossvalidation estimate of the generalization error: that is, bounds showing that the worst-case error of this estimate...
Learning in the Presence of Malicious Errors (2001)
In this paper we study an extension of the distribution-free model of learning introduced by Valiant [23] (also known as the probably approximately correct or PAC model) that allows the presence of...
Cryptographic Primitives Based on Hard Learning Problems (2001)
Avrim Blum, Merrick Furst, Michael Kearns, Richard J. Lipton
Modern cryptography has had considerable impact on the development of computational learning theory. Tools from cryptography have been used in proving nearly all of the strong negative results for...
A Social Reinforcement Learning Agent (2001)
Michael Kearns, Satinder Singh, Peter Stone
We report on our reinforcement learning work on Cobot, a software agent that resides in the well-known online chat community LambdaMOO. Our initial work on Cobot (Isbell et al., 2000) provided him...
A Social Reinforcement Learning Agent (2001)
Charles Lee Isbell, Jr. Christian, R. Shelton, Michael Kearns, Satinder Singh, Peter Stone
We report on our reinforcement learning work on Cobot, a software agent that resides in the well-known online chat community LambdaMOO. Our initial work on Cobot (Isbell et al., 2000) provided him...
A Fast, Bottom-Up Decision Tree Pruning Algorithm with Near-Optimal Generalization (2000)
Michael Kearns, Yishay Mansour
In this work, we present a new bottom-up algorithm for decision tree pruning that is very efficient (requiring only a single pass through the given tree), and prove a strong performance guarantee for...
Exact Inference of Hidden Structure from Sample Data in Noisy-OR Networks (2000)
Michael Kearns, Yishay Mansour
this paper, we will restrict our attention to distributions obtained by setting all of the p i to some common value p, and we shall denote the resulting joint distribution on the outputs Y j by N p .
ATTac-2000: An Adaptive Autonomous Bidding Agent (2000)
Peter Stone, Michael L. Littman, Satinder Singh, Michael Kearns
The First International Trading Agent Competition (TAC) was held in Boston on July 8th, 2000. TAC was designed to create a benchmark problem in the complex domain of e-marketplaces and to motivate...
Efficient Reinforcement Learning in Factored MDPs (2000)
We present a provably efficient and near-optimal algorithm for reinforcement learning in Markov decision processes (MDPs) whose transition model can be factored as a dynamic Bayesian network (DBN).
Empirical Evaluation of a Reinforcement Learning Spoken Dialogue System (2000)
Satinder Singh, Michael Kearns, Diane J. Litman, Marilyn A. Walker
We report on the design, construction and empirical evaluation of a large-scale spoken dialogue system that optimizes its performance via reinforcement learning on human user dialogue data....
Fast Planning in Stochastic Games (2000)
Michael Kearns, Yishay Mansour, Satinder Singh
Stochastic games generalize Markov decision processes (MDPs) to a multiagent setting by allowing the state transitions to depend jointly on all player actions, and having rewards determined by...
Nash Convergence of Gradient Dynamics in General-Sum Games (2000)
Satinder Singh, Michael Kearns, Yishay Mansour
Multi-agent games are becoming an increasingly prevalent formalism for the study of electronic commerce and auctions. The speed at which transactions can take place and the growing complexity of...
Bias-Variance Error Bounds for Temporal Difference Updates (2000)
Michael Kearns, Satinder Singh
Temporal difference (td) algorithms are used in reinforcement learning to compute estimates of the value of a given policy in an unknown Markov decision process (policy evaluation). We give rigorous...
Empirical Evaluation of a Reinforcement Learning Spoken Dialogue System (2000)
Satinder Singh, Michael Kearns, Diane J. Litman, Marilyn A. Walker
We report on the design, construction and empirical evaluation of a large-scale spoken dialogue system that optimizes its performance via reinforcement learning on human user dialogue data....
Reinforcement Learning for Spoken Dialogue Systems (2000)
Satinder Singh, Michael Kearns, Diane Litman, Marilyn Walker
Recently, a number of authors have proposed treating dialogue systems as Markov decision processes (MDPs). However, the practical application of MDP algorithms to dialogue systems faces a number of...
Approximate Planning in Large POMDPs via Reusable Trajectories (2000)
Michael Kearns, Yishay Mansour, Andrew Y. Ng
We consider the problem of reliably choosing a near-best strategy from a restricted class of strategies in a partially observable Markov decision process (POMDP). We assume we are given the ability...
A Sparse Sampling Algorithm for Near-Optimal (2000)
Michael Kearns, Yishay Mansour, Andrew Y. Ng
An issue that is critical for the application of Markov decision processes (MDPs) to realistic problems is how the complexity of planning scales with the size of the MDP. In stochastic environments...
Value planning assistant :--assistance for the development of project design objectives. (2000)
Thesis (M.S.)--Worcester Polytechnic Institute.
Approximate Planning in Large POMDPs via Reusable Trajectories (1999)
Michael Kearns, Yishay Mansour, Andrew Y. Ng
We consider the problem of choosing a near-best strategy from a restricted class of strategies Pi in a partially observable Markov decision process (POMDP). We assume we are given the ability to...
"Bias-Variance" Error Bounds for Temporal Difference Updates (1999)
Michael Kearns, Satinder Singh
We give the first rigorous upper bounds on the error of temporal difference (td) algorithms for policy evaluation as a function of the amount of experience. These upper bounds prove exponentially...
Approximate Planning in Large POMDPs via Reusable Trajectories (1999)
Michael Kearns, Yishay Mansour, Andrew Y. Ng
We consider the problem of reliably choosing a near-best strategy from a restricted class of strategies in a partially observable Markov decision process (POMDP). We assume we are given the ability...
Reinforcement Learning for (1999)
Satinder Singh, Michael Kearns, Diane Litman, Marilyn Walker
Recently, a number of authors have proposed treating dialogue systems as Markov decision processes (MDPs). However, the practical application of MDP algorithms to dialogue systems faces a number of...
Reinforcement Learning for Spoken Dialogue Systems (1999)
Satinder Singh, Michael Kearns, Diane Litman, Marilyn Walker
Recently, a number of authors have proposed treating dialogue systems as Markov decision processes (MDPs). However, the practical application of MDP algorithms to dialogue systems faces a number of...
Near-Optimal Reinforcement Learning in Polynomial Time (1999)
Michael Kearns, Satinder Singh
We present new algorithms for reinforcement learning, and prove that they have polynomial bounds on the resources required to achieve near-optimal return in general Markov decision processes. After...
Reinforcement Learning for Spoken Dialogue Systems (1999)
Satinder Singh, Michael Kearns, Diane Litman, Marilyn Walker
Recently, a number of authors have proposed treating dialogue systems as Markov decision processes (MDPs). However, the practical application of MDP algorithms to dialogue systems faces a number of...
Reinforcement Learning for Spoken Dialogue Systems (1999)
Satinder Singh, Michael Kearns, Diane Litman, Marilyn Walker
Recently, a number of authors have proposed treating dialogue systems as Markov decision processes (MDPs). However, the practical application of MDP algorithms to dialogue systems faces a number of...
A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes (1999)
Michael Kearns, Yishay Mansour, Andrew Y. Ng
An issue that is critical for the application of Markov decision processes (MDPs) to realistic problems is how the complexity o