Bayesian Reinforcement Learning for Coalition Formation under Uncertainty (2004)
Georgios Chalkiadakis, Craig Boutilier
Research on coalition formation usually assumes the values of potential coalitions to be known with certainty. Furthermore, settings in which agents lack sufficient knowledge of the capabilities of...
Nevin L. Zhang, Craig Boutilier
Latent class models are used for cluster analysis of categorical data. Underlying such a model is the assumption that the observed variables are mutually independent given the class variable. A...
Bayesian Reinforcement Learning for Coalition Formation under Uncertainty (2004)
Georgios Chalkiadakis, Craig Boutilier
Research on coalition formation usually assumes the values of potential coalitions to be known with certainty. Furthermore, settings in which agents lack sufficient knowledge of the capabilities of...
Regret Minimizing Equilibria and Mechanisms for Games with Strict Type Uncertainty (2004)
Nathanael Hyafil, Craig Boutilier
Mechanism design has found considerable application to the construction of agent-interaction protocols. In the standard setting, the type (e.g., utility function) of an agent is not known by other...
Bayesian Reinforcement Learning for Coalition Formation under Uncertainty (2004)
Georgios Chalkiadakis, Craig Boutilier
Research on coalition formation usually assumes the values of potential coalitions to be known with certainty. Furthermore, settings in which agents lack sufficient knowledge of the capabilities of...
Regret-based Utility Elicitation in Constraint-based Decision Problems (2004)
Craig Boutilier, Relu Patrascu, Pascal Poupart, Dale Schuurmans
Constraint-based optimization requires the formulation of a precise objective function. However, in many circumstances, the objective is to maximize the utility of a specific user among the space of...
Stochastic Local Search for POMDP Controllers (2004)
Darius Braziunas, Craig Boutilier
The search for finite-state controllers for partially observable Markov decision processes (POMDPs) is often based on approaches like gradient ascent, attractive because of their relatively low...
Eliciting Bid Taker Non-price Preferences in (Combinatorial) Auctions (2004)
Craig Boutilier, Tuomas Sandholm, Rob Shields
Recent algorithms provide powerful solutions to the problem of determining cost-minimizing (or revenue-maximizing) allocations of items in combinatorial auctions. However, in many settings, criteria...
Eliciting Bid Taker Non-price Preferences in (Combinatorial) Auctions (2004)
Craig Boutilier, Tuomas Sandholm, Rob Shields
Recent algorithms provide powerful solutions to the problem of determining cost-minimizing (or revenue-maximizing) allocations of items in combinatorial auctions. However, in many settings, criteria...
Craig Boutilier, Ronen I. Brafman, Holger H. Hoos, David Poole
Information about user preferences plays a key role in automated decision making. In many domains it is desirable to assess such preferences in a qualitative rather than quantitative way. In this...
Craig Boutilier, Ronen I. Brafman, Holger H. Hoos, David Poole
Information about user preferences plays a key role in automated decision making. In many domains it is desirable to assess such preferences in a qualitative rather than quantitative way. In this...
Imitation can be viewed as a means of enhancing learning in multiagent environments.
Bounded Finite State Controllers (2003)
Pascal Poupart, Craig Boutilier
We describe a new approximation algorithm for solving partially observable MDPs. Our bounded policy iteration approach searches through the space of bounded-size, stochastic finite state controllers,...
Imitation can be viewed as a means of enhancing learning in multiagent environments.
Craig Boutilier, Ronen I. Brafman, Holger H. Hoos, David Poole
Information about user preferences plays a key role in automated decision making. In many domains it is desirable to assess such preferences in a qualitative rather than quantitative way. In this...
Junling Hu, Michael P. Wellman, Craig Boutilier
We extend Q-learning to a noncooperative multiagent context, using the framework of generalsum stochastic games. A learning agent maintains Q-functions over joint actions, and performs updates based...
Robert Castelo, Tomas Kocka, Craig Boutilier
Two or more Bayesian network structures are Markov equivalent when the corresponding acyclic digraphs encode the same set of conditional independencies. Therefore, the search space of Bayesian...
Constraint-based Optimization with the Minimax (2003)
Craig Boutilier, Relu Patrascu, Pascal Poupart, Dale Schuurmans
In many situations, a set of hard constraints encodes the feasible configurations of some system or product over which users have preferences. We consider the problem of computing a best feasible...
Robert Castelo, Tomas Kocka, Craig Boutilier
Two or more Bayesian network structures are Markov equivalent when the corresponding acyclic digraphs encode the same set of conditional independencies. Therefore, the search space of Bayesian...
Cooperative Negotiation in Autonomic Systems using Incremental Utility (2003)
Craig Boutilier, Rajarshi Das, Jeffrey O. Kephart, Gerald Tesauro, William E. Walsh
Decentralized resource allocation is a key problem for large-scale autonomic (or self-managing) computing systems. Motivated by a data center scenario, we explore efficient techniques for resolving...
Active Collaborative Filtering (2003)
Craig Boutilier, Richard S. Zemel, Benjamin Marlin
Collaborative filtering (CF) allows the preferences of multiple users to be pooled to make recommendations regarding unseen products. We consider in this paper the problem of online and interactive...
Value-Directed Compression of POMDPs (2003)
Pascal Poupart, Craig Boutilier
We examine the problem of generating state-space compressions of POMDPs in a way that minimally impacts decision quality. We analyze the impact of compressions on decision quality, observing that...
Coordination in Multiagent Reinforcement Learning: A (2003)
Georgios Chalkiadakis, Craig Boutilier
Much emphasis in multiagent reinforcement learning (MARL) research is placed on ensuring that MARL algorithms (eventually) converge to desirable equilibria. As in standard reinforcement learning,...
A Bayesian Approach to Imitation in Reinforcement Learning (2003)
In multiagent environments, forms of social learning such as teaching and imitation have been shown to aid the transfer of knowledge from experts to learners in reinforcement learning (RL). We recast...
Incremental Utility Elicitation with the Minimax Regret Decision Criterion (2003)
Utility elicitation is a critical function of any automated decision aid, allowing decisions to be tailored to the preferences of a specific user. However, the size and complexity of utility...
On the Foundations of Expected Expected Utility (2003)
Intelligent agents often need to assess user utility functions in order to make decisions on their behalf, or predict their behavior. When uncertainty exists over the precise nature of this utility...
Craig Boutilier, Ronen I. Brafman, Holger H. Hoos, David Poole
In many domains it is desirable to assess the preferences of users in a qualitative rather than quantitative way. Useful representations of qualitative preference orderings form an important...
Preference-based Constrained Optimization with CP-nets (2003)
Craig Boutilier, Ronen I. Brafman, Holger H. Hoos, David Poole
Many AI tasks, such as product configuration, decision support, and the construction of autonomous agents, involve a process of constrained optimization, that is, optimization of behavior or choices...
Online Queries for Collaborative Filtering (2003)
Craig Boutilier, Richard S. Zemel
Collaborative filtering allows the preferences of multiple users to be pooled in a principled way in order to make recommendations about products, services or information unseen by a specific user.
Coordination in Multiagent Reinforcement Learning: A Bayesian Approach (2003)
Georgios Chalkiadakis, Craig Boutilier
Much emphasis in multiagent reinforcement learning (MARL) research is placed on ensuring that MARL algorithms (eventually) converge to desirable equilibria. As in standard reinforcement learning,...
Value-directed Compression of POMDPs (2003)
Pascal Poupart, Craig Boutilier
We examine the problem of generating state-space compressions of POMDPs in a way that minimally impacts decision quality. We analyze the impact of compressions on decision quality, observing that...
David Maxwell Chickering, Craig Boutilier
In this paper we prove the so-called "Meek Conjecture". In particular, we show that if a is an independence map of another DAG then there exists a finite sequence of edge additions and covered edge...
Solving Concisely Expressed Combinatorial Auction Problems (2002)
Combinatorial auctions provide a valuable mechanism for the allocation of goods in settings where buyer valuations exhibit complex structure with respect to substitutability and complementarity. Most...
Greedy linear value-approximation for factored Markov decision processes (2002)
Relu Patrascu, Pascal Poupart, Dale Schuurmans, Craig Boutilier, Carlos Guestrin
Significant recent work has focused on using linear representations to approximate value functions for factored Markov decision processes (MDPs). Current research has adopted linear programming as an...
Piecewise Linear Value Function Approximation for Factored MDPs (2002)
Pascal Poupart, Craig Boutilier, Relu Patrascu, Dale Schuurmans
A number of proposals have been put forth in recent years for the solution of Markov decision processes (MDPs) whose state (and sometimes action) spaces are factored.
A POMDP Formulation of Preference Elicitation Problems (2002)
Preference elicitation is a key problem facing the deployment of intelligent systems that make or recommend decisions on the behalf of users. Since not all aspects of a utility function have the same...
Context-Specific Independence in Bayesian Networks (2002)
Craig Boutilier, Nir Friedman, Moises Goldszmidt, Daphne Koller
Bayesiannetworks provide a languagefor qualitatively representing the conditional independence properties of a distribution. This allows a natural and compact representation of the distribution,...
David Maxwell Chickering, Craig Boutilier
Two Bayesian-network structures are said to be equivalent if the set of distributions that can be represented with one of those structures is identical to the set of distributions that can be...
Symbolic Dynamic Programming for First-Order MDPs (2002)
Craig Boutilier, Ray Reiter, Bob Price
We present a dynamic programming approach for the solution of first-order Markov decisions processes.
Bidding Languages for Combinatorial Auctions (2001)
Craig Boutilier, Holger H. Hoos
Combinatorial auctions provide a valuable mechanism for the allocation of goods in settings where buyer valuations exhibit complex structure with respect to substitutability and complementarity. Most...
UCP-Networks: A Directed Graphical Representation of Conditional Utilities (2001)
Craig Boutilier, Ronen I. Brafman
We propose a directed graphical representation of utility functions, called UCP-networks, that combines aspects of two existing preference models: generalized additive models and CP-networks. The...
Value-Directed Sampling Methods for Monitoring POMDPs (2001)
Pascal Poupart, Luis E. Ortiz, Craig Boutilier
We consider the problem of approximate belief-state monitoring using particle filtering for the purposes of implementing a policy for a partially observable Markov decision process (POMDP). While...
Vector-space Analysis of Belief-state Approximation for POMDPs (2001)
Pascal Poupart, Craig Boutilier
We propose a new approach to value-directed belief state approximation for POMDPs. The valuedirected model allows one to choose approximation methods for belief state monitoring that have a small...
Partial-Order Planning with Concurrent Interacting Actions (2001)
Craig Boutilier, Ronen I. Brafman
In order to generate plans for agents with multiple actuators, agent teams, or distributed
Symbolic Dynamic Programming for First-Order MDPs (2001)
Craig Boutilier, Ray Reiter, Bob Price
We present a dynamic programming approach for the solution of first-order Markov decisions processes. This technique uses an MDP whose dynamics is represented in a variant of the situation calculus...
APRICODD: Approximate Policy Construction using Decision Diagrams (2001)
Robert St-aubin, Jesse Hoey, Craig Boutilier
We propose a method of approximate dynamic programming for Markov decision processes (MDPs) using algebraic decision diagrams (ADDs). We produce near-optimal value functions and policies with much...
SPUDD: Stochastic Planning using Decision Diagrams (2001)
Jesse Hoey, Robert St-aubin, Alan Hu, Craig Boutilier
Recently, structured methods for solving factored Markov decisions processes (MDPs) with large state spaces have been proposed recently to allow dynamic programming to be applied without the need for...
Partially-observable Markov decision processes provide a very general model for decision-theoretic planning problems, allowing the trade-offs between various courses of actions to be determined under...
APRICODD: Approximate Policy Construction using Decision Diagrams (2000)
Robert St-aubin, Jesse Hoey, Craig Boutilier
We propose a method of approximate dynamic programming for Markov decision processes (MDPs) using algebraic decision diagrams (ADDs). We produce near-optimal value functions and policies with much...
Reasoning With Conditional Ceteris Paribus Preference Statements (2000)
Craig Boutilier, Ronen I. Brafman, Holger H. Hoos, David Poole
In many domains it is desirable to assess the preferences of users in a qualitative rather than quantitative way. Such representations of qualitative preference orderings form an important component...
Decision-Theoretic Planning: Structural Assumptions and Computational Leverage (2000)
Craig Boutilier, Thomas Dean, Steve Hanks
Planning under uncertainty is a central problem in the study of automated sequential decision making, and has been addressed by researchers in many different fields, including AI planning, decision...
APRICODD: Approximate Policy Construction using Decision Diagrams (2000)
Robert St-aubin, Jesse Hoey, Craig Boutilier
We propose a method of approximate dynamic programming for Markov decision processes (MDPs) using algebraic decision diagrams (ADDs). We produce near-optimal value functions and policies with much...
Implicit Imitation in Multiagent Reinforcement Learning (2000)
Imitation is actively being studied as an effective means of learning in multi-agent environments.
Decision-Theoretic, High-level Agent Programming in the Situation Calculus (2000)
Craig Boutilier, Ray Reiter, Mikhail Soutchanski, Sebastian Thrun
We propose a framework for robot programming which allows the seamless integration of explicit agent programming with decision-theoretic planning. Specifically, the DTGolog model allows one to...
Partial-Order Planning with Concurrent Interacting Actions (2000)
Craig Boutilier, Ronen I. Brafman
In order to generate plans for agents with multiple actuators, agent teams, or distributed controllers, we must be able to represent and plan using concurrent actions with interacting effects. This...
Optimal and Approximate Stochastic Planning using Decision Diagrams (2000)
Jesse Hoey, Robert St-aubin, Alan Hu, Craig Boutilier
Structured methods for solving factored Markov decision processes (MDPs) with large state spaces have been proposed recently to allow dynamic programming to be applied without the need for complete...
Nondeterministic Actions and the Frame Problem (2000)
We describe a logical system and methodology for the natural specification of nondeterministic actions. The logic combines elements of dynamic logic, process logic and the situation calculus and...
Value-Directed Belief State Approximation for POMDPs (2000)
Pascal Poupart, Craig Boutilier
We consider the problem belief-state monitoring for the purposes of implementing a policy for a partially-observable Markov decision process (POMDP), specifically how one might approximate the belief...
Approximately Optimal Monitoring of Plan Preconditions (2000)
Monitoring plan preconditions can allow for replanning when a precondition fails, generally far in advance of the point in the plan where the precondition is relevant. However, monitoring is...
Imitation and Reinforcement Learning with Heterogeneous Actions (2000)
We study the problem of accelerating reinforcement learning through the observation and implicit imitation of expert agents (mentors) acting in the same domain. In this paper, we consider problems...
Stochastic Dynamic Programming with Factored Representations (2000)
Craig Boutilier, Richard Dearden, Moises Goldszmidt
Markov decision processes(MDPs) haveproven to be popular models for decision-theoretic planning, but standard dynamic programming algorithms for solving MDPs rely on explicit, state-based...
Stochastic Dynamic Programming with Factored Representations (2000)
Craig Boutilier, Richard Dearden, Moises Goldszmidt
Markov decision processes(MDPs) haveproven to be popular models for decision-theoretic planning, but standard dynamic programming algorithms for solving MDPs rely on explicit, state-based...
Implicit Imitation in Multiagent Reinforcement Learning (2000)
Imitation is actively being studied as an effective means of learning in multi-agent environments. It allows an agent to learn how to act well (perhaps optimally) by passively observing the actions...
Imitation and Reinforcement Learning in Agents with Heterogeneous Actions (2000)
We study the problem of accelerating reinforcement learning (RL) through the observation and implicit imitation of expert agents (mentors) acting in the same domain. In this paper, we consider...
Decision-Theoretic, High-level Agent Programming in the Situation Calculus (2000)
Craig Boutilier, Ray Reiter, Mikhail Soutchanski, Sebastian Thrun
We proposea framework for robot programming which allows the seamless integration of explicit agent programming with decision-theoretic planning. Specifically, the DTGolog model allows one to...
Solving Combinatorial Auctions using Stochastic Local Search (2000)
Holger H. Hoos, Craig Boutilier
Combinatorial auctions (CAs) have emerged as an important model in economics and show promise as a useful tool for tackling resource allocation in AI. Unfortunately, winner determination for CAs is...
Decision-Theoretic, High-level Agent Programming in the Situation Calculus (2000)
Craig Boutilier, Ray Reiter, Mikhail Soutchanski, Sebastian Thrun
We proposea framework for robot programming which allows the seamless integration of explicit agent programming with decision-theoretic planning. Specifically, the DTGolog model allows one to...
Context-Specific Independence in Bayesian Networks (1999)
Craig Boutilier, Nir Friedman, Moises Goldszmidt, Daphne Koller
Bayesiannetworks provide a languagefor qualitatively representing the conditional independence properties of a distribution. This allows a natural and compact representation of the distribution,...
Reasoning With Conditional Ceteris Paribus Preference Statements (1999)
Craig Boutilier, Ronen I. Brafman, Holger H. Hoos, David Poole
In many domains it is desirable to assess the preferences of users in a qualitative rather than quantitative way. Such representations of qualitative preference orderings form an important component...
Decision-Theoretic Planning: Structural Assumptions and Computational Leverage (1999)
Craig Boutilier, Thomas Dean, Steve Hanks
Planning under uncertainty is a central problem in the study of automated sequential decision making, and has been addressed by researchers in many different fields, including AI planning, decision...
Decision-Theoretic Planning: Structural Assumptions and Computational Leverage (1999)
Craig Boutilier, Thomas Dean, Steve Hanks
Planning under uncertainty is a central problem in the study of automated sequential decision making, and has been addressed by researchers in many different fields, including AI planning, decision...
Abstraction and Approximate Decision Theoretic Planning (1999)
Richard Dearden, Craig Boutilier
ion and Approximate Decision Theoretic Planning Richard Dearden and Craig Boutilier y Department of Computer Science University of British Columbia Vancouver, British Columbia CANADA, V6T 1Z4 email:...
Continuous Value Function Approximation for Sequential Bidding Policies (1999)
Craig Boutilier, Moises Goldszmidt, Bikash Sabata
Market-based mechanisms such as auctions are being studied as an appropriate means for resource allocation in distributed and multiagent decision problems. When agents value resources in combination...
Belief Revision with Unreliable Observations (1999)
Craig Boutilier, Nir Friedman, Joseph Y. Halpern
Research in belief revision has been dominated by work that lies firmly within the classic AGM paradigm, characterized by a well-known set of postulates governing the behavior of "rational " revision...
Belief Revision with Unreliable Observations (1999)
Craig Boutilier, Nit Friedman, Joseph Y. Halpern
Research in belief revision has been dominated by work that lies firmly within the classic AGM paradigm, characterized by a well-known set of postulates governing the behavior of "rational " revision...
Implicit Imitation in Multiagent Reinforcement Learning (1999)
Imitation is actively being studied as an effective means of learning in multi-agent environments. It allows an agent to learn how to act well (perhaps optimally) by passively observing the actions...
Sequential Auctions for the Allocation of Resources with Complementarities (1999)
Craig Boutilier, Moises Goldszmidt, Bikash Sabata
Market-based mechanisms such as auctions are being studied as an appropriate means for resource allocation in distributed and multiagent decision problems. When agents value resources in combination...
Continuous Value Function Approximation for Sequential Bidding Policies (1999)
Craig Boutilier, Moises Goldszmidt, Bikash Sabata
Market-based mechanisms such as auctions are being studied as an appropriate means for resource allocation in distributed and multiagent decision problems. When agents value resources in combination...
SPUDD: Stochastic Planning using Decision Diagrams (1999)
Jesse Hoey, Robert St-aubin, Alan Hu, Craig Boutilier
Recently, structured methods for solving factored Markov decisions processes (MDPs) with large state spaces have been proposed recently to allow dynamic programming to be applied without the need for...
Continuous Value Function Approximation for Sequential Bidding Policies (1999)
Craig Boutilier, Moises Goldszmidt, Bikash Sabata
Market-based mechanisms such as auctions are being studied as an appropriate means for resource allocation in distributed and multiagent decision problems. When agents value resources in combination...
Reasoning With Conditional Ceteris Paribus Preference Statements (1999)
Craig Boutilier, Ronen I. Brafman, Holger H. Hoos, David Poole
In many domains it is desirable to assess the preferences of users in a qualitative rather than quantitative way. Such representations of qualitative preference orderings form an important component...