Arjun Talwar
Advisor: Prof. Michael Littman

The Project

Games come in many varieties. Some are represented by trees, some by directed graphs, with cycles or without. Rewards in these games are either accumulated after every move (edge rewards) or come in a big bang at the end (leaf rewards). Some games have randomness (stochastic), some dont (deterministic). We are interested in all such games with two players. But also, we stay away from games where everybody doesn't know everything all the time,i.e. we're studying only games of Perfect Information. In particular, we are trying to find algorithms that find appropriate Nash Equilibria, i.e. Stable States of an arbitrary game in Polynomial time. By appropriate, I mean Equilibria with some constraints. For example- the equilibria that maximize the reward of a given player or the sum of both payoffs. The last one will be called the MaxUtil problem. There are related problems like MaxUtil for one player and Fairness.

Some Results

The set of two player perfect games by the above definitions are split into twelve, not usually disjoint classes. Here we use 'DAG' - Directed Acyclic Graph to refer to those game structures where a node is never revisited and Cyclic for those directed graph structures where a node may be revisited. Here the game might loop and loop. The twelve classes, accompanied by some of our results are:

Leaf Rewards, Deterministic Leaf Rewards, Stochastic Edge Rewards, Determinstic Edge Rewards, Stochastic
Tree Games Polytime for MaxUtil, MaxUtilA, MaxUtilB, Fairness ? Polytime by equivalence ?
DAG Games ? NP-Complete for MaxUtil, MaxUtilA, MaxUtilB, Fairness NP-Complete for MaxUtil NP-Complete for MaxUtil
Cyclic Games ? NP-Complete by extension NP-Complete by extension from above NP-Complete by extension

Some Details: The Polytime algorithm was developed by starting at the terminal nodes of the game tree, seeing which points (A's utility on one axis and B's on the other) could be part of equilibria and then moving up the tree each time adding or removing sets of points each time. The end result, in n squared time, is a complete picture of Nash Equilibrium outcomes, including probabilistic mixtures between different outcomes based on one players indifference (simply represented by a continuum, i.e. a line parallel to one of the axes)

The NP-Complete ones are reducible to the 3-SAT Problem. These results were obtained largely with the help of Marty,a post doc from Brown.
Some things about simple cycle games here
Some things to think about: Can we find an equivalent algorithm for stochastic tree games as the deterministic one? Perhaps by weighting payoffs with probabilities and what else?

Hindi Mein          Po-Russki

Other Research