Sanjoy Dasgupta

Random projection trees for vector quantization (2008)

Dasgupta, Sanjoy, Freund, Yoav

A simple and computationally efficient scheme for tree-structured vector quantization is presented. Unlike previous methods, its quantization error depends only on the intrinsic dimension of the data...

A Probabilistic Analysis of EM for Mixtures of Separated, Spherical Gaussians (2007)

Dasgupta, Sanjoy, Schulman, Leonard

We show that, given data from a mixture of k well-separated spherical Gaussians in ℜd, a simple two-round variant of EM will, with high probability, learn the parameters of the Gaussians to...

Analysis of Perceptron-Based Active Learning (2005)

Dasgupta, Sanjoy, Kalai, Adam Tauman, Monteleoni, Claire

We start by showing that in an active learning setting, the Perceptron algorithm needs $\Omega(\frac{1}{\epsilon^2})$ labels to learn linear separators within generalization error $\epsilon$. We then...

Analysis of Perceptron-Based Active Learning (2005)

Dasgupta, Sanjoy, Kalai, Adam Tauman, Monteleoni, Claire

We start by showing that in an active learning setting, the Perceptron algorithm needs $\Omega(\frac{1}{\epsilon^2})$ labels to learn linear separators within generalization error $\epsilon$. We then...

An Iterative Improvement Procedure for Hierarchical Clustering (2004)

David Kauchak, Sanjoy Dasgupta

We describe a procedure which finds a hierarchical clustering by hillclimbing.

A Theoretical Analysis of Query Selection for Collaborative (2002)

Sanjoy Dasgupta, Wee Sun Lee, Philip M. Long

We consider the problem of determining which of a set of experts has tastes most similar to a given user by asking the user questions about his likes and dislikes. We describe a simple algorithm for...

An Ecient PAC Algorithm for Reconstructing (2002)

Sanjoy Dasgupta, Elan Pavlov, Yoram Singer

In this paper we study the learnability of a mixture of lines model which is of great importance in machine vision, computer graphics, and computer aided design applications. The mixture of lines is...

The Complexity of Approximating the Entropy (2002)

Sanjoy Dasgupta, Ravi Kumar, Ronitt Rubinfeld

We consider the problem of approximating the entropy of a discrete distribution under several models. If the distribution is given explicitly as an array where the i-th location is the probability of...

A Generalization of Principal Component Analysis to the Exponential Family (2002)

Michael Collins, Sanjoy Dasgupta, Robert E. Schapire

Principal component analysis (PCA) is a commonly applied technique for dimensionality reduction. PCA implicitly minimizes a squared loss function, which may be inappropriate for data that is not...

A Generalization of Principal Component Analysis to the Exponential Family (2002)

Michael Collins, Sanjoy Dasgupta, Robert E. Schapire

Principal component analysis (PCA) is a commonly applied technique for dimensionality reduction. PCA implicitly minimizes a squared loss function, which may be inappropriate for data that is not...

A Generalization of Principal Component Analysis to the Exponential Family (2001)

Michael Collins, Sanjoy Dasgupta, Robert E. Schapire

Principal component analysis (PCA) is a commonly applied technique for dimensionality reduction. PCA implicitly minimizes a squared loss function, which may be inappropriate for data that is not...

Off-Policy Temporal-Difference Learning with Function Approximation (2001)

Doina Precup, Richard S. Sutton, Sanjoy Dasgupta

We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms the basis for...

Learning probability distributions /--by Sanjoy Dasgupta. (2000)

Dasgupta, Sanjoy.

Thesis (Ph. D. in Computer Science)--University of California, Berkeley, Spring 2000.

An elementary proof of the Johnson-Lindenstrauss Lemma (1999)

Sanjoy Dasgupta, Anupam Gupta

The Johnson-Lindenstrauss lemma shows that a set of n points in high dimensional Euclidean space can be mapped down into an O(log n=ffl 2 ) dimensional Euclidean space such that the distance between...

An elementary proof of the Johnson-Lindenstrauss Lemma (1999)

Sanjoy Dasgupta, Anupam Gupta

The Johnson-Lindenstrauss lemma shows that a set of n points in high dimensional Euclidean space can be mapped down into an O(log n=ffl 2 ) dimensional Euclidean space such that the distance between...

Cache Behaviour of the SPEC95 Benchmark Suite (1996)

Sanjoy Dasgupta, Edouard Servan-schreiber

Previous cache studies are becoming rather outdated, since they deal with older benchmarks (such as those in SPEC92) and smaller cache configurations than are now feasible. We remedy this situation...