Kesheng Wu

Two Strategies to Speed up Connected Component Labeling Algorithms (2008)

Wu, Kesheng, Otoo, Ekow, Suzuki, Kenji

This paper presents two new strategies to speed up connected component labeling algorithms. The first strategy employs a decision treeto minimize the work performed in the scanning phase of connected...

Breaking the Curse of Cardinality on Bitmap Indexes (2008)

Wu, Kesheng

Bitmap indexes are known to be efficient for ad-hoc range queries that are common in data warehousing and scientific applications. However, they suffer from the curse of cardinality, that is, their...

Performances of Multi-Level and Multi-Component Compressed Bitmap Indices (2007)

Wu, Kesheng, Stockinger, Kurt, Shoshani, Arie

This paper presents a systematic study of two large subsets of bitmap indexing methods that use multi-component and multi-level encodings. Earlier studies on bitmap indexes are either empirical or...

Efficient Analysis of Live and Historical Streaming Data and its Application to Cybersecurity (2007)

Reiss, Frederick, Stockinger, Kurt, Wu, Kesheng, Shoshani, Arie, Hellerstein, Joseph M.

Applications that query data streams in order to identify trends, patterns, or anomalies can often benefit from comparing the live stream data with archived historical stream data. However, searching...

Using Bitmap Indexing Technology for Combined Numerical and Text Queries (2006)

Stockinger, Kurt, Cieslewicz, John, Wu, Kesheng, Rotem, Doron, Shoshani, Arie

In this paper, we describe a strategy of using compressed bitmap indices to speed up queries on both numerical data and text documents. By using an efficient compression algorithm, these compressed...

Detecting Distributed Scans Using High-Performance Query-Driven Visualization (2006)

Stockinger, Kurt, Bethel, E. Wes, Campbell, Scott, Dart, Eli, Wu, Kesheng

Modern forensic analytics applications, like network traffic analysis, perform high-performance hypothesis testing, knowledge discovery and data mining on very large datasets. One essential strategy...

Accelerating Network Traffic Analytics Using Query-Driven Visualization (2006)

Bethel, E. Wes, Campbell, Scott, Dart, Eli, Stockinger, Kurt, Wu, Kesheng

Realizing operational analytics solutions where large and complex data must be analyzed in a time-critical fashion entails integrating many different types of technology. This paper focuses on an...

HDF5-FastQuery: An API for Simplifying Access to Data Storage, Retrieval, Indexing and Querying (2006)

Bethel, E. Wes, Gosink, Luke, Shalf, John, Stockinger, Kurt, Wu, Kesheng

This work focuses on research and development activities that bridge a gap between fundamental data management technology index, query, storage and retrieval and use of such technology in...

High Performance Visualization using Query-Driven Visualization and Analytics (2006)

Bethel, E. Wes, Campbell, Scott, Dart, Eli, Shalf, John, Stockinger, Kurt, Wu, Kesheng

Query-driven visualization and analytics is a unique approach for high-performance visualization that offers new capabilities for knowledge discovery and hypothesis testing. The new capabilities akin...

Minimizing I/O Costs of Multi-Dimensional Queries with Bitmap Indices (2006)

Rotem, Doron, Stockinger, Kurt, Wu, Kesheng

Bitmap indices have been widely used in scientific applications and commercial systems for processing complex, multi-dimensional queries where traditional tree-based indices would not work...

Analyzing Enron Data: Bitmap Indexing Outperforms MySQL Queries by Several Orders of Magnitude (2006)

Stockinger, Kurt, Rotem, Doron, Shoshani, Arie, Wu, Kesheng

FastBit is an efficient, compressed bitmap indexing technology that was developed in our group. In this report we evaluate the performance of MySQL and FastBit for analyzing the email traffic of the...

HDF5-FastQuery: Accelerating Complex Queries on HDF Datasets Using Fast Bitmap Indices (2005)

Gosink, Luke, Shalf, John, Stockinger, Kurt, Wu, Kesheng, Bethel, Wes

Large scale scientific data is often stored in scientific data formats such as FITS, netCDF and HDF. These storage formats are of particular interest to the scientific user community since they...

Interactive Analysis of Large Network Data Collections Using Query-Driven Visualization (2005)

Bethel, E. Wes, Campbell, Scott, Dart, Eli, Lee, Jason, Smith, Steven A., Stockinger, Kurt, ...

Realizing operational analytics solutions where large and complex data must be analyzed in a time-critical fashion entails integrating many different types of technology. Considering the extreme...

Towards Optimal Multi-Dimensional Query Processing with Bitmap Indices (2005)

Rotem, Doron, Stockinger, Kurt, Wu, Kesheng

Bitmap indices have been widely used in scientific applications and commercial systems for processing complex, multi-dimensional queries where traditional tree-based indices would not work...

Network Traffic Analysis With Query Driven VisualizationSC 2005 HPC Analytics Results (2005)

Stockinger, Kurt, Wu, Kesheng, Campbell, Scott, Lau, Stephen, Fisk, Mike, Gavrilov, Eugene, ...

Our analytics challenge is is to identify, characterize, and visualize anomalous subsets of large collections of network connection data. We use a combination of HPC resources, advanced algorithms,...

Bitmap Indices for Fast End-User Physics Analysis in ROOT (2005)

Stockinger, Kurt, Wu, Kesheng, Brun, Rene, Canal, Philippe

Most physics analysis jobs involve multiple selection steps on the input data. These selection steps are called \it cuts or \it queries. A common strategy to implement these queries is to read all...

Optimizing Candidate Check Costs for Bitmap Indices (2005)

Rotem, Doron, Stockinger, Kurt, Wu, Kesheng

In this paper, we propose a new strategy for optimizing the placement of bin boundaries to minimize the cost of query evaluation using bitmap indices with binning. For attributes with a large number...

FastBit: An Efficient Indexing Technology For Accelerating Data-Intensive Science (2005)

Wu, Kesheng

FastBit is a software tool for searching large read-only data sets. It organizes user data in a column-oriented structure which is efficient for on-line analytical processing (OLAP), and utilizes...

Grid Collector: Facilitating Efficient Selective Access from Data Grids (2005)

Wu, Kesheng, Gu, Junmin, Lauret, Jerome, Poskanzer, Arthur M., Shoshani, Arie, Sim, Alexander, ...

The Grid Collector is a system that facilitates the effective analysis and spontaneous exploration of scientific data. It combines an efficient indexing technology with a Grid file management...

DEX: Increasing the Capability of Scientific Data Analysis Pipelines by Using Efficient Bitmap Indices to Accelerate Scientific Visualization (2005)

Stockinger, Kurt, Shalf, John, Bethel, Wes, Wu, Kesheng

We describe a new approach to scalable data analysis that enables scientists to manage the explosion in size and complexity of scientific data produced by experiments and simulations. Our approach...

Optimizing connected component labeling algorithms (2005)

Wu, Kesheng, Otoo, Ekow, Shoshani, Arie

This paper presents two new strategies that can be used to greatly improve the speed of connected component labeling algorithms. To assign a label to a new object, most connected component labeling...

Efficient binning for bitmap indices on high-cardinality attributes (2004)

Rotem, Doron, Stockinger, Kurt, Wu, Kesheng

Bitmap indexing is a common technique for indexing high-dimensional data in data warehouses and scientific applications. Though efficient for low-cardinality attributes, query processing can be...

Grid Collector: Using an event catalog to speed up user analysis in distributed environment (2004)

Wu, Kesheng, Shoshani, Arie, Zhang, Wei-Ming, Lauret, Jerome, Perevoztchikov, Victor

Nuclear and High Energy Physics experiments such as STAR at BNL are generating millions of files with PetaBytes of data each year. In most cases, analysis programs have to read all events in a file...

Improved searching for spatial features in spatio-temporal data (2004)

Stockinger, Kurt, Wu, Kesheng

Scientific data analysis often requires mining large databases or data warehouses to find features in space. One important task is to find regions of interest such as stellar objects in astrophysics...

On the Performance of Bitmap Indices for High Cardinality Attributes (2004)

Kesheng Wu, Ekow Otoo, Arie Shoshani

It is well established that bitmap indices are effcient for read-only attributes with low attribute cardinalities. For an attribute with a high cardinality, the size of the bitmap index can be very...

Evaluation Strategies for Bitmap Indices with Binning (2004)

Stockinger, Kurt, Wu, Kesheng, Shoshani, Arie

Bitmap indices are efficient data structures for querying read-only data with low attribute cardinalities. To improve the efficiency of the bitmap indices on attributes with high cardinalities, we...

An efficient compression scheme for bitmap indices (2004)

Wu, Kesheng, Otoo, Ekow J., Shoshani, Arie

When using an out-of-core indexing method to answer a query, it isgenerally assumed that the I/O cost dominates the overall query response time. Because of this, most research on indexing methods...

On the performance of bitmap indices for high cardinality attributes (2004)

Wu, Kesheng, Otoo, Ekow, Shoshani, Arie

It is well established that bitmap indices are efficient for read-only attributes with a small number of distinct values. For an attribute with a large number of distinct values, the size of the...

Grid collector: An event catalog with automated file management (2003)

Wu, Kesheng, Zhang, Wei-Ming, Sim, Alexander, Gu, Junmin, Shoshani, Arie

High Energy Nuclear Physics (HENP) experiments such as STAR at BNL and ATLAS at CERN produce large amounts of data that are stored as files on mass storage systems in computer centers. In these...

Using bitmap index for interactive exploration of large datasets (2003)

Wu, Kesheng, Koegler, Wendy, Chen, Jacqueline, Shoshani, Arie

Many scientific applications generate large spatio-temporal datasets. A common way of exploring these datasets is to identify and track regions of interest. Usually these regions are defined as...

Compressed Bitmap Indices for Ecient Query Processing (2002)

Kesheng Wu, Ekow J. Otoo, Arie Shoshani

Bitmap indices are useful techniques for improving access speed of high-dimensional data in data warehouses and in large scientific databases. Even though the bitmaps are easy to compress,...

Strategies for Processing ad hoc Queries on Large Data Warehouses (2002)

Kurt Stockinger, Kesheng Wu, Arie Shoshani

As data warehousing applications grow in size, existing data organizations and access strategies, such as relational tables and B-tree indexes, are becoming increasingly ine#ective. The two primary...

Strategies for Processing ad hoc Queries on Large Data (2002)

Kurt Stockinger, Kesheng Wu, Arie Shoshani

As data warehousing applications grow in size, existing data organizations and access strategies, such as relational tables and B-tree indexes, are becoming increasingly ine#ective. The two primary...

Compressing Bitmap Indexes for Faster Search Operations (2002)

Kesheng Wu, Ekow J. Otoo, Arie Shoshani

In this paper, we study the effects of compression on bitmap indexes. The main operations on the bitmaps during query processing are bitwise logical operations such as AND, OR, NOT, etc. Using the...

Compressing bitmap indexes for faster search operations (2002)

Wu, Kesheng, Otoo, Ekow J., Shoshani, Arie

In this paper, we study the effects of compression on bitmap indexes. The main operations on the bitmaps during query processing are bitwise logical operations such as AND, OR, NOT, etc. Using the...

Compressed bitmap indices for efficient query processing (2001)

Wu, Kesheng, Otoo, Ekow, Shoshani, Arie

Many database applications make extensive use of bitmap indexing schemes. In this paper, we study how to improve the efficiencies of these indexing schemes by proposing new compression schemes for...

A Block Orthogonalization Procedure With Constant Synchronization Requirements (2001)

Andreas Stathopoulos, Kesheng Wu

First, we consider the problem of orthonormalizing skinny (long) matrices. We propose an alternative orthonormalization method that computes the orthonormal basis from the right singular vectors of a...

A Revised Proposal for a Sparse BLAS Toolkit (2001)

Sandra Carney, Michael A. Heroux, Guangye Li, Roldan Pozo, Karin A. Remington, Kesheng Wu

This paper describes a proposal for a "toolkit" of kernel routines for some of the basic operations in (iterative) sparse numerical methods. In particular, we describe an interface for routines which...

A Block Orthogonalization Procedure With Constant Synchronization Requirements (2000)

Andreas Stathopoulos, Kesheng Wu

. We propose an alternative orthonormalization method that computes the orthonormal basis from the right singular vectors of a matrix. Its advantage are: a) all operations are matrix-matrix...

A Survey of Packages for Large Linear Systems (2000)

Kesheng Wu, Brent Milne

This paper evaluates portable software packages for the iterative solution of very large sparse linear systems on parallel architectures. While we cannot hope to tell individual users which package...

A Block Orthogonalization Procedure With Constant Synchronization Requirements (2000)

Andreas Stathopoulos, Kesheng Wu

We propose an alternative orthonormalization method that computes the orthonormal basis from the right singular vectors of a matrix. Its advantage are: a) all operations are matrix-matrix...

A Survey of Packages for Large Linear Systems (2000)

Kesheng Wu, Brent Milne

This paper evaluates portable software packages for the iterative solution of very large sparse linear systems on parallel architectures. While we cannot hope to tell individual users which package...

A survey of packages for large linear systems (2000)

Wu, Kesheng, Milne, Brent

This paper evaluates portable software packages for the iterative solution of very large sparse linear systems on parallel architectures. While we cannot hope to tell individual users which package...

A Revised Proposal for a Sparse BLAS Toolkit (2000)

Sandra Carney, Michael A. Heroux, Guangye Li, Kesheng Wu

This paper describes a proposal for a "toolkit" of kernel routines for some of the basic operations in (iterative) sparse numerical methods. In particular, we describe an interface for routines which...

Parallel Efficiency of the Lanczos Method for Eigenvalue Problems (1999)

Kesheng Wu

Two of the commonly used versions of the Lanczos method for eigenvalues problems are the shift-and-invert Lanczos method and the restarted Lanczos method. In this talk, we will address two questions,...

A Parallel Lanczos Method for Symmetric Generalized Eigenvalue Problems (1999)

Kesheng Wu

Lanczos algorithm is a very effective method for finding extreme eigenvalues of symmetric matrices. It requires less arithmetic operations than similar algorithms, such as, the Arnoldi method. In...

Parallel Eciency of the Lanczos Method for Eigenvalue Problems (1999)

Kesheng Wu

Two of the commonly used versions of the Lanczos method for eigenvalues problems are the shift-and-invert Lanczos method and the restarted Lanczos method. In this talk, we will address two questions,...

Inexact Newton Preconditioning Techniques For Large Symmetric Eigenvalue Problems (1999)

Kesheng Wu

. This paper studies a number of Newton methods and use them to define new secondary linear systems of equations for the Davidson eigenvalue method. The new secondary equations avoid some common...

Inexact Newton Preconditioning Techniques for Eigenvalue Problems (1998)

Kesheng Wu, Yousef Saad, Andreas Stathopoulos

The focus of this paper is on numerical methods for finding a few eigenvalues and eigenvectors of a large sparse matrix. New preconditioning schemes are proposed for improving the effectiveness of a...

Inexact Newton Preconditioning Techniques for Eigenvalue Problems (1998)

Kesheng Wu, Yousef Saad, Andreas Stathopoulos

The focus of this paper is on numerical methods for finding a few eigenvalues and eigenvectors of a large sparse matrix. New preconditioning schemes are proposed for improving the effectiveness of a...

Thick-Restart Lanczos Method for Symmetric Eigenvalue Problems (1998)

Kesheng Wu

For real symmetric eigenvalue problems, there are a number of algorithms that are mathematically equivalent, for example, the Lanczos algorithm, the Arnoldi method and the unpreconditioned Davidson...

A Parallel Lanczos Method for Symmetric Generalized Eigenvalue Problems (1998)

Kesheng Wu

Lanczos algorithm is a very effective method for finding extreme eigenvalues of symmetric matrices. It requires less arithmetic operations than similar algorithms, such as, the Arnoldi method. In...

Dynamic Thick Restarting of the Davidson, and the Implicitly Restarted Arnoldi Methods (1997)

Andreas Stathopoulos, Yousef Saad, Kesheng Wu

The Davidson method is a popular preconditioned variant of the Arnoldi method for solving large eigenvalue problems. For theoretical, as well as practical reasons the two methods are often used with...

Dynamic Thick Restarting of the Davidson, and the Implicitly Restarted Arnoldi Methods (1997)

Andreas Stathopoulos, Yousef Saad, Kesheng Wu

The Davidson method is a popular preconditioned variant of the Arnoldi method for solving large eigenvalue problems. For theoretical, as well as practical reasons the two methods are often used with...

Restarted Variants of DQGMRES (1997)

Kesheng Wu

There are two basic types of Krylov iterations, the short recurrence methods, and the long recurrence methods [1], [10], [16]. The Conjugate Gradient (CG) method is a classic example of a short...

DQGMRES: a Direct Quasi-Minimal Residual Algorithm Based on Incomplete Orthogonalization (1997)

Yousef Saad, Kesheng Wu

We describe a Krylov subspace technique, based on incomplete orthogonalization of the Krylov vectors, which can be considered as a truncated version of GMRES. Unlike GMRES(m), the restarted version...

DQGMRES: a Direct Quasi -- Minimal Residual Algorithm Based on Incomplete Orthogonalization (1997)

Yousef Saad, Kesheng Wu

We describe a Krylov subspace technique, based on incomplete orthogonalization of the Krylov vectors, which can be considered as a truncated version of GMRES. Unlike GMRES(m), the restarted version...

Preconditioned Techniques For Large Eigenvalue Problems (1997)

Kesheng Wu

This research focuses on finding a large number of eigenvalues and eigenvectors of a sparse symmetric or Hermitian matrix, for example, finding 1000 eigenpairs of a 100,000 Theta 100,000 matrix....

Preconditioned Techniques For Large Eigenvalue Problems (1997)

Kesheng Wu

This research focuses on finding a large number of eigenvalues and eigenvectors of a sparse symmetric or Hermitian matrix, for example, finding 1000 eigenpairs of a 100,000 Theta 100,000 matrix....

Preconditioned Krylov Subspace Methods for Eigenvalue Problems (1996)

Kesheng Wu, Yousef Saad, Andreas Stathopoulos

this paper we will explore several preconditioned eigenvalue solvers and identify the ones suited for finding large number of eigenvalues. Methods discussed in this paper make up the core of a...

Preconditioned Krylov Subspace Methods for Eigenvalue Problems (1996)

Kesheng Wu, Yousef Saad, Andreas Stathopoulos

this paper we will explore several preconditioned eigenvalue solvers and identify the ones suited for finding large number of eigenvalues. Methods discussed in this paper make up the core of a...

Preconditioned Krylov Subspace Methods for Eigenvalue Problems (1996)

Kesheng Wu, Yousef Saad, Andreas Stathopoulos

this paper we will explore several preconditioned eigenvalue solvers and identify the ones suited for finding large number of eigenvalues. Methods discussed in this paper make up the core of a...

Preconditioned Krylov Subspace Methods for Eigenvalue Problems (1996)

Kesheng Wu, Yousef Saad, Andreas Stathopoulos

this paper we will explore several preconditioned eigenvalue solvers and identify the ones suited for finding large number of eigenvalues. Methods discussed in this paper make up the core of a...

A Revised Proposal for a Sparse BLAS Toolkit (1995)

Sandra Carney, Michael A. Heroux, Guangye Li, Kesheng Wu

This paper describes a proposal for a "toolkit" of kernel routines for some of the basic operations in (iterative) sparse numerical methods. In particular, we describe an interface for routines which...

A Revised Proposal for a Sparse BLAS Toolkit (1995)

Sandra Carney, Michael A. Heroux, Guangye Li, Kesheng Wu

This paper describes a proposal for a "toolkit" of kernel routines for some of the basic operations in (iterative) sparse numerical methods. In particular, we describe an interface for routines which...

A Revised Proposal for a Sparse BLAS Toolkit (1995)

Sandra Carney, Michael A. Heroux, Guangye Li, Kesheng Wu

This paper describes a proposal for a "toolkit" of kernel routines for some of the basic operations in (iterative) sparse numerical methods. In particular, we describe an interface for routines which...

A Revised Proposal for a Sparse BLAS Toolkit (1995)

Sandra Carney, Michael A. Heroux, Guangye Li, Kesheng Wu

This paper describes a proposal for a "toolkit" of kernel routines for some of the basic operations in (iterative) sparse numerical methods. In particular, we describe an interface for routines which...

A Revised Proposal for a Sparse BLAS Toolkit (1995)

Sandra Carney, Michael A. Heroux, Guangye Li, Kesheng Wu

This paper describes a proposal for a "toolkit" of kernel routines for some of the basic operations in (iterative) sparse numerical methods. In particular, we describe an interface for routines which...

A Revised Proposal for a Sparse BLAS Toolkit (1995)

Sandra Carney, Michael A. Heroux, Guangye Li, Kesheng Wu

This paper describes a proposal for a "toolkit" of kernel routines for some of the basic operations in (iterative) sparse numerical methods. In particular, we describe an interface for routines which...

Design of an iterative solution module for a parallel sparse matrix library (P SPARSLIB) (1995)

Yousef Saad, Kesheng Wu

P SPARSLIB is a library of portable FORTRAN routines for parallel sparse matrix computations. The current thrust of the library is in iterative solution techniques. In this note we present the...

DQGMRES: a Direct Quasi -- Minimal Residual Algorithm Based on Incomplete Orthogonalization (1995)

Yousef Saad, Kesheng Wu

We describe a Krylov subspace technique, based on incomplete orthogonalization of the Krylov vectors, which can be considered as a truncated version of GMRES. Unlike GMRES(m), the restarted version...

DQGMRES: a Quasi - minimal residual algorithm based on incomplete orthogonalization (1994)

Youcef Saad, Kesheng Wu

We describe a Krylov subspace technique based on incomplete orthogonalization of the Krylov vectors which can be considered as a truncated version of GMRES. Unlike GMRES the parent algorithm from...

The Iterative Solvers Module (1993)

Youcef Saad, Kesheng Wu

P SPARSLIBis a library of portable FORTRAN routines for sparse matrix compuations. The current thrust of the library is in iterative solution techniques. In this note we present the `accelerators'...

Statistical tests for deterministic effects in broad band time series (1993)

Wu, Kesheng, Savit, Robert, Brock, William

We derive a normalized version of the indicators of Savit and Green, and prove that these normalized statistics have, asymptotically, a normal distribution with a mean of zero and standard deviation...

Thick-Restart Lanczos Method for Symmetric Eigenvalue Problems (1970)

Kesheng Wu

For real symmetric eigenvalue problems, there are a number of algorithms that are mathematically equivalent, for example, the Lanczos algorithm, the Arnoldi method and the unpreconditioned Davidson...