Zhongfang Zhuang


About Me

I am a PhD candidate in the Computer Science Department at WPI. I work in the Data Science Research Group (DSRG) with Professor Elke Rundensteiner. My research interests are in database and information management systems, including query processing and optimization, large-scale data management, and MapReduce-related technologies.

Ongoing Project

Project "Dashwood": Unsupervised Fraud Detection

Functional frauds are the most complex and general kind of fraud, as they derive from improper use of the service APIs exposed by potentially any component of a system. This project aims to design a functional fraud-detection framework capable of raising alerts on suspicious connections and activity and taking appropriate corrective actions in time. The design focuses on scalability, heterogeneous data, dynamic user behavior models, and real-time requirements.

Past Projects and Publications


PRO is a preference-aware recurring query processing system that produces a recurring execution configuration meeting the application guidelines expressed via preference models. We tackle this maximal-preference execution configuration problem using a PRO execution relation graph (ERG) model that effectively incorporates the dependencies between executions. This model lets us transform the problem into the well-known minimum-weight length-k path problem and design a dynamic-programming-based pseudo-polynomial solution, called PRO-OPT. We also introduce adaptive re-optimization techniques to handle fluctuating stream workloads.

Zhongfang Zhuang, Chuan Lei, Elke A. Rundensteiner, and Mohamed Eltabakh. "PRO: Preference-Aware Recurring Query Optimization."
CIKM 2016 [Poster][PDF]


Helix is the first scalable multi-query sharing engine tailored for recurring workloads in the MapReduce infrastructure. Helix deploys new sliced window-alignment techniques to create sharing opportunities among recurring queries without introducing additional I/O overhead or unnecessary data scans. It introduces a cost/benefit model for creating a sharing plan among the recurring queries, and a scheduling strategy for executing them to maximize SLA satisfaction.

Chuan Lei, Zhongfang Zhuang, Elke A. Rundensteiner, and Mohamed Eltabakh. "Shared execution of recurring workloads in MapReduce."
VLDB 2015 [PDF]

Redoop Infrastructure Demonstration

This demonstration presents the Redoop infrastructure, the first full-fledged MapReduce framework with native support for recurring big data queries. We demonstrate Redoop’s capabilities on a compute cluster with real-life workloads, including click-stream and sensor data analysis.

Chuan Lei, Zhongfang Zhuang, Elke A. Rundensteiner, and Mohamed Eltabakh. "Redoop infrastructure for recurring big data queries."
VLDB 2014 [PDF]

Service Activity

External Reviewer for

EDBT 2014, 2017; VLDB 2015; ICDE 2016; SIGMOD 2015, 2017


M.Eng., Beijing University of Posts and Telecommunications, 2013

B.Eng., Xi'an University of Posts and Telecommunications, 2011