Zhongfang Zhuang

Ph.D. Candidate @ Data Science Research Group

Computer Science Department, Worcester Polytechnic Institute

Links: LinkedIn | CV | PGP


About Me

I am a Ph.D. candidate in the Computer Science Department at WPI. I work in the Data Science Research Group (DSRG) with Professors Elke Rundensteiner and Xiangnan Kong. My research interests include data mining in real-world problem settings, deep learning models, and large-scale data management infrastructures.


Deep Learning On Attributed Sequences

* Expected Defense: Jan 2019

Committee Members

Dr. Elke Rundensteiner, Advisor, WPI
Dr. Xiangnan Kong, Co-advisor, WPI
Dr. Mohamed Eltabakh, WPI
Dr. Philip Yu, University of Illinois at Chicago

Recent News

Paper Accepted!

Our paper titled "One-shot Learning on Attributed Sequences" has been accepted by IEEE Big Data 2018 (acceptance rate 18.9%).

Patent Application

Part of my dissertation work is included in the patent application
"Machine Learning Systems and Methods for Attributed Sequences."
US Patent Application 16/057,025
French Patent Application FR1857430

Projects in Deep Learning

Attention Network for Fraud Detection

May 2018 -- Oct 2018

  • Designed and implemented a novel neural attention model for attributed sequence classification.
  • Integrated conventional sequence attention model with the attributes from user profiles.
  • Evaluated the proposed model and compared it with state-of-the-art approaches to confirm its effectiveness.

Zhongfang Zhuang, Xiangnan Kong, and Elke Rundensteiner. "AMAS: Attention Model for Attributed Sequence Classification", in submission.

Fraud Detection in One Shot

Dec 2017 -- Apr 2018

  • Addressed the real-world scenario in which only one fraud case per fraud type is available.
  • Designed a multimodal siamese neural network that is capable of generalizing from only one example.
  • Studied and evaluated the proposed model in various real-world scenarios with diverse parameter settings.

Zhongfang Zhuang, Xiangnan Kong, Elke Rundensteiner, Jihane Zouaoui, and Aditya Arora. "One-shot Learning on Attributed Sequences", accepted by IEEE Big Data 2018 (acceptance rate 18.9%).

Incorporate User Feedback for Fraud Detection

Mar 2017 -- Dec 2017

  • Identified the challenges of incorporating the feedback from human domain experts in fraud detection.
  • Formulated the problem of deep metric learning on attributed sequences.
  • Designed and implemented a deep learning framework to effectively learn from the human feedback.
  • Evaluated the proposed model and confirmed that it outperforms state-of-the-art approaches in various mining tasks.

Zhongfang Zhuang, Xiangnan Kong, Elke Rundensteiner, Jihane Zouaoui, and Aditya Arora. "Deep Metric Learning on Attributed Sequences", in submission.

Unsupervised Attributed Sequence Embedding

Jan 2016 -- Feb 2017

  • Proposed a new data model, the attributed sequence, for Amadeus application log files.
  • Identified the challenges of using attributed sequences in fraud detection: attributed sequences are not represented as feature vectors that could be used directly by existing data mining algorithms.
  • Designed a multimodal neural network model with a sequence network and an attribute network.
  • Tailored an unsupervised training strategy to learn the information from attributed sequences.
  • Evaluated the performance of the proposed neural network model in clustering and outlier detection tasks.
  • Conducted case studies by using visualization tools and collaborating with domain experts.

Zhongfang Zhuang, Xiangnan Kong, Elke Rundensteiner, Jihane Zouaoui, and Aditya Arora. "Attributed Sequence Embedding", in submission.

Projects in Big Data Infrastructure

Preference-aware Recurring Query Optimization

Oct 2014 -- Dec 2015

  • Formulated the problem of preference-aware recurring query optimization in the big data domain.
  • Designed and implemented PRO, the first preference-aware optimizer for recurring queries on large-scale data processing platforms.
  • Modeled the preference-aware recurring query optimization problem with an execution relation graph and tackled it as a pathfinding problem.
  • Enabled big data processing platforms, such as Apache Hadoop and Apache Spark, to dynamically optimize workload processing and maximally satisfy user preferences.

Zhongfang Zhuang, Chuan Lei, Elke A. Rundensteiner, and Mohamed Eltabakh. "PRO: Preference-Aware Recurring Query Optimization." CIKM 2016.
Zhongfang Zhuang, Chuan Lei, Elke Rundensteiner, and Mohamed Eltabakh. "Preference-aware Recurring Query Optimization", in journal submission.

Redoop Infrastructure for Recurring Big Data Queries

Jun 2014 -- Aug 2014

  • Developed the Redoop infrastructure, the first full-fledged MapReduce framework to support the processing of recurring big data queries.
  • Designed and developed a web-based interface for Redoop to visualize performance at each stage of job processing.

Chuan Lei, Zhongfang Zhuang, Elke A. Rundensteiner, and Mohamed Eltabakh. "Redoop infrastructure for recurring big data queries." VLDB 2014.

Recurring Query Optimization

Sep 2013 -- Aug 2014

  • Developed Helix, the first scalable multi-query sharing engine for recurring workloads in MapReduce.
  • Exploited new sliced window-alignment techniques in Helix to create sharing opportunities among recurring queries without introducing additional I/O overhead or unnecessary data scans.
  • Introduced a cost/benefit model for creating a sharing plan among recurring queries, and a scheduling strategy for executing them to maximize SLA satisfaction.

Chuan Lei, Zhongfang Zhuang, Elke A. Rundensteiner, and Mohamed Eltabakh. "Shared execution of recurring workloads in MapReduce." VLDB 2015.

Service Activity

External Reviewer for

EDBT 2014, 2017; VLDB 2015; ICDE 2016; SIGMOD 2015, 2017, 2018


Master, Beijing University of Posts and Telecommunications, 2013

Bachelor, Xi'an University of Posts and Telecommunications, 2011