Curriculum Vitae


Professional in revealing and transforming the knowledge embedded in unstructured data to accelerate information retrieval and knowledge discovery from complex data sets. An adaptable data scientist able to deliver data-driven knowledge to other team members.


  • C/C++, Java, Python, R, MATLAB/Octave, SQL, PostgreSQL, JavaScript, HTML, CSS
  • Gensim, NLTK, scikit-learn, Django, Redis, Celery, Spark, Keras, PyTorch
  • Deep Learning (CNN, KNN, LSTM, etc.), MapReduce, Hadoop


Vanderbilt University
Doctor of Philosophy – Ph.D., Computer Science · (2012 – 2019)

Beihang University
Master’s Degree, Computer Software Engineering · (2007 – 2009)

Beihang University
Bachelor’s Degree, Information Technology · (2003 – 2007)


Vanderbilt University, Research Assistant

May 2015 – Present, Greater Nashville Area, TN

  • Accelerate chart reviews with similar terms extracted from EMR-based word embeddings;
  • Customize search engines and text highlighting to reduce the turnaround time of chart reviews;
  • Design an automatic query recommendation system to boost the accuracy of cohort discovery;
  • Develop and maintain the Summit chart review system to manage chart review projects.

Vanderbilt University
Graduate Affiliate of Vanderbilt Institute for Digital Learning, June 2014 – May 2015 (1 year), Greater Nashville Area, TN

  • Revealed high-quality features to predict users’ drop-out within one week in Massive Open Online Courses (MOOCs);
  • Identified interaction peaks in online lectures strongly associated with users’ drop-out;
  • Constructed fine-grained temporal features for predicting students’ performance in MOOCs.

Vanderbilt University
Member of Teachable Agents Group, May 2013 – May 2015 (2 years 1 month), Greater Nashville Area, TN

  • Developed hierarchical sequential data mining methods to analyze educational data;
  • Mined multi-feature hierarchical (MFH) patterns associated with students’ performance;
  • Identified temporal MFH features to predict students’ learning behaviors.

Vanderbilt University, Teaching Assistant, August 2012 – May 2015 (2 years 10 months), Greater Nashville Area, TN

  • CS292 “Big Data” of 2015 Spring Semester.
  • CS360 “Advanced Artificial Intelligence” of 2014 Fall Semester.
  • CS250 “Algorithms” of 2014 Spring Semester.
  • CS103 “Introductory Programming for Engineers and Scientists” of 2013 Fall Semester.
  • CS103 “Introductory Programming for Engineers and Scientists” of 2013 Summer Semester.
  • CS103 “Introductory Programming for Engineers and Scientists” of 2013 Spring Semester.
  • CS151 “Computers and Ethics” of 2012 Fall Semester.

Sugon, Senior Software Engineer
October 2009 – May 2012 (2 years 8 months), Beijing City, China

  • Invented a parallel encryption framework (four national patents) running in encryption devices;
  • Tailored a compressed Linux system and developed device drivers for embedded devices;
  • Led a team in developing the distributed cryptographic network file system.

Beihang University, Instructor
March 2011 – March 2012 (1 year 1 month), Beijing City, China

  • Designed the syllabus for the graduate course “Linux device driver development”;
  • Taught Linux kernel analysis and device driver development, and directed experiments.

Beihang University, Research Assistant
October 2007 – October 2009 (2 years 1 month), Beijing City, China

  • Proposed a smart wireless sensor network (WSN) protocol based on swarm intelligence to accomplish complex distributed information collection tasks;
  • Generalized the intelligent algorithms for WSN to multi-robot systems for fire extinguishing in large forest areas.

Beihang University, Research Assistant, September 2006 – June 2007 (10 months), Beijing City, China

  • Tailored the uC/OS-II operating system and developed device drivers for an industrial printing system;
  • Optimized the task scheduling software to significantly increase the speed and accuracy of printing systems.



  • Learning Behavior Characterization with Multi-Feature, Hierarchical Activity Sequences
  • Early Prediction of Student Dropout and Performance in MOOCs using Higher Granularity Temporal Information
  • A Crowdsourcing Framework for Medical Data Sets
  • Extracting similar terms from multiple EMR-based semantic embeddings to support chart reviews
  • Behavior Prediction in MOOCs using Higher Granularity Temporal Information