Aysa X. Fan

PhD Candidate · University of Illinois Urbana-Champaign

My research sits at the intersection of natural language processing (NLP), educational data mining, and computer science education. I'm broadly interested in how people learn and how NLP and large language models can support that learning, from helping novice programmers develop skills to enabling more efficient classification with limited labeled data. My work has been published at venues including EMNLP, EACL, EDM, SIGCSE, and ICQE, with two best paper nominations. Before my PhD, I spent five years in industry working in NLP, data quality, and QA. Outside of research, I co-founded a learning center where I taught children's art without ever touching a student's work, guiding observation and self-expression through language and demonstration instead.

NLP · Educational Data Mining · Large Language Models · CS Education · Learning Analytics
Aysa X. Fan at the Alma Mater, UIUC

Education

2019 – 2026 (Expected)
Doctor of Philosophy, Curriculum & Instruction, DELTA
University of Illinois Urbana-Champaign
Advisor: Dr. Luc Paquette
2016 – 2018
Master’s Degree, Information and Data Science
University of California, Berkeley
Honours Bachelor of Science, Specialist in Mathematics and Its Applications in Finance and Economics, Major in Statistics
University of Toronto

Selected Publications

Designing studies and building computational methods to analyze how students learn

NLP & LLMs for Learning Analytics

JEDM (Under Review)
Evaluating LLM-Based Classification of Student Debugging Strategies
Aysa X. Fan, Qianhui Liu, Luc Paquette, Juan Pinto
2025
ICQE 2024 · Best Paper Nominee
Using LLM-based Filtering to Develop Reliable Coding Schemes for Rare Debugging Strategies
Aysa X. Fan, Qianhui Liu, Luc Paquette, Juan Pinto
2024
EACL 2023
CONENTAIL: An Entailment-based Framework for Universal Zero and Few Shot Classification with Supervised Contrastive Pretraining
Ranran Haoran Zhang, Aysa X. Fan, Rui Zhang
2023

LLMs for Programming Education

SIGCSE 2024
Enhancing Code Tracing Question Generation with Refined Prompts in Large Language Models
Aysa X. Fan, Rully A. Hendrawan, Yang Shi, Qianou Ma
2024
LLM4Ed Workshop 2024
Evaluating the Quality of Code Comments Generated by Large Language Models for Novice Programmers
Aysa X. Fan, Arun Balajiee Lekshmi Narayanan, Mohammad Hassany, Jiaze Ke
2024
EMNLP 2023 Findings
Exploring the Potential of Large Language Models in Generating Code-Tracing Questions for Introductory Programming Courses
Aysa X. Fan, Ranran Haoran Zhang, Luc Paquette, Rui Zhang
2023

Experience

From industry NLP to education research

Sep 2019 – Present
Graduate Research Assistant
University of Illinois Urbana-Champaign
DELTA program, HEDS Lab (advised by Dr. Luc Paquette)
  • Dissertation: Randomized experiment comparing instructional approaches for code tracing, with both quantitative and qualitative analysis of LLM-generated feedback
  • Developed LLM-based filtering methods to build reliable coding schemes for student debugging strategies
  • Evaluated LLM classification of student learning behaviors across multiple models and prompting strategies
  • Investigated LLM-generated code-tracing questions and code comments for novice programmers
  • Applied collaborative sketch tools in clustering and video analysis for engineering problem-solving
Jun 2017 – Jul 2025
Director of Education & Co-founder
Novel Panda Learning Centre
  • Co-founded a learning center offering children's art and enrichment programs
  • Designed curricula and taught drawing, painting, and crafting classes
  • Practiced a hands-off teaching philosophy: guiding observation and self-expression through language, gesture, and demonstration
May 2016 – May 2019
QA Analyst
Kik Interactive
  • Performed user segmentation and product analysis using SQL/Python/Excel
  • Produced product quality and user experience reports to support data-driven decisions
  • Built quality dashboards in Redash/Jira/Kibana
  • Developed and executed test plans/suites for new features; managed bug reporting and triage in Jira
  • Created product component documentation and delivered competitive analyses
Aug 2014 – May 2016
Data & Quality Analysis Team Lead
Maluuba (acquired by Microsoft)
  • Led a team coordinating crowdsourcing projects to collect, clean, and annotate datasets across 10 supported languages
  • Designed and refined tests for crowdsourcing NLP data collection, improving data quality at scale
  • Developed data guidelines to enhance NLP capabilities
  • Led testing documentation for OEM projects; managed triage of Jira reports
Apr 2014 – Aug 2014
Data Analyst
Maluuba
  • Managed collection and preparation of test datasets for ML models
  • Developed data cleaning and annotation strategies for NLP tasks
  • Led the M-Fit fitness app project at the Maluuba Hackathon (2nd place)