CS Student amongst KDD Cup Winners
Prof. Neil Heffernan reports that his fourth-year Ph.D. student Zach Pardos placed 2nd among student teams and 4th overall, out of over 600 competitors, in this year's Knowledge Discovery and Data Mining Cup (KDD Cup), a high-profile annual data mining competition run by the Association for Computing Machinery. This is a very significant achievement.
Zach will receive $3,000 in prize money, along with travel funds to attend SIGKDD, and will have several opportunities to present his solution at the conference, which will be held in Washington, D.C. in July 2010. His solution will also appear in an upcoming issue of the Journal of Machine Learning Research.
Engineering competitions often lead to practical solutions but not always to good science. In this case, however, Zach not only did well but also did important scientific work that will lead to several publications comparing multiple modeling approaches.
Heffernan reports that the current state of the art in the field has been Knowledge Tracing, a technique developed by Corbett and Anderson in 1995. It is used by millions of students across America in many different pieces of educational software to track student knowledge. Zach's new method, which served as the foundation for his KDD solution, is a model that tracks learning rates, guessing rates, and other characteristics of the individual user. His paper at this summer's Educational Data Mining conference shows how this new model clearly outperforms the Knowledge Tracing approach. This work allows for better models of student learning, which have significant practical implications. With a better model, existing educational software can be improved in a myriad of ways.
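For readers curious about what Knowledge Tracing computes, the following is a minimal sketch of the standard update for a single skill, not Zach's individualized model. It assumes the usual four parameters (initial knowledge, learning rate, guess rate, and a slip rate, which the text above does not mention); the class name, function names, and numeric values are illustrative assumptions only.

```python
from dataclasses import dataclass


@dataclass
class KTParams:
    """Hypothetical container for the four standard Knowledge Tracing parameters."""
    p_init: float   # probability the skill is known before any practice
    p_learn: float  # probability of learning the skill at each opportunity
    p_guess: float  # probability of a correct answer when the skill is unknown
    p_slip: float   # probability of an incorrect answer when the skill is known


def predict_correct(p_known: float, params: KTParams) -> float:
    """Probability that the next response is correct, given current knowledge."""
    return p_known * (1.0 - params.p_slip) + (1.0 - p_known) * params.p_guess


def update(p_known: float, correct: bool, params: KTParams) -> float:
    """Bayesian update on one observed response, then apply the learning step."""
    if correct:
        evidence = predict_correct(p_known, params)
        posterior = p_known * (1.0 - params.p_slip) / evidence
    else:
        evidence = 1.0 - predict_correct(p_known, params)
        posterior = p_known * params.p_slip / evidence
    # The student may also learn the skill from this practice opportunity.
    return posterior + (1.0 - posterior) * params.p_learn


def trace(responses: list[bool], params: KTParams) -> list[float]:
    """Return the predicted probability of correctness before each response."""
    p_known = params.p_init
    predictions = []
    for correct in responses:
        predictions.append(predict_correct(p_known, params))
        p_known = update(p_known, correct, params)
    return predictions


if __name__ == "__main__":
    # Illustrative values only; real parameters are fit to student data.
    params = KTParams(p_init=0.5, p_learn=0.1, p_guess=0.2, p_slip=0.1)
    for p in trace([True, False, True, True], params):
        print(f"{p:.3f}")
```

One way to picture individualization, under the same assumptions, is to keep a separate `KTParams` instance per student rather than one shared set per skill, so that learning and guessing rates can differ from one user to the next.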
The KDD Cup has addressed a variety of practical problems. For instance, a previous year's competition focused on processing mammography images to determine whether a patient had malignant breast cancer. This year's competition was built on student responses from an intelligent tutoring system that is used by millions of students in the US. Zach and the other competitors were given 30 million rows of training data containing details of students' responses to algebra questions. The task was to predict whether a student would answer correctly or incorrectly on each of the 1.3 million rows of the test dataset.
The competition was very stiff. The team that finished one position above Zach was "Big Chaos", part of the team that won the $1 million Netflix Prize. So Zach was up against some well-financed professional data mining teams! He was second in the "student" category to a team of 24 Taiwanese students. This result makes it clear that WPI is a world leader in this field.
Zach and Neil would like to thank Jesse Banning, Siamak Najafi, Elke Rundensteiner, Mark Taylor, and Michael Voorhis for their technology support throughout the competition. Zach also benefited from funding from multiple sources, including the National Science Foundation and the U.S. Department of Education, as well as support from the WPI CS Department.