
SELECTED BOOKS
Knowledge Discovery and Data Mining
- "Advances in Knowledge Discovery and Data Mining". Eds.: Fayyad, Piatetsky-Shapiro, Smyth, and Uthurusamy. The MIT Press, 1995.
- "Data Mining. Technologies, Techniques, Tools, and Trends". B. Thuraisingham. CRC, 1998.
- "Data Mining. A hands-on approach for business professionals". R. Groth. Prentice Hall, 1998.
- "Data Preparation for Data Mining". Dorian Pyle, 3/99.
- "Data Mining". P. Adriaans & D. Zantinge.
- "Data Mining Methods for Knowledge Discovery". Cios, Pedrycz, & Swiniarski, 1998.
- "Data Mining Techniques for Marketing, Sales and Customer Support". Berry & Linoff.
- "Decision Support using Data Mining". Anand and Buchner.
- "Feature Selection for Knowledge Discovery and Data Mining". Liu and Motoda.
- "Feature Extraction, Construction and Selection: A Data Mining Perspective". Eds: Motoda and Liu.
- "Knowledge Acquisition from Databases". Xindong Wu.
- "Mining Very Large Databases with Parallel Processing". Alex Freitas, Simon Lavington.
- "Predictive Data-Mining: A Practical Guide". Weiss & Indurkhya.
- "Machine Learning and Data Mining: Methods and Applications." Michalski, Bratko, and Kubat, 1998; John Wiley & Sons.
- "Rough Sets and Data Mining: Analysis of Imprecise Data." Eds: Lin and Cercone; Kluwer.
- "Seven Methods for Transforming Corporate Data into Business Intelligence". Vasant Dhar and Roger Stein; Prentice-Hall, 1997.
- "Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data". Bing Liu. Series: Data-Centric Systems and Applications. 2007.
Machine Learning
- "Machine Learning". Tom M. Mitchell. McGraw-Hill, 1997.
- "Elements of Machine Learning". P. Langley. Morgan Kaufmann Publishers, Inc. 1996.
- See http://www.aic.nrl.navy.mil/~aha/research/ml/books.html for an extensive list of ML books organized by topics.
General AI
- "Artificial Intelligence: A Modern Approach". S. Russell, P. Norvig. Prentice Hall, 1995. ISBN 0-13-103805-2
- "Artificial Intelligence: Theory and Practice". T. Dean, J. Allen, Y. Aloimonos. The Benjamin/Cummings Publishing Company, Inc. 1995.
- "Readings in Artificial Intelligence". B. L. Webber, N. J. Nilsson, eds. Tioga Publishing Company, 1981.
- "Artificial Intelligence". 3rd edition. Patrick H. Winston. Addison Wesley.
- "The Elements of Artificial Intelligence Using Common Lisp". S. L. Tanimoto. Computer Science Press 1990.
- "Artificial Intelligence" Second edition. E. Rich and K. Knight. McGraw Hill 1991.
- "Paradigms of Artificial Intelligence Programming: Case Studies in Common Lisp". P. Norvig. Morgan Kaufmann Publishers, 1992.
- "Essentials of Artificial Intelligence". M. Ginsberg. Morgan Kaufmann Publishers, 1993.
- "Artificial Intelligence Structures and Strategies for Complex Problem Solving". Third edition. G. F. Luger and W. A. Stubblefield. Addison-Wesley, 1998.
- "Logical Foundations of Artificial Intelligence". M.R. Genesereth and N. Nilsson. Morgan Kaufmann, 1987.
Databases
- "A First Course in Database Systems". J. Ullman, J. Widom. Prentice-Hall, 1997.
- "Database Management Systems", 2nd ed. R. Ramakrishnan. McGraw-Hill, 1999.
- "Readings in Database Systems". 2nd Edition. Ed. M. Stonebraker. 1994, Morgan Kaufmann.
Statistics
- "Statistical Inference for Management and Economics". P. Billingsley, D. Croft, D. Huntsberger, C. Watson. Boston: Allyn and Bacon, Inc. 1986.
- "Probability and Statistics". 2nd edition. M. DeGroot. Addison Wesley, 1986.
- "Statistical Inference". G. Casella, R. Berger. Wadsworth and Brooks/Cole, 1990.
SELECTED RESEARCH PAPERS
The Knowledge Discovery in Databases (KDD) Process
- Fayyad, U., Piatetsky-Shapiro, G., and Smyth, P. "From Data Mining to Knowledge Discovery in Databases" AI Magazine, pp. 37-54. Fall 1996.
- Bhandari, I. et al. "Advanced Scout: Data mining and Knowledge Discovery in NBA Data" Data Mining and Knowledge Discovery Journal, Vol 1, pp 121-125. 1997.
Data Warehouses, OLAP and Multidimensional Analysis
- J. Widom, "Research Problems in Data Warehousing" Fourth Int'l Conf. on Information and Knowledge Management (CIKM) 1995.
- S. Chaudhuri and U. Dayal "An overview of data warehousing and OLAP technology" ACM SIGMOD Record, 26(1):65-74, 1997.
- (Not included in the exam) Gray, J. et al. "Data Cube: A relational aggregation operator generalizing group-by, cross-tab, and sub-totals" Data Mining and Knowledge Discovery Journal, Vol 1, pp 29-53. 1997.
Pre-Processing, Feature Selection
- Langley, P. "Selection of Relevant Features in Machine Learning" Proceedings of the AAAI Fall Symposium on Relevance. New Orleans, LA. AAAI Press. 1994.
- M.W. Berry, Z. Drmac, and E.R. Jessup. "Matrices, Vector Spaces, and Information Retrieval" SIAM Review. Vol. 41, No. 2, pp. 335-362.
- (Not included in the exam) Barbara et al. "The New Jersey Data Reduction Report" Bulletin of the IEEE Computer Society Technical Committee on Data Engineering.
Mining Association Rules
- R. Agrawal, T. Imielinski, and A. Swami "Mining Association rules between sets of items in large databases" Proc. of the ACM SIGMOD Int'l Conference on Management of Data, Washington D.C., May 1993, 207-216. PostScript and PDF Online.
- R. Agrawal, R. Srikant: "Fast Algorithms for Mining Association Rules" Proc. of the 20th Int'l Conference on Very Large Databases, Santiago, Chile, Sept. 1994. PostScript and PDF Online.
Mining Sequential Patterns and Similar Time Sequences
- Srikant, R. and Agrawal, R. "Mining Sequential Patterns: Generalizations and Performance Improvements" Proc. of the Fifth Int'l Conference on Extending Database Technology (EDBT), Avignon, France, March 1996.
- Mannila, H., Toivonen, H., and Verkamo, A.I. "Discovery of frequent episodes in sequences" First International Conference on Knowledge Discovery and Data Mining (KDD'95), pp. 210-215, Montreal, Canada, August 1995.
- (Not included in the exam) R. Agrawal, C. Faloutsos and A. Swami. "Efficient Similarity Search in Sequence Databases". Foundations of Data Organization and Algorithms (FODO) Conference, Evanston, Illinois, Oct. 13-15, 1993. PostScript Online.
- (Not included in the exam) C. Faloutsos, M. Ranganathan and Y. Manolopoulos. "Fast Subsequence Matching in Time-Series Databases". Proc. ACM SIGMOD, May 25-27, 1994, Minneapolis, MN. pp. 419-429. PostScript Online.
Classification: Decision Trees
- J. R. Quinlan. "C4.5: Programs for Machine Learning". Morgan Kaufmann Publishers. 1993. Chapters 1 and 2.
- J.R. Quinlan. "Induction of Decision Trees". Machine Learning 1:81-106. 1986.
- R. Rastogi and K. Shim "PUBLIC: A Decision Tree Classifier that Integrates Building and Pruning" Proc. of the 24th VLDB Conference, NY USA. 1998. PostScript Online.
Rule-Based Mining: Inductive Logic Programming
- J.R. Quinlan. "Learning Logical Definitions from Relations". Machine Learning 5:239-266. 1990.
- I. Bratko and S. Muggleton. "Applications of Inductive Logic Programming". Communications of the ACM. Vol. 38, No. 11, pp 65-70. 1995. Available online from the WPI library (e-journal collection)
Regression: Instance-Based Learning
- Tom M. Mitchell "Machine Learning" McGraw-Hill 1997. Chapter 8.
Evaluation of Patterns and Visualization
- J.A. Wise, J.J. Thomas, K. Pennock, D. Lantrip, M. Pottier, A. Schur, V. Crow. "Visualizing the Non-Visual: Spatial Analysis and Interaction with Information from Text Documents". Proc. IEEE Information Visualization Symposium, pp 51-58. IEEE Computer Society Press. 1995.
Clustering
- S. Guha, R. Rastogi and K. Shim. "CURE: An efficient algorithm for clustering large databases". In Proceedings of ACM-SIGMOD 1998 International Conference on Management of Data, Seattle, 1998. Available from: http://www.bell-labs.com/user/rastogi/
- P. S. Bradley, U. M. Fayyad and C. Reina. "Scaling Clustering Algorithms to Large Databases". Fourth International Conference on Knowledge Discovery & Data Mining KDD-98, pages 9-15. AAAI Press, Menlo Park, CA, 1998. Available from: http://www.research.microsoft.com/users/bradley/papers.html
Web Mining, XML
- R. Cooley, B. Mobasher, and J. Srivastava, "Web Mining: Information and Pattern Discovery on the World Wide Web." Proceedings of the 9th IEEE International Conference on Tools with Artificial Intelligence (ICTAI'97), November 1997. Available from http://maya.cs.depaul.edu/~mobasher/pubs.html
- M. Craven, D. DiPasquo, D. Freitag, A. McCallum, T. Mitchell, K. Nigam and S. Slattery. "Learning to Extract Symbolic Knowledge from the World Wide Web." Proceedings of the 15th National Conference on Artificial Intelligence (AAAI-98). Available from http://www.cs.cmu.edu/~webkb/
- Reference for XML.
OTHER ONLINE RESOURCES
Data Sets
- Time Series Data Library
- Data Repositories
- Datasets for Data Mining
- CMU's StatLib-Datasets Archive
- Miscellaneous
KDD
- ACM SIGKDD.
- KDNuggets: Data Mining and Knowledge Discovery Resources
- RPI's Data Mining Links
- KDD Research Groups
KDD Commercial Products / Prototypes
Data Warehousing and OLAP
- The Data Warehousing Information Center
- Data Warehousing and OLAP - A Research-Oriented Bibliography
- OLAP Council White Paper
Machine Learning
- Online Machine Learning Resources
- Machine Learning Resources
- Machine Learning Papers
- UCI Machine Learning
- Reinforcement Learning and Friends at Carnegie Mellon
- A Bibliography on Automatic Text Categorization
Statistics
General AI
