SELECTED BOOKS
Knowledge Discovery and Data Mining
Machine Learning
Databases
Statistics
SELECTED RESEARCH PAPERS
The Knowledge Discovery in Databases (KDD) Process
- Fayyad, U., Piatetsky-Shapiro, G., and Smyth, P. "From Data Mining to Knowledge Discovery in Databases". AI Magazine, pp. 37-54, Fall 1996.
Pre-Processing, Feature Selection
Mining Sequential Patterns and Similarity Search
- R. Agrawal, C. Faloutsos and A. Swami. "Efficient Similarity Search in Sequence Databases". Foundations of Data Organization and Algorithms (FODO) Conference, Evanston, Illinois, Oct. 13-15, 1993. PostScript Online.
- C. Faloutsos, M. Ranganathan and Y. Manolopoulos. "Fast Subsequence Matching in Time-Series Databases". Proc. ACM SIGMOD, May 25-27, 1994, Minneapolis, MN. pp. 419-429. PostScript Online.
Clustering
- S. Guha, R. Rastogi and K. Shim. "CURE: An Efficient Clustering Algorithm for Large Databases". In Proceedings of ACM-SIGMOD 1998 International Conference on Management of Data, Seattle, 1998. Available from: http://www.bell-labs.com/user/rastogi/
- P. S. Bradley, U. M. Fayyad and C. Reina. "Scaling Clustering Algorithms to Large Databases". Fourth International Conference on Knowledge Discovery & Data Mining KDD-98, pages 9-15. AAAI Press, Menlo Park, CA, 1998. Available from: http://www.research.microsoft.com/users/bradley/papers.html
OTHER ONLINE RESOURCES
Data Sets
- Univ. of California Irvine Machine Learning Data Repository.
- Univ. of California Irvine KDD Data Repository.
- Datasets for Data Mining
- Time Series Data Library
- CMU's StatLib-Datasets Archive
KDD
KDD Commercial Products / Prototypes
Data Warehousing and OLAP
Machine Learning
Statistics
General AI