WPI Worcester Polytechnic Institute

Computer Science Department

Knowledge Discovery and Data Mining Research Group 

Miscellaneous Notes on Python 

Much of the text and material posted on this page was produced by Ahmedul Kabir (thanks, Kabir!).

General Information about Python:

Python Tutorials:

Python Books:

Python Environments:

Python Data Mining Packages:

Python has many open-source packages for data mining and knowledge management. The most widely used ones are listed here, with brief descriptions:

Note: Python packages can be searched by name or keyword in the Python Package Index (PyPI).

Data Preprocessing:

Model Evaluation:

Decision Trees:

Linear Regression:

Regression using Trees:

Association Rules:


Clustering:

scikit-learn:

- K-means: http://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html
- Hierarchical: http://scikit-learn.org/stable/modules/generated/sklearn.cluster.AgglomerativeClustering.html
- DBSCAN: http://scikit-learn.org/stable/modules/generated/sklearn.cluster.DBSCAN.html
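
The three scikit-learn estimators linked above share the same fit_predict interface, so a small side-by-side run may help show how they are called. This is only a minimal sketch: the toy data and the parameter values (n_clusters=2, eps=0.5, min_samples=3) are assumptions chosen for illustration, not recommendations.

```python
import numpy as np
from sklearn.cluster import KMeans, AgglomerativeClustering, DBSCAN

# Toy data (an assumption for illustration): two tight, well-separated groups.
X = np.array([
    [0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.0, 0.2],
    [5.0, 5.0], [5.1, 5.2], [5.2, 5.1], [5.0, 5.2],
])

# K-means: the number of clusters has to be chosen up front.
kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Agglomerative (hierarchical) clustering; Ward linkage is the default.
hier_labels = AgglomerativeClustering(n_clusters=2).fit_predict(X)

# DBSCAN: density-based, no cluster count needed; label -1 would mark noise.
# eps and min_samples are tuned to this toy data only.
dbscan_labels = DBSCAN(eps=0.5, min_samples=3).fit_predict(X)

print("K-means:     ", kmeans_labels)
print("Hierarchical:", hier_labels)
print("DBSCAN:      ", dbscan_labels)
```

On data this clean all three methods should recover the same two groups, although the numeric label given to each group is arbitrary and can differ between methods.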

Orange:

- K-means: http://orange.biolab.si/docs/latest/reference/rst/Orange.clustering.kmeans.html
- Hierarchical: http://orange.biolab.si/docs/latest/reference/rst/Orange.clustering.hierarchical.html

mlpy:

- K-means and hierarchical: http://mlpy.sourceforge.net/docs/3.3/cluster.html

- An independent implementation of DBSCAN: http://iamtawit.blogspot.in/2012/12/dbscan.html
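
For readers curious what a standalone DBSCAN looks like, here is a minimal pure-Python sketch. It is not the code from the post linked above; the function name dbscan, its parameters eps and min_pts, and the sample points are all hypothetical choices made for this illustration. It uses a brute-force neighbor search, so it is only suitable for small data sets.

```python
from math import dist  # Euclidean distance, available in Python 3.8+

NOISE = -1


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE (-1) marks outliers."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbors(i):
        # Brute-force region query: O(n) per call, O(n^2) overall.
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may be reclaimed later as a border point
            continue
        cluster += 1  # point i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not core
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is core, so keep expanding
    return labels


# Hypothetical sample: two square groups and one far-away outlier.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The neighbor count includes the point itself, which matches the convention used by scikit-learn's min_samples parameter.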

Text Mining:
