WPI Worcester Polytechnic Institute

Computer Science Department
------------------------------------------

Knowledge Discovery and Data Mining Research Group 
KDDRG

Miscellaneous Notes on Python 

Much of the text and material posted on this page was produced by Ahmedul Kabir (thanks, Kabir!)
------------------------------------------

General Information about Python:


Python Tutorials:


Python Books:


Python Environments:


Python Data Mining Packages:

Python has many open-source packages for Data Mining and Knowledge Management. Here is a list of the most widely used ones, along with brief descriptions:

Note: Packages published on the Python Package Index (PyPI) can be searched there by name or keyword.


Data Preprocessing:


Model Evaluation:


Decision Trees:


Linear Regression:


Regression using Trees:


Association Rules:


Clustering:

Scikit-learn:
- K-means: http://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html
- Hierarchical: http://scikit-learn.org/stable/modules/generated/sklearn.cluster.AgglomerativeClustering.html
- DBSCAN: http://scikit-learn.org/stable/modules/generated/sklearn.cluster.DBSCAN.html
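
As a rough illustration of how the three scikit-learn estimators linked above might be called, here is a minimal sketch. The toy data, the variable names, and the parameter values (n_clusters=2, eps=1.0, min_samples=2) are assumptions chosen for this example only; consult the linked pages for the full set of options.

```python
import numpy as np
from sklearn.cluster import KMeans, AgglomerativeClustering, DBSCAN

# Toy data (assumed for illustration): two tight groups plus one far-away point.
X = np.array([
    [1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
    [8.0, 8.0], [8.1, 7.9], [7.9, 8.2],
    [50.0, 50.0],
])
groups = X[:6]  # the two tight groups, without the far-away point

# K-means: the number of clusters must be chosen up front.
kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(groups)

# Hierarchical (agglomerative) clustering, cut at two clusters.
hier_labels = AgglomerativeClustering(n_clusters=2).fit_predict(groups)

# DBSCAN: no cluster count needed; points in no dense region get the label -1.
dbscan_labels = DBSCAN(eps=1.0, min_samples=2).fit_predict(X)

print("K-means:     ", kmeans_labels)
print("Hierarchical:", hier_labels)
print("DBSCAN:      ", dbscan_labels)
```

The cluster numbers returned by K-means and hierarchical clustering are arbitrary identifiers, so only the grouping is meaningful, not which group happens to be called 0 or 1.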

Orange:
- K-Means: http://orange.biolab.si/docs/latest/reference/rst/Orange.clustering.kmeans.html
- Hierarchical: http://orange.biolab.si/docs/latest/reference/rst/Orange.clustering.hierarchical.html

MLPy:
- K-means and hierarchical: http://mlpy.sourceforge.net/docs/3.3/cluster.html

Other:
- An independent implementation of DBSCAN: http://iamtawit.blogspot.in/2012/12/dbscan.html
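
To give a sense of what a standalone DBSCAN implementation involves, here is a minimal pure-Python sketch. It is not the code from the post linked above; the function names (region_query, dbscan), the NOISE constant, and the brute-force neighbor search are assumptions made for this illustration, and it requires Python 3.8 or later for math.dist.

```python
from math import dist

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)  # None means "not yet visited"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1  # points[i] is a core point, so it starts a new cluster
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not core
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point, keep expanding
    return labels


# Toy data (assumed for illustration): two tight groups plus one isolated point.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (8.0, 8.0), (8.1, 7.9), (7.9, 8.2),
          (50.0, 50.0)]
labels = dbscan(points, eps=1.0, min_pts=2)
print(labels)  # [0, 0, 0, 1, 1, 1, -1]
```

The brute-force region_query makes this sketch quadratic in the number of points; library implementations typically use a spatial index to speed up the neighbor search.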

Text Mining:

