Getting started
===============

.. _mlpy: http://mlpy.sourceforge.net/
.. _libsvm: http://www.csie.ntu.edu.tw/~cjlin/libsvm/

Prerequisites
*************

.. _Prerequisites:

This package depends on a number of other software packages.
All development and testing has been done on Linux (Ubuntu).
It should in principle work on most other operating systems as well.
Please refer to the individual software packages for installation
procedures.

python
    Python itself is required. It is recommended to install the
    interactive interpreter ipython as well, which provides useful
    features like command history, on-line help with paging,
    and identifier completion.
An SQL database server
    This is required for the :mod:`pysteg.sql` subpackage.
    Only PostgreSQL has been used in testing and examples, and it is
    therefore recommended. The command line client ``psql`` is useful
    for inspecting database contents.
    The subpackage should work with other database systems supported
    by the SQLObject API, but this has not been tested.
SQLObject
    This is required for the :mod:`pysteg.sql` subpackage.
    A recent version of the python library :mod:`SQLObject` is needed.
    Version 0.12.4 has been used in testing.
    The version included in Ubuntu 11.10 is insufficient.
PIL
    The python imaging library.
numpy/scipy/pywt
    The :mod:`numpy` and :mod:`scipy` libraries for python provide
    functions and data types for scientific computing, and matrix
    algebra in particular. The :mod:`pywt` package provides wavelet
    operations. The default versions available in Ubuntu work fine.
mlpy
    The python Machine Learning library, mlpy_, provides access to
    libSVM as well as other classifiers. We have used version 3.5.0;
    the Ubuntu package python-mlpy provides an outdated version,
    so you will probably have to install :mod:`mlpy` from source.
    At the time of writing, the head of the mlpy mercurial repo
    is incompatible with the published tarball (3.5.0), so you will
    have to use the tarball. Hopefully, this will change.

    Versions of pysteg up to 0.96 inclusive used libSVM directly,
    so for these versions, libsvm_ should be installed instead
    of :mod:`mlpy`.

Download and installation
*************************

Note that the system is under development and not bug free at the
moment. If you want to use it, please get in touch. I am currently
working on the system, and would be only too happy to receive any input.

Tarballs are available for download.

__ http://www.ifs.schaathun.net/pysteg/pysteg1.0.tar.gz

* Version 1.0__ (2012-01-01) fixes a number of bugs and makes the code
  in the SQL module more readable. Ensemble classifiers have not yet
  been tested. We have tried to write useful documentation, but it has
  not been proofread and should not be expected to be complete.

__ http://www.ifs.schaathun.net/pysteg/pysteg0.98.tar.gz

* Version 0.98__ sped up feature calculation by reducing the number
  of SQL queries made.

__ http://www.ifs.schaathun.net/pysteg/pysteg0.97.tar.gz

* Version 0.97__ switched to using the :mod:`mlpy` interface.
  It has roughly the same functionality as Version 0.96; i.e. SVM is
  still the only supported classifier, but the hooks to extend it to
  other classifiers are there. There is a serious bottleneck in the
  SQL server connection because too many small queries are made.

__ http://www.ifs.schaathun.net/pysteg/pysteg0.96.tar.gz

* Version 0.96__ is the last version to use the libSVM API,
  supporting SVM as the only classifier algorithm.

Unfortunately, easy-to-use installation instructions remain on the
TODO list. The tarball includes a set of packages and scripts.
Whether you put them somewhere in your path or run them via absolute
pathnames is up to you.

My own system runs Ubuntu with Python 2.7, and this works fairly well,
as long as one pays attention to the dependencies above.
Meeting the dependencies on Mac OS is hard, and I have not had the
patience to do it. Windows has not been attempted.
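Since there is no installer to check the dependencies, a short script
can report which of the python libraries listed under Prerequisites
cannot be imported. The following is only a sketch: the helper
``missing_modules`` is hypothetical and not part of pysteg, the import
names in ``REQUIRED`` are assumptions about how the libraries are
installed on your system, and no version numbers are checked.

```python
import importlib

# Assumed import names for the libraries listed under Prerequisites.
# Adjust them if your installation uses different names.
REQUIRED = ["numpy", "scipy", "pywt", "PIL", "sqlobject", "mlpy"]


def missing_modules(names):
    """Return the subset of `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing python libraries: " + ", ".join(missing))
    else:
        print("All listed python libraries can be imported.")
```

A successful import does not guarantee a sufficient version; SQLObject
and mlpy in particular should be checked against the versions
mentioned above.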
Compilation
-----------

The :mod:`pysteg.jpeg` package contains C code which must be compiled.
The jpegObject module depends on the numpy C-API, which is relatively
recent and whose support has changed in recent versions of Python.
The Makefile has been written for Python 2.7 on Ubuntu, and on this
system it is sufficient to run ``make all`` from the root directory.

Acknowledgement
***************

The :mod:`pysteg` package aggregates code from many sources.

jpeglib
    JPEG encoding/decoding library from IJG. This is used in the jpeg
    subpackage, and has been modified only to avoid conflicts with
    header files used by the Python C-API. It is otherwise unmodified
    and has a more permissive licence.
sompy
    A self-organising map module by Kyle Dickerson and others was the
    basis for the itml.som module. This is used under the GPL licence
    (version not specified), and the module may be redistributed
    separately under any version of GPL. No other module currently
    depends on this.
gaussian_kde
    The itml.kde module is based on scipy.stats.gaussian_kde by
    Robert Kern. It is used under the two-clause BSD licence used
    by scipy.
libsvm
    The libsvm library of Chih-Chung Chang and Chih-Jen Lin is used,
    and the binary libraries must be acquired and compiled separately.
    The python API has been included and modified to make more of the
    code natively python.

Please see the appropriate copyright and licence notices included with
the respective components.

Licence and Copyright
*********************

Apart from the third-party code included as mentioned above,
the copyright to different components of the code is held by
the University of Surrey, Høgskolen i Ålesund, or the author.
See individual files for details.

The software is made available for use under the GNU General Public
Licence Version 3. This software is distributed for your use, in the
hope that it will be useful, but WITHOUT ANY WARRANTY.
If you publish results based on the use of the software, please cite the distribution website http://www.ifs.schaathun.net/pysteg/.