Support Vector Machines: A Step-by-Step Introduction
What was the motivation for this tutorial?
This tutorial is my attempt to present SVMs in a way that anyone with a passing knowledge of computer science can understand and use.
What is a support vector machine?
A support vector machine is a supervised machine learning algorithm used for data classification and estimating the relationships between variables (regression analysis). It’s a “supervised” algorithm because there’s an initial training phase involved where you feed the algorithm data that has already been classified (labeled). After this initial training phase is completed, future data sets given to the algorithm can be classified with no or minimal human intervention.
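To make the train-then-classify flow concrete, here is a minimal sketch of a linear classifier trained with the hinge loss, the loss function that underlies SVMs. It is not how libsvm works internally. The toy data, the function names (fit_linear_svm, predict) and the settings (epochs, lr, lam) are all assumptions chosen for illustration.

```python
# Labeled training data: each entry is ((feature_1, feature_2), label),
# where the label is +1 or -1. These points are made up for illustration.
train = [
    ((2.0, 3.0), 1), ((3.0, 3.5), 1), ((2.5, 4.0), 1),
    ((-2.0, -1.0), -1), ((-3.0, -2.5), -1), ((-1.5, -3.0), -1),
]

def fit_linear_svm(data, epochs=200, lr=0.01, lam=0.01):
    """Training phase: subgradient descent on the regularized hinge loss."""
    w = [0.0, 0.0]  # one weight per feature
    b = 0.0         # bias term
    for _ in range(epochs):
        for x, y in data:
            margin = y * (w[0] * x[0] + w[1] * x[1] + b)
            # Regularization step: shrink the weights slightly.
            w = [wi - lr * lam * wi for wi in w]
            # Hinge-loss step: only points inside the margin move the separator.
            if margin < 1:
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def predict(w, b, x):
    """Classification phase: report which side of the separator x falls on."""
    return 1 if w[0] * x[0] + w[1] * x[1] + b >= 0 else -1

w, b = fit_linear_svm(train)

# New, unlabeled points are classified without further human input.
predictions = [predict(w, b, x) for x in [(3.0, 2.0), (-2.0, -2.0)]]
print(predictions)
```

The labeled points play the role of the training file, and the two new points play the role of future data handed to the trained model.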
What are the advantages of using a support vector machine?
Many learning algorithms can only do linear classification, using a straight line to separate the data points. But there are algorithms, support vector machines being one of them, that can also do non-linear classification using a kernel method.
A kernel method, in short, uses a function (the kernel) that implicitly maps the data points into a higher-dimensional space, for example from a 2-dimensional plane into 3-dimensional space. There, instead of a line, the separator is a flat surface called a hyperplane. The hyperplane is still linear in the higher-dimensional space, but when viewed from the original space, the boundary it produces can take nonlinear forms. Nonlinear classification provides a way to classify complex data sets that can’t be separated by a straight line.
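To make the kernel idea concrete, here is a small sketch using a degree-2 polynomial kernel, chosen only because its mapping into 3-dimensional space can be written out by hand. The sample points, the hand-picked hyperplane (w, b) and the helper names (phi, poly_kernel, side) are assumptions for illustration. A real SVM would learn the hyperplane from the training data.

```python
import math

def dot(a, b):
    """Plain dot product of two equal-length tuples."""
    return sum(ai * bi for ai, bi in zip(a, b))

def phi(p):
    """Explicit map from 2-D to 3-D for the degree-2 polynomial kernel."""
    x1, x2 = p
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def poly_kernel(p, q):
    """Kernel computed directly in 2-D; equals dot(phi(p), phi(q))."""
    return dot(p, q) ** 2

# Points near the origin (label +1) surrounded by points farther out (label -1).
# No straight line in the 2-D plane separates the two groups.
inner = [(0.5, 0.0), (0.0, -0.5), (-0.3, 0.4)]
outer = [(2.0, 0.0), (0.0, -2.0), (-1.5, 1.5)]

# In the 3-D space, the flat plane z1 + z3 = 1 separates them.
# Seen from 2-D it is the circle x1^2 + x2^2 = 1, a nonlinear boundary.
w, b = (1.0, 0.0, 1.0), -1.0

def side(p):
    """Return +1 for points inside the circle and -1 for points outside."""
    return 1 if dot(w, phi(p)) + b < 0 else -1

print([side(p) for p in inner], [side(p) for p in outer])

# The kernel gives the 3-D dot product without ever building the 3-D vectors.
p, q = (1.0, 2.0), (3.0, 4.0)
print(poly_kernel(p, q), dot(phi(p), phi(q)))
```

The last two printed values agree up to floating-point rounding, which is why a kernel lets the algorithm work in the higher-dimensional space without computing the mapped coordinates.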
What’s the fastest way to get started with support vector machines?
How do I use libsvm?
• For Windows: Download and extract the libsvm zip file, and move the files from its windows folder into the folder where your data files are located.
• For Ubuntu Linux: Enter sudo apt-get install libsvm-tools into the Terminal.
• In the command line (cmd in Windows, Terminal in Linux), run the svm-train executable on your training data (e.g. a1a; rename this file to a1a.train for convenience) to create a model of your data. By default, the model is written to a file named after the training file with .model appended. Example: svm-train a1a.train
• Run the svm-predict executable on the test data (e.g. a1a.t; rename this file to a1a.test for convenience) and on the model file created by svm-train. End the command with the name of the predictions output file you’d like it to create. Example: svm-predict a1a.test a1a.train.model a1a.out
• The 1 and -1 classification labels in the output file correspond, line by line, to the data entries in the test file. The classifier learned these labels from the training file.
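Because the output file lines up with the test file line by line, the predictions can be checked against the labels the test file already contains. The sketch below assumes the libsvm data format, in which each line starts with the label followed by index:value pairs, and it uses a few made-up lines in place of the real a1a.test and a1a.out files. The helper name read_labels is hypothetical.

```python
def read_labels(lines):
    """Return the leading label of each non-empty line as an int."""
    return [int(float(line.split()[0])) for line in lines if line.strip()]

# Stand-in for the contents of a1a.test: label, then index:value pairs.
# These four lines are invented for illustration.
test_lines = [
    "-1 3:1 11:1 14:1",
    "+1 5:1 7:1 14:1",
    "-1 2:1 6:1 18:1",
    "+1 4:1 9:1 20:1",
]

# Stand-in for the contents of a1a.out: one predicted label per line.
prediction_lines = ["-1", "1", "1", "1"]

actual = read_labels(test_lines)
predicted = read_labels(prediction_lines)

# Line i of the output file is the prediction for line i of the test file.
matches = sum(a == p for a, p in zip(actual, predicted))
accuracy = matches / len(actual)
print(f"{matches} of {len(actual)} correct ({accuracy:.0%})")
```

To run this on real files, the two lists could be replaced with open("a1a.test").readlines() and open("a1a.out").readlines(), assuming the files were renamed as described above.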
Where should I go if I want to learn more?
• Knowledge Discovery with Support Vector Machines by Lutz H. Hamel