Recognizing Handwritten Digits With Scikit-Learn
Handwritten digit recognition is the task of taking an image of a digit and identifying which digit it shows.
Let’s start by importing libraries.
from sklearn import svm, datasets, metrics
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
%matplotlib inline
Load the digits dataset using the load_digits() function.
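As a minimal sketch of this step, the snippet below loads the dataset and prints the shapes of the image and label arrays. The variable name digits is chosen here for illustration.

```python
from sklearn import datasets

# Load the bundled 8x8 handwritten digits dataset.
digits = datasets.load_digits()

# images holds one 8x8 grayscale array per sample; target holds the labels 0-9.
print(digits.images.shape)  # (1797, 8, 8)
print(digits.target.shape)  # (1797,)
```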
Classification:
To apply a classifier on this data, we need to flatten the images, turning each 2-D array of grayscale values from shape (8, 8) into shape (64,). Subsequently, the entire dataset will be of shape (n_samples, n_features), where n_samples is the number of images and n_features is the total number of pixels in each image.
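The flattening described above can be sketched with a single NumPy reshape. The name data is an assumption made for this example.

```python
from sklearn import datasets

digits = datasets.load_digits()

# Flatten each (8, 8) image into a vector of 64 pixel values.
# -1 lets NumPy infer the second dimension (8 * 8 = 64).
n_samples = len(digits.images)
data = digits.images.reshape((n_samples, -1))

print(data.shape)  # (n_samples, n_features)
```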
The C parameter of the support vector classifier controls the trade-off between margin width and training error. For large values of C, the optimization will choose a smaller-margin hyperplane if that hyperplane does a better job of getting all the training points classified correctly. Conversely, a very small value of C will cause the optimizer to look for a larger-margin separating hyperplane, even if that hyperplane misclassifies more points.
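To see the effect of C in practice, one option is to fit the same classifier with several values and compare test accuracy. The sketch below is illustrative only: the C values, gamma=0.001, and the unshuffled 50/50 split are assumptions made for this example, not settings taken from the original walkthrough.

```python
from sklearn import datasets, svm
from sklearn.model_selection import train_test_split

digits = datasets.load_digits()
data = digits.images.reshape((len(digits.images), -1))

# Assumed split: first half for training, second half for testing.
X_train, X_test, y_train, y_test = train_test_split(
    data, digits.target, test_size=0.5, shuffle=False
)

# Fit one RBF-kernel SVC per candidate C and record its test accuracy.
scores = {}
for C in (0.01, 1.0, 100.0):
    clf = svm.SVC(C=C, gamma=0.001)
    clf.fit(X_train, y_train)
    scores[C] = clf.score(X_test, y_test)
    print(f"C={C}: test accuracy = {scores[C]:.3f}")
```

In a real project, C would normally be chosen by cross-validation on the training set rather than by comparing test scores.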
classification_report builds a text report showing the main classification metrics.
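A minimal sketch of producing such a report is shown below. The classifier settings and the train/test split are assumptions for illustration, so the numbers it prints may differ from those behind the results in this article.

```python
from sklearn import datasets, metrics, svm
from sklearn.model_selection import train_test_split

digits = datasets.load_digits()
data = digits.images.reshape((len(digits.images), -1))

# Assumed split and hyperparameters, chosen only for this example.
X_train, X_test, y_train, y_test = train_test_split(
    data, digits.target, test_size=0.5, shuffle=False
)
clf = svm.SVC(gamma=0.001)
clf.fit(X_train, y_train)
predicted = clf.predict(X_test)

# Per-class precision, recall, F1-score and support, returned as a string.
report = metrics.classification_report(y_test, predicted)
print(report)
```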
Confusion matrix:
[[87 0 0 0 1 0 0 0 0 0]
[ 0 87 1 0 0 0 0 0 2 1]
[ 0 0 85 1 0 0 0 0 0 0]
[ 0 0 0 82 0 3 0 2 4 0]
[ 0 0 0 0 88 0 0 0 0 4]
[ 0 0 0 0 0 87 1 0 0 3]
[ 0 1 0 0 0 0 90 0 0 0]
[ 0 0 0 0 0 1 0 88 0 0]
[ 0 0 0 0 0 0 0 0 88 0]
[ 0 0 0 1 0 1 0 0 0 90]]
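To read the matrix above, it helps to turn it into a few summary numbers. The sketch below copies the matrix into a NumPy array and assumes it follows scikit-learn's convention (rows are true labels, columns are predicted labels), as it would if produced by metrics.confusion_matrix.

```python
import numpy as np

# Confusion matrix copied from the output above.
# Assumed convention: row = true digit, column = predicted digit.
cm = np.array([
    [87, 0, 0, 0, 1, 0, 0, 0, 0, 0],
    [0, 87, 1, 0, 0, 0, 0, 0, 2, 1],
    [0, 0, 85, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 82, 0, 3, 0, 2, 4, 0],
    [0, 0, 0, 0, 88, 0, 0, 0, 0, 4],
    [0, 0, 0, 0, 0, 87, 1, 0, 0, 3],
    [0, 1, 0, 0, 0, 0, 90, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 0, 88, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 88, 0],
    [0, 0, 0, 1, 0, 1, 0, 0, 0, 90],
])

# Overall accuracy: correct predictions (the diagonal) over all samples.
accuracy = np.trace(cm) / cm.sum()
print(f"accuracy = {accuracy:.3f}")  # 872 / 899, about 0.970

# Per-class recall: diagonal entry divided by its row total.
recall = np.diag(cm) / cm.sum(axis=1)
worst = int(np.argmin(recall))
print(f"lowest recall: digit {worst} ({recall[worst]:.3f})")  # digit 3, about 0.901
```

Under that convention, the digit 3 is the hardest class here: 9 of its 91 test images were predicted as 5, 7, or 8.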