One common use of machine learning is binary classification: looking at an input and predicting which of two classes it belongs to. Sentiment analysis and credit-card fraud detection are practical examples. Such models are trained with datasets labeled with 1s and 0s representing the two classes, and they are often built with machine-learning libraries.

Deep learning can be used for binary classification, too. Building a neural network that acts as a binary classifier is little different from building one that acts as a regressor. In this post, you will learn how to use Keras to build binary classifiers. In my next post, I will show you how to create deep-learning models that perform multiclass classification.

## Building a Binary Classifier

In the previous post, you learned how to build a neural network to solve a regression problem. That network had an input layer that accepted the distance to travel, the hour of the day, and the day of the week, and an output layer that predicted a taxi fare. Here is how that network was defined:

model = Sequential()

Two simple changes are needed to turn that network into one that performs binary classification:

• Add the sigmoid activation function to the output layer. Sigmoid squeezes the output to a value from 0.0 to 1.0 that represents a probability. See my earlier post for an explanation of what a sigmoid function does.
• Change the loss function to binary_crossentropy, which is purpose-built for binary classifiers, and change the metric to accuracy so that the history object returned by fit captures the accuracy measured in each epoch.
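As a rough illustration of the sigmoid function mentioned in the first bullet, here is a minimal sketch in plain Python. The function name and the sample inputs are my own choices for illustration, and Keras uses its own implementation rather than this code.

```python
import math

def sigmoid(z):
    """Map any real number to a value between 0.0 and 1.0."""
    return 1.0 / (1.0 + math.exp(-z))

# Large negative inputs approach 0, large positive inputs approach 1,
# and an input of 0 lands exactly in the middle.
for z in (-4.0, 0.0, 4.0):
    print(f"sigmoid({z:+.1f}) = {sigmoid(z):.3f}")
```

Whatever value the last layer computes, passing it through sigmoid yields a number that can be read as a probability.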

Here is a network designed to perform binary classification rather than regression:

model = Sequential()

That is it. That is all it takes to create a neural network that serves as a binary classifier. You still call fit to train the network, and you use the returned history object to plot the training and validation accuracy to see how well the network fit the data.

You build a neural network that performs binary classification by including a single neuron with sigmoid activation in the output layer and specifying binary_crossentropy as the loss function. The network's output is the probability, from 0.0 to 1.0, that the input belongs to the positive class. It doesn't get much simpler than that!
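For reference, a network of this kind might look roughly like the sketch below. The number of input features (2), the width of the hidden layer (128), and the adam optimizer are assumptions made for illustration; they are not necessarily the values used in the listings in this post.

```python
from keras import Input
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Input(shape=(2,)))               # two input features (assumed)
model.add(Dense(128, activation='relu'))   # hidden-layer width is an assumption
model.add(Dense(1, activation='sigmoid'))  # single neuron that outputs a probability
model.compile(optimizer='adam',            # optimizer choice is an assumption
              loss='binary_crossentropy',
              metrics=['accuracy'])
model.summary()
```

The only parts specific to binary classification are the sigmoid activation on the final layer and the binary_crossentropy loss; everything else could equally belong to a regression network.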

The cross-entropy loss function increases the penalty for wrong outputs to drive the weights and biases more aggressively in the right direction.

Suppose a sample belongs to the positive class (its label is 1) and the network predicts a probability close to 1, such as 0.9, that it is a 1. The cross-entropy loss, also known as log loss, is -log(0.9), which is close to zero. But suppose the network outputs a probability of 0.1 for the same sample. Now the error is -log(0.1), which equals 1 when base-10 logarithms are used. The further the predicted probability is from the truth, the higher the penalty. If the sample is a 1 and the network says the probability that it is a 1 is a mere 0.0001, the cross-entropy loss is -log(0.0001), or 4. Cross-entropy loss pats the optimizer on the back when it is close to the right answer and slaps it on the hand when it is not. The worse the prediction, the harder the slap.
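The arithmetic behind cross-entropy loss can be checked with a few lines of Python. This is a sketch for a single sample, with a helper name of my own choosing. It prints the loss with base-10 logarithms and with natural logarithms, which is what Keras's binary_crossentropy uses, so losses reported during training are larger by a factor of about 2.3.

```python
import math

def binary_cross_entropy(y_true, p, log=math.log):
    """Loss for one sample: y_true is 0 or 1, p is the predicted
    probability of class 1 and must lie strictly between 0 and 1."""
    return -(y_true * log(p) + (1 - y_true) * log(1 - p))

# A positive sample (label 1) scored with increasingly wrong probabilities.
for p in (0.9, 0.1, 0.0001):
    base10 = binary_cross_entropy(1, p, log=math.log10)
    natural = binary_cross_entropy(1, p)
    print(f"p = {p}: base-10 loss = {base10:.2f}, natural-log loss = {natural:.2f}")
```

Whichever base is used, the loss grows without bound as the predicted probability for the correct class approaches zero.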

## Making Predictions

A neural network can easily fit non-linear data. Because the network itself is the learning model, you don't have to worry about trying different learning algorithms. Consider a dataset in which each data point consists of an x-y coordinate pair and belongs to one of two classes. The following network is trained to predict a class based on a point's x and y coordinates:

model = Sequential()
hist = model.fit(x, y, epochs=40, batch_size=10, validation_split=0.2)

The network has just one hidden layer, yet a plot of the training and validation accuracy shows that it is successful in separating the classes. Once the network is trained, use the predict method to make predictions. Thanks to the sigmoid activation function, predict returns a number from 0.0 to 1.0 representing the probability that the input belongs to the positive class. In this example, purple data points represent the negative class and red data points represent the positive class. Here, the network is asked to predict the probability that a data point at (-0.5, 0.0) belongs to the red class:

model.predict(np.array([[-0.5, 0.0]]))

The answer is 0.57, which indicates that (-0.5, 0.0) is more likely to be red than purple. If you simply want to know which class the point belongs to, do it this way:

(model.predict(np.array([[-0.5, 0.0]])) > 0.5).astype('int32')

The answer is 1, which is red. Older versions of Keras included a predict_classes method that did the same without the astype cast, but that method has since been removed.
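The thresholding step can be tried without a trained model. In the sketch below, the probabilities are made-up values standing in for the output of predict, which returns one row per sample and one column for the sigmoid neuron.

```python
import numpy as np

# Hypothetical probabilities standing in for the output of model.predict.
probabilities = np.array([[0.57], [0.12], [0.93], [0.50]])

# Comparing against 0.5 yields booleans; astype converts them to 0s and 1s.
predicted_classes = (probabilities > 0.5).astype('int32')

print(predicted_classes.ravel())  # [1 0 1 0]
```

Note that a probability of exactly 0.5 maps to class 0 because the comparison is strict. The 0.5 cutoff is a convention rather than a requirement; raising or lowering it trades one kind of error for the other.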

## Detecting Credit-Card Fraud with a Neural Network

Let's build a neural network that detects credit-card fraud. Begin by downloading the ZIP file containing the dataset and extracting creditcard.csv from it. (I zip it up before checking it in because the file is larger than the 100 MB limit.) It is the same dataset I presented in my post on PCA-based anomaly detection. It contains information about 284,808 actual credit-card transactions, including the amount of each transaction and a label: 0 for legitimate transactions and 1 for fraudulent transactions. It also contains 28 columns named "V1" through "V28" whose meaning has been obfuscated with principal component analysis. The dataset contains 492 examples of fraudulent transactions. Drop creditcard.csv into the directory where your Jupyter notebooks are hosted. Then use the following code to load the dataset:

import pandas as pd

Use the following statements to split the dataset into two parts: one for training and one for testing. Rather than let Keras do the split, we will do it ourselves so that we can later run the test data through the network and use a confusion matrix to analyze the results:

from sklearn.model_selection import train_test_split

x = df.drop(['Time', 'Class'], axis=1)
y = df['Class']

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, stratify=y, random_state=0)
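The stratify=y argument matters with a dataset this unbalanced, because it keeps the proportion of fraudulent samples the same in the training and test sets. The sketch below illustrates the idea in plain Python on toy labels. The stratified_split helper is hypothetical and written for illustration; it is not how scikit-learn implements train_test_split.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_size=0.2, seed=0):
    """Return (train_indices, test_indices), splitting each class in the same ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_size)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return train, test

# Toy labels with a similar kind of imbalance: 990 legitimate, 10 fraudulent.
labels = [0] * 990 + [1] * 10
train_idx, test_idx = stratified_split(labels)

print(sum(labels[i] for i in train_idx), "fraud cases in training")  # 8
print(sum(labels[i] for i in test_idx), "fraud cases in testing")    # 2
```

Without stratification, a purely random split of so few positive samples could leave the test set with too few fraud cases to evaluate the model meaningfully.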

Now create a neural network for binary classification:

from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.summary()

The next step is to train the model. The test data split off from the larger dataset is used to assess the model's accuracy as training takes place:

hist = model.fit(x_train, y_train, validation_data=(x_test, y_test), epochs=10, batch_size=100)

Plot the training and validation accuracy using the per-epoch values.

import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
sns.set()

acc = hist.history['accuracy']
val = hist.history['val_accuracy']
epochs = range(1, len(acc) + 1)

plt.plot(epochs, acc, '-', label='Training accuracy')
plt.plot(epochs, val, ':', label='Validation accuracy')
plt.title('Training and Validation Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend(loc='lower right')
plt.plot()

Here is how it looked for me. Your results will differ because of the randomness inherent in training neural networks. On the surface, the validation accuracy appears to be very high. But remember that we are dealing with an unbalanced dataset: fraudulent transactions represent less than 0.2% of the samples, so the model could simply guess that every transaction is legitimate and be right about 99.8% of the time. Use a confusion matrix to see how the model performs on the test data:
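The baseline figure follows directly from the counts quoted earlier in this post, as this short calculation shows. The variable names are my own.

```python
total_transactions = 284_808   # figure quoted in this post
fraudulent = 492

fraud_rate = fraudulent / total_transactions
baseline_accuracy = 1 - fraud_rate   # accuracy of always answering "legitimate"

print(f"Fraud rate: {fraud_rate:.3%}")                 # 0.173%
print(f"Baseline accuracy: {baseline_accuracy:.3%}")   # 99.827%
```

Any accuracy figure for this dataset should be judged against that baseline rather than against 50%.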

from sklearn.metrics import confusion_matrix

y_predicted = model.predict(x_test) > 0.5
mat = confusion_matrix(y_test, y_predicted)
labels = ['Legitimate', 'Fraudulent']

sns.heatmap(mat, square=True, annot=True, fmt='d', cbar=False, cmap='Blues',
            xticklabels=labels, yticklabels=labels)

plt.xlabel('Predicted label')
plt.ylabel('Actual label')

You can't use Scikit-learn's plot_confusion_matrix function with a Keras model, but you can use confusion_matrix to compute the matrix and plot it yourself. Here is how it turned out for me. Your results will probably differ, and they will vary each time you train the model. In this run, the model misclassified legitimate transactions just 4 times, which means legitimate transactions were classified correctly more than 99.99% of the time. Meanwhile, the model caught 70% of the fraudulent transactions. That is an acceptable trade-off, because credit-card companies would rather allow 100 fraudulent transactions to go through than decline one legitimate transaction; the latter creates unhappy customers.
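Percentages like these can be read straight off a confusion matrix. The sketch below assumes the layout that confusion_matrix returns for labels 0 and 1 (rows are actual classes, columns are predicted classes). The helper name is my own, and the counts are hypothetical values chosen for illustration, not the results from my run.

```python
def recall_and_specificity(matrix):
    """matrix is [[tn, fp], [fn, tp]]: rows are actual, columns are predicted."""
    (tn, fp), (fn, tp) = matrix
    recall = tp / (tp + fn)        # share of fraudulent transactions caught
    specificity = tn / (tn + fp)   # share of legitimate transactions passed
    return recall, specificity

# Hypothetical counts for illustration only.
example = [[56860, 4],
           [29, 69]]
recall, specificity = recall_and_specificity(example)
print(f"Fraud caught: {recall:.1%}")                     # 70.4%
print(f"Legitimate classified correctly: {specificity:.3%}")  # 99.993%
```

For unbalanced problems like this one, these two numbers say far more about the model than overall accuracy does.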

## Get the Code

You can download a Jupyter notebook containing the fraud-detection example, and you can check out the other notebooks in the same repo. I am constantly uploading new samples and updating existing ones, so be sure to check back from time to time.