What is a convolutional neural network (CNN)?

 What is a convolutional neural network (CNN)?


Defining convolutional neural networks (or CNNs) is essential to understanding their role in computer vision. They are revolutionizing the way we process images and visual features.

Traditional image classification methods

Traditionally, to classify images, image processing techniques were used which required the features of each image to be extracted from the dataset. A classifier was then trained on these features. The performance of this process depended largely on the quality of the extracted features. Several feature extraction and description methods were used, such as the SIFT algorithm, which was popular for its ability to detect and describe relevant features.

However, it's important to note that, in practice, perfect classification is rarely achieved, as classification error is never zero. To improve results, we could either develop new feature extraction methods specific to the images under study or opt for a "better" classifier.

The Emergence of AlexNet: A Revolution in 2012

However, in 2012, a major revolution occurred at the annual ILSVRC computer vision competition: a new Deep Learning algorithm called AlexNet set new records. Convolutional neural networks take a similar approach to traditional supervised learning methods for image classification. They take images as input, automatically detect their features, and train a classifier on these features. The crucial difference is that CNNs automatically learn these features during the training phase.

The Fundamentals of Convolutional Neural Networks


During training, the aim is to minimize the classification error by adjusting both the classifier parameters and the features extracted from the images. In addition, the specific architecture of CNNs enables the extraction of features of varying complexity, from the simplest to the most sophisticated. This capacity for automatic, hierarchical feature extraction, adapted to the given problem, is one of the major strengths of convolutional neural networks. No need to manually implement extraction algorithms such as SIFT or Harris-Stephens.

Unlike traditional supervised learning methods, where features are extracted manually, CNNs learn features directly from images, which is their distinctive advantage. Convolutional neural networks, also known as CNNs or ConvNet, remain today's most powerful models for image classification.

CNN Architecture: Automatic Feature Extraction and the CNN Classification Process

The distinctive feature of CNNs is their multi-block architecture. The first block functions as a feature extractor, using convolution filtering operations to perform template matching on the image. The extracted features are then normalized and resized.


This process can be repeated several times, applying new convolution kernels to obtain new features to normalize and resize. Finally, the values of the last features are concatenated into a vector. This vector is the output of the first block and the input of the second.

The second block, although not specific to CNNs, is generally found at the end of all neural networks used for classification. The values of the input vector are transformed using linear combinations and activation functions to produce a new output vector. This latter vector represents the probability of the image belonging to each class. Each element of this vector is a probability between 0 and 1, and the sum of all elements equals 1. These probabilities are calculated by the last layer of the network, which uses a logistic function for binary classification or a softmax function for multi-class classification as the activation function.

As with conventional neural networks, layer parameters are adjusted by gradient backpropagation during the training phase. However, in the case of CNNs, these parameters are mainly linked to image characteristics.

Today, convolutional neural networks, also known as CNNs or ConvNet, remain the models of choice for image classification, thanks to their ability to automatically learn features from raw data. We'll be coming back to this topic shortly for further clarification and details.

See also: 

The Future of Humanity and Artificial Intelligence (AI)



Comments