In machine learning (ML), overfitting is the situation when a model does not generalize well from the training data to unseen data. As you might know, it is one of the trickiest obstacles in applied machine learning.
The first step in tackling this problem is to actually know that your model is overfitting. That is where proper cross-validation comes in.
After identifying the problem, you can prevent it from happening by applying regularization or training with more data. Still, sometimes you might not have additional data to add to your initial dataset. Acquiring and labeling additional data points may also be the wrong path. Of course, in many cases it will deliver better results, but in terms of work, it is often time-consuming and expensive.
Data augmentation is a technique that can be used to artificially expand the size of a training set by creating modified data from the existing data. It is good practice to use DA if you want to prevent overfitting, if the initial dataset is too small to train on, or even if you want to squeeze better performance from your model.
Let's make this clear: data augmentation is not only used to prevent overfitting. In general, having a large dataset is crucial for the performance of both ML and Deep Learning (DL) models. However, we can improve the performance of the model by augmenting the data we already have. It means that data augmentation is also good for enhancing the model's performance.
In general, DA is frequently used when building a DL model. That is why throughout this article, we will mostly talk about performing data augmentation with various DL frameworks. Still, you should keep in mind that you can augment the data for ML problems as well.
You can augment:
Audio
Text
Images
Any other types of data
We will focus on image augmentations, as those are the most popular ones. Nevertheless, augmenting other types of data is just as effective and easy. That is why it's good to remember some common techniques that can be used to augment data.
Data augmentation techniques
We can apply various changes to the initial data. For example, for images, we can use:
Geometric transformations – you can randomly flip, crop, rotate, or translate images, and that is just the tip of the iceberg
Color space transformations – change RGB color channels, intensify any color
Kernel filters – sharpen or blur an image
Random Erasing – delete a part of the initial image
Mixing images – basically, mix images with one another. Might be counterintuitive, but it works
For text, there are:
Word/sentence shuffling
Word replacement – replace words with synonyms
Syntax-tree manipulation – paraphrase the sentence using the same words while keeping it grammatically correct
Moreover, the greatest advantage of these augmentation techniques is that you may use all of them at once. Thus, you can get plenty of unique data samples out of the initial ones.
Image data augmentation in Deep Learning
As mentioned above, in Deep Learning, data augmentation is a common practice. Therefore, every DL framework has its own augmentation methods or even a whole library. For example, let's see how to apply image augmentations using built-in methods in TensorFlow (TF) and Keras, PyTorch, and MxNet.
Data augmentation in TensorFlow and Keras
To augment images when using TensorFlow or Keras as our DL framework, we can:
Write our own augmentation pipelines or layers using tf.image
Use Keras preprocessing layers
Use ImageDataGenerator
tf.image
Let's take a closer look at the first technique and define a function that will visualize an image and then apply a flip to that image using tf.image. You may see the code and the result below.
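The original snippet did not survive extraction, so here is a minimal sketch of what such a function can look like. The `visualize` helper and the synthetic `image` tensor are illustrative stand-ins, not code from the original article.

```python
import tensorflow as tf
import matplotlib.pyplot as plt

def visualize(original, augmented):
    # Show the original and the augmented image side by side.
    plt.subplot(1, 2, 1)
    plt.title("Original image")
    plt.imshow(original)
    plt.subplot(1, 2, 2)
    plt.title("Augmented image")
    plt.imshow(augmented)

# A random tensor stands in for a real photo loaded from disk.
image = tf.random.uniform(shape=(180, 180, 3))
flipped = tf.image.flip_left_right(image)
visualize(image, flipped)
```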
For finer control, you can write your own augmentation pipeline. In most cases, it is useful to apply augmentations to a whole dataset, not a single image. You can implement it as follows.
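The code for this step was also lost in extraction; a common pattern is to map an augmentation function over a tf.data dataset. The sketch below assumes synthetic images and labels, and the specific transforms are illustrative.

```python
import tensorflow as tf

def augment(image, label):
    # Apply random, label-preserving transformations to each element.
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_brightness(image, max_delta=0.2)
    return image, label

# Synthetic stand-ins for a real image dataset.
images = tf.random.uniform(shape=(16, 32, 32, 3))
labels = tf.zeros(16, dtype=tf.int32)

dataset = (tf.data.Dataset.from_tensor_slices((images, labels))
           .map(augment, num_parallel_calls=tf.data.AUTOTUNE)
           .batch(8))
```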
Of course, that is just the tip of the iceberg. The TensorFlow API has plenty of augmentation techniques. If you want to read more on the topic, please check the official documentation or other articles.
Keras preprocessing
As mentioned above, Keras has a variety of preprocessing layers that may be used for data augmentation. You can apply them as follows.
data_augmentation = tf.keras.Sequential([
    layers.experimental.preprocessing.RandomFlip("horizontal_and_vertical"),
    layers.experimental.preprocessing.RandomRotation(0.2)])

image = tf.expand_dims(image, 0)

plt.figure(figsize=(10, 10))
for i in range(9):
    augmented_image = data_augmentation(image)
    ax = plt.subplot(3, 3, i + 1)
    plt.imshow(augmented_image[0])
    plt.axis("off")
Keras ImageDataGenerator
Also, you may use ImageDataGenerator (tf.keras.preprocessing.image.ImageDataGenerator), which generates batches of tensor images with real-time DA.
datagen = ImageDataGenerator(rotation_range=90)
datagen.fit(x_train)

for X_batch, y_batch in datagen.flow(x_train, y_train, batch_size=9):
    for i in range(0, 9):
        pyplot.subplot(330 + 1 + i)
        pyplot.imshow(X_batch[i].reshape(img_rows, img_cols, 3))
    pyplot.show()
    break
May be useful
Check how you can track your model-building metadata (like parameters, metrics, learning rate, hardware consumption, and more) using Neptune-Keras integration.
Data augmentation in PyTorch and MxNet
Transforms in PyTorch
The Transforms library is the augmentation part of the torchvision package, which consists of popular datasets, model architectures, and common image transformations for Computer Vision tasks.
To install Transforms, you simply need to install torchvision:
pip3 install torch torchvision
The Transforms library has a variety of augmentation techniques implemented as image transformations that can be chained together using the Compose method. Just check the official documentation and you will certainly find the augmentation for your task.
Additionally, there is the torchvision.transforms.functional module. It has various functional transforms that give fine-grained control over the transformations. It might be really useful if you are building a more complex augmentation pipeline, for example, in the case of segmentation tasks.
Besides that, Transforms doesn't have any unique features. It's used mostly with PyTorch, as it's considered that framework's built-in augmentation library.
May be useful
Check how you can track your model-building metadata (like parameters, losses, metrics, images and predictions, and more) using the Neptune-PyTorch integration.
Sample usage of PyTorch Transforms
Let's see how to apply augmentations using Transforms. You should keep in mind that Transforms works only with PIL images. That is why you should either read an image in PIL format or add the necessary conversion to your augmentation pipeline.
from torchvision import transforms as tr
from torchvision.transforms import Compose

pipeline = Compose([
    tr.RandomRotation(degrees=90),
    tr.RandomRotation(degrees=270)])

augmented_image = pipeline(img)
Sometimes you might want to write a custom Dataloader for training. Let's see how to apply augmentations via Transforms if you are doing so.
from torchvision import transforms
from torchvision.transforms import Compose as C

def aug(p=0.5):
    return C([transforms.RandomHorizontalFlip(p=p)])

class Dataloader(object):
    def __init__(self, train, csv, transform=None):
        ...

    def __getitem__(self, index):
        ...
        img = self.transform(img)
        return img, target

    def __len__(self):
        return len(self.image_list)

trainset = Dataloader(train=True, csv='/path/to/file/', transform=aug())
Transforms in MxNet
MxNet also has a built-in augmentation library called Transforms (mxnet.gluon.data.vision.transforms). It is pretty similar to the PyTorch Transforms library, so there is pretty much nothing to add. Check the Transforms section above if you want to find out more on this topic. General usage is as follows.
Those are nice examples, but from my experience, the real power of data augmentation comes out when you are using custom libraries:
They have a wider set of transformation methods
They allow you to create custom augmentation
You can stack one transformation with another
That is why using custom DA libraries might be more effective than using built-in ones.
Image data augmentation libraries
In this section, we will talk about the following libraries:
Augmentor
Albumentations
Imgaug
AutoAugment (DeepAugment)
We will look at the installation, augmentation functions, parallelization of the augmentation process, and custom augmentations, and provide a simple example for each library. Remember that we will focus on image augmentation, as it is the most commonly used.
Before we start, I have a few general notes about using custom augmentation libraries with different DL frameworks.
In general, all libraries can be used with all frameworks if you perform augmentation before training the model.
The point is that some libraries have pre-existing synergy with a specific framework, for example, Albumentations and PyTorch. It's more convenient to use such pairs. Still, if you need specific functionality, or you like one library more than another, you should either perform DA before starting to train a model, or write a custom Dataloader and training process instead.
The second major topic is using custom augmentations with different augmentation libraries. For example, you may want to use your own CV2 image transformation alongside a specific augmentation from the Albumentations library.
Let's make this clear: you can do that with any library, but it might be more complicated than you think. Some libraries have a guide in their official documentation on how to do it, but others do not.
If there is no guide, you basically have two ways:
Apply augmentations separately, for example, use your transformation operation and then the pipeline.
Check Github repositories in case someone has already figured out how to correctly integrate a custom augmentation into the pipeline.
Ok, with that out of the way, let’s dive in.
Augmentor
Moving on to the libraries, Augmentor is a Python package that aims to be both a data augmentation tool and a library of basic image pre-processing functions.
It is pretty easy to install Augmentor via pip:
pip install Augmentor
If you want to build the package from source, please check the official documentation.
In general, Augmentor consists of a number of classes for standard image transformation functions, such as Crop, Rotate, Flip, and many more.
Augmentor allows the user to pick a probability parameter for every transformation operation. This parameter controls how often the operation is applied. Thus, Augmentor allows forming an augmenting pipeline that chains together a number of operations that are applied stochastically.
This means that each time an image is passed through the pipeline, a completely different image is returned. Depending on the number of operations in the pipeline and the probability parameter, a very large amount of new image data can be created. Basically, that is data augmentation at its best.
What can we do with images using Augmentor? Augmentor is more focused on geometric transformations, though it has other augmentations too. The main features of the Augmentor package are:
Perspective skewing – look at an image from a different angle
Elastic distortions – add distortions to an image
Rotating – simply, rotate an image
Shearing – tilt an image along one of its sides
Cropping – crop an image
Mirroring – apply different types of flips
Augmentor is a well-knit library. You can use it with various DL frameworks (TF, Keras, PyTorch, MxNet), because augmentations may be applied even before you set up a model.
Moreover, Augmentor allows you to add custom augmentations. It might be a little tricky, as it requires writing a new operation class, but you can do that.
Unfortunately, Augmentor is neither extremely fast nor flexible function-wise. There are libraries that have more transformation functions available and can perform DA way faster and more effectively. That is why Augmentor is probably the least popular DA library.
Sample usage of Augmentor
Let's check the simple usage of Augmentor:
Import it
Create an empty augmenting pipeline
Add some operations to it
Use the sample method to get the augmented images
Please pay attention: when using sample, you need to specify the number of augmented images you want to get.
Albumentations
Albumentations is a computer vision tool designed to perform fast and flexible image augmentations. It appears to have the largest set of transformation functions of all image augmentation libraries.
Let's install Albumentations via pip. If you want to do it another way, check the official documentation.
pip install albumentations
Albumentations provides a single, simple interface to work with different computer vision tasks such as classification, segmentation, object detection, pose estimation, and many more. The library is optimized for maximum speed and performance and has plenty of different image transformation operations.
If we are talking about data augmentations, there is nothing Albumentations cannot do. To tell the truth, Albumentations is the most stacked library, as it does not focus on one specific area of image transformations. You can simply check the official documentation and you will find the operation you need.
Moreover, Albumentations has seamless integration with deep learning frameworks such as PyTorch and Keras. The library is part of the PyTorch ecosystem, but you can use it with TensorFlow as well. Thus, Albumentations is the most commonly used image augmentation library.
On the other hand, Albumentations is not integrated with MxNet, which means that if you are using MxNet as your DL framework, you should write a custom Dataloader or use another augmentation library.
It's worth mentioning that Albumentations is an open-source library. You can easily check the original code if you want to.
Sample usage of Albumentations
Let's see how to augment an image using Albumentations. You need to define the pipeline using the Compose method (or you can use a single augmentation), pass an image to it, and get the augmented one.
Now, after reading about Augmentor and Albumentations, you might think all image augmentation libraries are pretty similar to one another.
That is right. In many cases, the functionality of each library is interchangeable. Nevertheless, each one has its own key features.
ImgAug
ImgAug is also a library for image augmentations. It is pretty similar to Augmentor and Albumentations function-wise, but the main feature stated in the official ImgAug documentation is the ability to execute augmentations on multiple CPU cores. If you want to do that, you might want to check the following guide.
As you may see, this is pretty different from Augmentor's focus on geometric transformations, or Albumentations' attempt to cover all possible augmentations.
Nevertheless, ImgAug's key feature seems a bit odd, as both Augmentor and Albumentations can be executed on multiple CPU cores as well. Anyway, ImgAug supports a wide range of augmentation techniques, just like Albumentations, and implements sophisticated augmentations with fine-grained control.
Like other image augmentation libraries, ImgAug is easy to use. To define an augmenting pipeline, use the Sequential method and then simply stack different transformation operations, as in other libraries.
from imgaug import augmenters as iaa

seq = iaa.Sequential([
    iaa.Crop(px=(0, 16)),
    iaa.Fliplr(0.5),
    iaa.GaussianBlur(sigma=(0, 3.0))])

for batch_idx in range(1000):
    images = load_batch(batch_idx)
    images_aug = seq(images=images)
AutoAugment
On the other hand, AutoAugment is something more interesting. As you might know, using Machine Learning to improve ML design choices has already reached the space of DA.
In 2018, Google presented the AutoAugment algorithm, which is designed to search for the best augmentation policies. AutoAugment helped to improve state-of-the-art model performance on datasets such as CIFAR-10, CIFAR-100, and ImageNet.
Still, AutoAugment is tricky to use, as it does not provide the controller module, which prevents users from running it on their own datasets. That is why using AutoAugment is relevant only if it already has the augmentation strategies for the dataset we plan to train on and the task we are up to.
Thereby, let us take a closer look at DeepAugment, which is a bit faster and more flexible alternative to AutoAugment. DeepAugment has no strong connection to AutoAugment besides the general idea, and was developed by a group of enthusiasts. You can install it via pip:
pip install deepaugment
It's important for us to know how to use DeepAugment to get the best augmentation strategies for our images. You may do it as follows, or check out the official Github repository.
Please keep in mind that when you use the optimize method, you should specify the number of samples that will be used to find the best augmentation strategies.
Overall, both AutoAugment and DeepAugment are not commonly used. Still, it might be quite useful to run them if you have no idea which augmentation techniques will be best for your data. Just keep in mind that it will take plenty of time, because multiple models will be trained.
It's worth mentioning that we have not covered all custom image augmentation libraries, but we have covered the major ones. Now you know which libraries are the most popular, what advantages and disadvantages they have, and how to use them. This knowledge will help you find any additional information if you need it.
Speed comparison of image data augmentation libraries
As you may have already figured out, the augmentation process is quite expensive time- and computation-wise.
The time needed to perform DA depends on the number of data points we need to transform, on the complexity of the overall augmenting pipeline, and even on the hardware that you use to augment your data.
Let's run some experiments to find out the fastest augmentation library. We will perform these experiments for Augmentor, Albumentations, ImgAug, and Transforms. We will use an image dataset from Kaggle that is made for flower recognition and contains over four thousand images.
For our first experiment, we will create an augmenting pipeline that consists of only two operations: a Horizontal Flip with 0.4 probability and a Vertical Flip with 0.8 probability. Let's apply the pipeline to every image in the dataset and measure the time.
Library          Time (seconds)
Augmentor        31.9
Albumentations   10.9
ImgAug           12.04
Transforms       9.8
As we anticipated, Augmentor performs way slower than the other libraries. Still, both Albumentations and Transforms show a good result, as they are optimized to perform fast augmentations.
For our second experiment, we will create a more complex pipeline with various transformations to see if Transforms and Albumentations stay at the top. We will stack more geometric transformations into the pipeline. Thus, we will be able to use all libraries, as Augmentor, for example, doesn't have many kernel filter operations.
You may find the full pipeline in the notebook that I've prepared for you. Please feel free to experiment and play with it.
Library          Time (seconds)
Augmentor        28.2
Albumentations   17.7
ImgAug           30.9
Transforms       15.2
Once more, Transforms and Albumentations are at the top.
Moreover, I used neptune.ai to compare the CPU usage. If we check the CPU-usage graphs in the Neptune app, we will find that both Albumentations and Transforms use less than 60% of CPU resources.
CPU usage
On the other hand, Augmentor and ImgAug use more than 80%.
CPU usage
As you may have noticed, both Albumentations and Transforms are really fast. That is why they are commonly used in real life.
May be useful
When you track your ML experiments with neptune.ai, the system metrics are logged automatically by default. This includes hardware consumption (CPU, GPU, and memory) as well as console logs (stdout, stderr). See what else you can track and display in the Neptune app.
Best practices, tips, and tricks
It's worth mentioning that despite DA being a powerful tool, you should use it carefully. There are some general rules that you might want to follow when applying augmentations:
Choose proper augmentations for your task. Let's imagine that you are trying to detect a face in an image. You choose Random Erasing as an augmentation technique, and suddenly your model does not perform well even on the training data. That is because there is no face in the image, as it was randomly erased by the augmentation technique. The same goes for voice detection when applying noise injection to the recording as an augmentation. Keep these cases in mind and be logical when choosing DA techniques.
Do not use too many augmentations in one sequence. You may simply create a totally new observation that has nothing in common with your original training (or testing) data.
Display augmented data (images and text) in a notebook, and listen to converted audio samples, before starting to train on them. It's quite easy to make a mistake when forming an augmenting pipeline. That is why it's always better to double-check the result.
Also, it's good practice to check Kaggle notebooks before creating your own augmenting pipeline. There are plenty of ideas you may find there. Try to find a notebook for a similar task and check whether the author applied the same augmentations as you've planned.
Final thoughts
In this article, we have figured out what data augmentation is, what DA techniques exist, and what libraries you can use to apply them.
To my knowledge, the best publicly available library is Albumentations. That is why if you are working with images and do not use MxNet or TensorFlow as your DL framework, you should probably use Albumentations for DA.
Hopefully, with this information, you will have no problems setting up DA for your next machine learning project.