Convolutions in Fastai
Chapter 13 of Fastbook dives into the details of convolutional neural networks and how the convolution operation that they're built on works.
A guide to convolution arithmetic for deep learning has excellent low-level details of how different types of convolutions work.
In this post, I explore how these different types of convolution operations can be applied with fastai.
Convolution operations
The behavior of a convolutional layer depends on the following properties:
- kernel size
- stride
- padding
In addition to these, we'll also look at two more properties: transpose and dilation.
Let us create a 5x5 tensor with random values to use as an input for the convolutional layers we will create.
from fastai.vision.all import *
t = torch.rand(5, 5).unsqueeze(0).unsqueeze(0)
t.shape
torch.Size([1, 1, 5, 5])
Convolution layers expect the first dimension to be the batch size, and the second to be the number of channels. We use the unsqueeze function to add a dimension of size 1 for each.
No padding and strides of size 1

Image from https://github.com/vdumoulin/conv_arithmetic
conv1 = ConvLayer(1, 3, ks=3, stride=1, padding=0)
- The first parameter specifies the number of channels in the input.
- The second parameter specifies how many filters we want to create in this layer. This will be equal to the number of channels in the output, since each channel is created by one filter.
- The ks parameter specifies the size of the filters we want: 3x3 here. We need to provide just one integer because filters are always square. The default value of this parameter is 3.
- The stride parameter specifies the size of the stride. The default value of this parameter is 1.
- The padding parameter specifies how much padding we apply around the input.
res = conv1(t)
res.shape
torch.Size([1, 3, 3, 3])
The batch size in the output remains the same as that in the input.
The number of channels has increased to 3 since we created a convolutional layer with 3 filters.
With no padding and strides of size 1, the dimensions of the output are input size - kernel size + 1, which in our case is 5 - 3 + 1 = 3.
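This formula can be sketched as a tiny helper function (out_size_no_pad is just an illustrative name, not part of fastai):

```python
def out_size_no_pad(i, k):
    # Output size per spatial dimension for a convolution
    # with no padding and a stride of 1.
    return i - k + 1

# Our 5x5 input with a 3x3 kernel gives a 3x3 output.
print(out_size_no_pad(5, 3))  # 3
```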
conv2 = ConvLayer(1, 3)
We don't specify the stride parameter since its default value is 1.
We haven't specified the ks parameter so the default value of 3 is used.
res = conv2(t)
res.shape
torch.Size([1, 3, 5, 5])
fastai automatically applies an appropriate amount of padding by default, as long as we are not using transposed convolutions (more about those in a bit), to ensure that our input and output dimensions are equal.
This kind of padding is also commonly known as half padding or same padding.
There is another type of padding called full padding which allows us to increase the dimensions of the output. Full padding can be achieved by using regular zero padding of size k-1 (where k is the kernel size).

Image from https://github.com/vdumoulin/conv_arithmetic
conv3 = ConvLayer(1, 3, padding=(2,2))
res = conv3(t)
res.shape
torch.Size([1, 3, 7, 7])
With full padding, the dimensions of the output are input size + kernel size - 1, which in this case is 5 + 3 - 1 = 7.
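All three padding cases follow one general formula: per spatial dimension, the output size is floor((input + 2 * padding - kernel) / stride) + 1. Here's a minimal sketch (out_size is an illustrative helper name, not a fastai function):

```python
def out_size(i, k, stride=1, padding=0):
    # General output size per spatial dimension for a 2D convolution.
    return (i + 2 * padding - k) // stride + 1

# No padding: 5 - 3 + 1 = 3.
assert out_size(5, 3) == 3
# Half/same padding (p = k // 2 for odd k): output size matches input size.
assert out_size(5, 3, padding=1) == 5
# Full padding (p = k - 1): output grows to i + k - 1.
assert out_size(5, 3, padding=2) == 7
```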
Strided convolutions
By specifying a value for the stride parameter greater than 1, we can perform strided convolutions.
Strided convolutions are useful for decreasing the dimensions of the output.

Image from https://github.com/vdumoulin/conv_arithmetic
conv4 = ConvLayer(1, 3, stride=2)
res = conv4(t)
res.shape
torch.Size([1, 3, 3, 3])
We can also have strided convolutions with no padding applied to the input.

Image from https://github.com/vdumoulin/conv_arithmetic
conv5 = ConvLayer(1, 3, stride=2, padding=0)
res = conv5(t)
res.shape
torch.Size([1, 3, 2, 2])
With a stride of 2 and no padding, the dimensions of the output are floor((5 - 3) / 2) + 1 = 2.
Transposed Convolutions
Also known as: fractionally strided convolutions, deconvolutions.
Transposed convolutions allow us to increase the dimension of the output compared to the input.

Image from https://github.com/vdumoulin/conv_arithmetic
conv6 = ConvLayer(1, 3, transpose=True)
We can use transposed convolutions in fastai by setting transpose=True.
res = conv6(t)
res.shape
torch.Size([1, 3, 7, 7])
We can also use a stride bigger than 1. To visualize this operation, imagine zeros being inserted between the values of the input.
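If I have the arithmetic right, the output size of a transposed convolution (per spatial dimension, with no output padding) is (input - 1) * stride - 2 * padding + kernel. A quick sketch (transpose_out_size is an illustrative name, not a fastai function):

```python
def transpose_out_size(i, k, stride=1, padding=0):
    # Output size per spatial dimension for a transposed convolution,
    # assuming no output_padding.
    return (i - 1) * stride - 2 * padding + k

# Stride 1, no padding: (5 - 1) * 1 + 3 = 7, matching conv6 above.
assert transpose_out_size(5, 3) == 7
# Stride 2, no padding: (5 - 1) * 2 + 3 = 11.
assert transpose_out_size(5, 3, stride=2) == 11
```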

Image from https://github.com/vdumoulin/conv_arithmetic
conv7 = ConvLayer(1, 3, transpose=True, stride=2)
res = conv7(t)
res.shape
torch.Size([1, 3, 11, 11])
Dilated Convolutions
Regular convolution operations work on elements in the input that are next to each other. Dilated convolutions spread the kernel out, skipping over elements in the input between the values the kernel touches.

Image from https://github.com/vdumoulin/conv_arithmetic
conv8 = ConvLayer(1, 3, dilation=2)
The number of elements to skip in dilated convolutions is controlled by the dilation parameter.
A value of dilation=1 corresponds to normal convolutions.
res = conv8(t)
res.shape
torch.Size([1, 3, 3, 3])
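The 3x3 output above can be checked by hand: a dilated kernel of size k spans k + (k - 1) * (dilation - 1) input elements, and that effective size plugs into the usual output formula. A minimal sketch, assuming fastai's default padding of ks // 2 = 1 (dilated_out_size is an illustrative name, not a fastai function):

```python
def dilated_out_size(i, k, dilation=1, stride=1, padding=0):
    # A dilated kernel of size k spans k + (k - 1) * (dilation - 1)
    # input elements; plug that effective size into the usual formula.
    effective_k = k + (k - 1) * (dilation - 1)
    return (i + 2 * padding - effective_k) // stride + 1

# With dilation=2 the 3x3 kernel spans 5 elements, so with padding=1:
assert dilated_out_size(5, 3, dilation=2, padding=1) == 3
# dilation=1 reduces to a normal convolution.
assert dilated_out_size(5, 3, dilation=1, padding=1) == 5
```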