# Convolutions in Fastai

Chapter 13 of Fastbook dives into the details of convolutional neural networks and how the convolution operation that they're built on works.

The paper *A guide to convolution arithmetic for deep learning* has excellent low-level details of how different types of convolutions work.

In this post, I explore how these different types of convolution operations can be applied with fastai.

## Convolution operations

The behavior of a convolutional layer depends on the following properties:

- kernel size
- stride
- padding

In addition to these, we'll also look at two more properties: `transpose` and `dilation`.

Let us create a `5x5` tensor with random values to use as an input for the convolutional layers we will create.

```
from fastai.vision.all import *
t = torch.rand(5, 5).unsqueeze(0).unsqueeze(0)
t.shape
```

`torch.Size([1, 1, 5, 5])`

Convolution layers expect the first dimension to be the batch size and the second to be the number of channels. We use the `unsqueeze` function to add a dimension of `1` for each of these.
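As a quick sketch, indexing with `None` is an equivalent way to add the batch and channel dimensions in a single step:

```python
import torch

# Indexing with None adds a dimension of size 1, just like unsqueeze:
# (5, 5) -> (1, 1, 5, 5)
t = torch.rand(5, 5)[None, None]
print(t.shape)  # torch.Size([1, 1, 5, 5])
```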

### No padding and strides of size 1

*Image from https://github.com/vdumoulin/conv_arithmetic*

`conv1 = ConvLayer(1, 3, ks=3, stride=1, padding=0)`

- The first parameter specifies the **number of channels in the input**.
- The second parameter specifies **how many filters** we want to create in this layer. This will be equal to the number of channels in the output, since each channel is created by one filter.
- The `ks` parameter specifies the **size of the filters** we want: `3x3`. We need to provide just one integer because the filters are square. The default value of this parameter is 3.
- The `stride` parameter specifies the **size of the stride**. The default value of this parameter is 1.
- The `padding` parameter specifies **how much padding** we apply around the input.

```
res = conv1(t)
res.shape
```

`torch.Size([1, 3, 3, 3])`

The batch size in the output remains the same as that in the input.

The number of channels has increased to 3 since we created a convolutional layer with 3 filters.

With no padding and strides of size 1, the spatial dimensions of the output are `input size - kernel size + 1`, which in our case is `5 - 3 + 1 = 3`.
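We can check the same arithmetic with the underlying PyTorch layer directly (a sketch; fastai's `ConvLayer` builds on `nn.Conv2d`):

```python
import torch
import torch.nn as nn

# 3 filters of size 3x3, stride 1, no padding, on a 5x5 input:
# output spatial size is 5 - 3 + 1 = 3
conv = nn.Conv2d(1, 3, kernel_size=3, stride=1, padding=0)
out = conv(torch.rand(1, 1, 5, 5))
print(out.shape)  # torch.Size([1, 3, 3, 3])
```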

### Zero padding and strides of size 1

*Image from https://github.com/vdumoulin/conv_arithmetic*

`conv2 = ConvLayer(1, 3)`

We don't specify the stride parameter since its default value is 1.

We haven't specified the `ks` parameter, so the default value of 3 is used.

```
res = conv2(t)
res.shape
```

`torch.Size([1, 3, 5, 5])`

**By default, fastai automatically applies an appropriate amount of padding** to ensure that the input and output dimensions are equal, as long as we are not using transposed convolutions (more about those in a bit).

This kind of padding is also commonly known as **half padding** or **same padding**.
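For an odd kernel size `k`, same padding corresponds to zero padding of size `k // 2`. A sketch with plain PyTorch:

```python
import torch
import torch.nn as nn

ks = 3
# padding = ks // 2 keeps the spatial dimensions unchanged for odd kernels
conv = nn.Conv2d(1, 3, kernel_size=ks, padding=ks // 2)
out = conv(torch.rand(1, 1, 5, 5))
print(out.shape)  # torch.Size([1, 3, 5, 5])
```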

There is another type of padding, called **full padding**, which allows us to increase the dimensions of the output. Full padding can be achieved by using regular zero padding of size `k - 1` (where `k` is the kernel size).

*Image from https://github.com/vdumoulin/conv_arithmetic*

```
conv3 = ConvLayer(1, 3, padding=(2,2))
res = conv3(t)
res.shape
```

`torch.Size([1, 3, 7, 7])`

With full padding, the dimensions of the output are `input size + kernel size - 1`, which in this case is `5 + 3 - 1 = 7`.
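The same full-padding result can be reproduced with plain PyTorch by passing `padding=k - 1`:

```python
import torch
import torch.nn as nn

k = 3
# full padding: zero padding of size k - 1 on each side,
# so the output size is 5 + 2*(k - 1) - k + 1 = 5 + k - 1 = 7
conv = nn.Conv2d(1, 3, kernel_size=k, padding=k - 1)
out = conv(torch.rand(1, 1, 5, 5))
print(out.shape)  # torch.Size([1, 3, 7, 7])
```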

### Strided convolutions

By specifying a value greater than 1 for the `stride` parameter, we can perform strided convolutions.

Strided convolutions are useful for decreasing the dimensions of the output.

*Image from https://github.com/vdumoulin/conv_arithmetic*

```
conv4 = ConvLayer(1, 3, stride=2)
res = conv4(t)
res.shape
```

`torch.Size([1, 3, 3, 3])`

We can also have strided convolutions with no padding applied to the input.

*Image from https://github.com/vdumoulin/conv_arithmetic*

```
conv5 = ConvLayer(1, 3, stride=2, padding=0)
res = conv5(t)
res.shape
```

`torch.Size([1, 3, 2, 2])`
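All of the cases so far follow one general formula: along each spatial dimension, the output size is `(n + 2p - k) // s + 1` for input size `n`, kernel size `k`, stride `s`, and padding `p`. A sketch, with a hypothetical helper `conv_out_size` to check it:

```python
import torch
import torch.nn as nn

def conv_out_size(n, k, s, p):
    # hypothetical helper: output size along one spatial dimension
    return (n + 2 * p - k) // s + 1

# stride 2, no padding, 3x3 kernel on a 5x5 input: (5 - 3) // 2 + 1 = 2
conv = nn.Conv2d(1, 3, kernel_size=3, stride=2, padding=0)
out = conv(torch.rand(1, 1, 5, 5))
print(out.shape[-1], conv_out_size(5, 3, 2, 0))  # 2 2
```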

## Transposed Convolutions

Also known as: fractionally strided convolutions, deconvolutions.

Transposed convolutions allow us to increase the dimensions of the output compared to the input.

*Image from https://github.com/vdumoulin/conv_arithmetic*

`conv6 = ConvLayer(1, 3, transpose=True)`

We can use transposed convolutions in fastai by setting `transpose=True`.

```
res = conv6(t)
res.shape
```

`torch.Size([1, 3, 7, 7])`

We can also use a stride bigger than 1. To visualize this operation, imagine adding zero padding between all values in the input.

*Image from https://github.com/vdumoulin/conv_arithmetic*

```
conv7 = ConvLayer(1, 3, transpose=True, stride=2)
res = conv7(t)
res.shape
```

`torch.Size([1, 3, 11, 11])`
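For transposed convolutions the arithmetic runs in reverse: with no padding, the output size is `(n - 1) * s + k`. A sketch with PyTorch's `nn.ConvTranspose2d` (which, as far as I can tell, is what `transpose=True` uses under the hood):

```python
import torch
import torch.nn as nn

# stride 2: zeros are conceptually inserted between input values,
# giving an output of size (5 - 1) * 2 + 3 = 11
conv = nn.ConvTranspose2d(1, 3, kernel_size=3, stride=2, padding=0)
out = conv(torch.rand(1, 1, 5, 5))
print(out.shape)  # torch.Size([1, 3, 11, 11])
```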

## Dilated Convolutions

Regular convolution operations work on elements of the input that are next to each other. Dilated convolutions spread the kernel out, skipping elements in between.

*Image from https://github.com/vdumoulin/conv_arithmetic*

`conv8 = ConvLayer(1, 3, dilation=2)`

The number of elements to skip in dilated convolutions is controlled by the `dilation` parameter.

A value of `dilation=1` corresponds to a normal convolution.

```
res = conv8(t)
res.shape
```

`torch.Size([1, 3, 3, 3])`
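A dilation of `d` effectively enlarges a `k x k` kernel to cover `k + (k - 1) * (d - 1)` input elements per dimension. A sketch with plain PyTorch, assuming the same half padding of `ks // 2 = 1` that the fastai layer applies:

```python
import torch
import torch.nn as nn

# dilation 2 spreads a 3x3 kernel over a 5x5 receptive field:
# effective kernel size = 3 + (3 - 1) * (2 - 1) = 5,
# so the output size is (5 + 2*1 - 5) + 1 = 3
conv = nn.Conv2d(1, 3, kernel_size=3, dilation=2, padding=1)
out = conv(torch.rand(1, 1, 5, 5))
print(out.shape)  # torch.Size([1, 3, 3, 3])
```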