Hello everyone, I hope you are all doing well. Today we are going to learn **how to use `collate_fn` with DataLoaders** **in Python**, and I will explain the approach below.

Without wasting your time, let's start this article.


## How to use ‘collate_fn’ with dataloaders?

Basically, `collate_fn` receives a list of tuples if your `__getitem__` function from a Dataset subclass returns a tuple, or just a normal list if your Dataset subclass returns only one element.

## Method 1

Basically, `collate_fn` receives a list of tuples if your `__getitem__` function from a Dataset subclass returns a tuple, or just a normal list if your Dataset subclass returns only one element. Its main objective is to create your batch without you spending much time implementing it manually. Try to see it as glue that specifies the way examples stick together in a batch. If you don't provide one, PyTorch simply puts `batch_size` examples together, roughly as you would using `torch.stack` (not exactly that, but it is simple like that).
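To see what that default behavior looks like, here is a minimal sketch (the dataset and its shapes are invented for illustration) showing that, with no `collate_fn`, the batch a DataLoader yields matches what you would get by stacking the individual samples with `torch.stack`:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# A toy dataset of five equal-sized 3-element tensors (purely illustrative).
data = torch.arange(15, dtype=torch.float32).reshape(5, 3)
dataset = TensorDataset(data)

# With no collate_fn, the default collation stacks the samples,
# roughly equivalent to calling torch.stack on the individual tensors.
loader = DataLoader(dataset, batch_size=5)
(batch,) = next(iter(loader))

manual = torch.stack([dataset[i][0] for i in range(5)])
print(torch.equal(batch, manual))  # → True
```

This only works because every sample has the same shape; as soon as the shapes differ, the default stacking fails, which is exactly the situation the custom `collate_fn` below handles.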

Suppose, for example, that you want to create batches from a list of tensors of varying dimensions. The code below pads sequences with 0 up to the maximum sequence size of the batch. That is why we need the `collate_fn`: a standard batching algorithm (simply using `torch.stack`) won't work in this case, and we need to manually pad the different variable-length sequences to the same size before creating the batch.

```python
import torch

def collate_fn(data):
    """
    data: a list of tuples (example, label, length), where 'example'
    is a tensor of arbitrary shape and label/length are scalars
    """
    _, labels, lengths = zip(*data)
    max_len = max(lengths)
    n_ftrs = data[0][0].size(1)
    features = torch.zeros((len(data), max_len, n_ftrs))
    labels = torch.tensor(labels)
    lengths = torch.tensor(lengths)

    for i in range(len(data)):
        j, k = data[i][0].size(0), data[i][0].size(1)
        # Pad each example with zeros up to the batch's maximum length.
        features[i] = torch.cat([data[i][0], torch.zeros((max_len - j, k))])

    return features.float(), labels.long(), lengths.long()
```

The function above is passed to the `collate_fn` parameter of the `DataLoader`, as in this example:

```python
from torch.utils.data import DataLoader

DataLoader(toy_dataset, collate_fn=collate_fn, batch_size=5)
```
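To make this concrete, here is a runnable end-to-end sketch. `ToyDataset`, its sequence lengths, and the feature size are all invented for illustration; the `collate_fn` is the one defined above:

```python
import torch
from torch.utils.data import Dataset, DataLoader

def collate_fn(data):
    # Same padding collate_fn as defined above.
    _, labels, lengths = zip(*data)
    max_len = max(lengths)
    n_ftrs = data[0][0].size(1)
    features = torch.zeros((len(data), max_len, n_ftrs))
    labels = torch.tensor(labels)
    lengths = torch.tensor(lengths)
    for i in range(len(data)):
        j, k = data[i][0].size(0), data[i][0].size(1)
        features[i] = torch.cat([data[i][0], torch.zeros((max_len - j, k))])
    return features.float(), labels.long(), lengths.long()

class ToyDataset(Dataset):
    """Hypothetical dataset: variable-length sequences of 4-feature vectors."""
    def __init__(self):
        self.seqs = [torch.randn(n, 4) for n in (2, 5, 3, 4, 1)]
        self.labels = [0, 1, 0, 1, 0]

    def __len__(self):
        return len(self.seqs)

    def __getitem__(self, idx):
        seq = self.seqs[idx]
        # Returns the (example, label, length) tuple collate_fn expects.
        return seq, self.labels[idx], seq.size(0)

loader = DataLoader(ToyDataset(), collate_fn=collate_fn, batch_size=5)
features, labels, lengths = next(iter(loader))
print(features.shape)    # padded to the longest sequence: torch.Size([5, 5, 4])
print(lengths.tolist())  # original lengths: [2, 5, 3, 4, 1]
```

Every sequence comes out padded to the length of the longest one in the batch, and the `lengths` tensor remembers how long each sequence originally was.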

With this `collate_fn` function, you will always get a tensor in which all your examples have the same size. So, when you feed this data to your `forward()` function, you need to use the lengths to recover the original data, so that those meaningless padding zeros don't enter your computation.
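One common way to use those lengths in a recurrent model is `torch.nn.utils.rnn.pack_padded_sequence`, which tells the RNN to skip the padding. The sketch below assumes a padded batch shaped like the output of the collate function above; the LSTM sizes are arbitrary:

```python
import torch
from torch import nn
from torch.nn.utils.rnn import pack_padded_sequence

# Hypothetical padded batch: 5 sequences padded to length 5, 4 features each,
# with their original lengths (as a collate_fn like the one above returns).
features = torch.randn(5, 5, 4)
lengths = torch.tensor([2, 5, 3, 4, 1])

rnn = nn.LSTM(input_size=4, hidden_size=8, batch_first=True)

# Packing uses the lengths so the LSTM never processes the padding zeros.
packed = pack_padded_sequence(features, lengths, batch_first=True,
                              enforce_sorted=False)
output, (h_n, c_n) = rnn(packed)
print(h_n.shape)  # final hidden state per sequence: torch.Size([1, 5, 8])
```

Packing is just one option; for non-recurrent models you can instead build a boolean mask from the lengths and zero out the padded positions in your loss.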

**Summary**

That's all about this issue. I hope this method helped you. Comment below with your thoughts and queries, and let me know whether it worked for you. Thank you.
