close

How to append rows in a pandas dataframe in a for loop?

Hello Guys, How are you all? Hope You all Are Fine. Today We Are Going To learn about How to append rows in a pandas dataframe in a for loop in Python. So Here I am Explain to you all the possible Methods here.

Without wasting your time, Let’s start This Article.

Table of Contents

How to append rows in a pandas dataframe in a for loop?

  1. How to append rows in a pandas dataframe in a for loop?

    In other words, do not form a new DataFrame for each row. Instead, collect all the data in a list of dicts, and then call df = pd.DataFrame(data) once at the end, outside the loop.

  2. append rows in a pandas dataframe in a for loop

    In other words, do not form a new DataFrame for each row. Instead, collect all the data in a list of dicts, and then call df = pd.DataFrame(data) once at the end, outside the loop.

Method 1

Suppose your data looks like this:

import pandas as pd
import numpy as np

np.random.seed(2015)
df = pd.DataFrame([])
for i in range(5):
    data = dict(zip(np.random.choice(10, replace=False, size=5),
                    np.random.randint(10, size=5)))
    data = pd.DataFrame(data.items())
    data = data.transpose()
    data.columns = data.iloc[0]
    data = data.drop(data.index[[0]])
    df = df.append(data)
print('{}\n'.format(df))
# 0   0   1   2   3   4   5   6   7   8   9
# 1   6 NaN NaN   8   5 NaN NaN   7   0 NaN
# 1 NaN   9   6 NaN   2 NaN   1 NaN NaN   2
# 1 NaN   2   2   1   2 NaN   1 NaN NaN NaN
# 1   6 NaN   6 NaN   4   4   0 NaN NaN NaN
# 1 NaN   9 NaN   9 NaN   7   1   9 NaN NaN

Then it could be replaced with

np.random.seed(2015)
data = []
for i in range(5):
    data.append(dict(zip(np.random.choice(10, replace=False, size=5),
                         np.random.randint(10, size=5))))
df = pd.DataFrame(data)
print(df)

In other words, do not form a new DataFrame for each row. Instead, collect all the data in a list of dicts, and then call df = pd.DataFrame(data) once at the end, outside the loop.

Each call to df.append requires allocating space for a new DataFrame with one extra row, copying all the data from the original DataFrame into the new DataFrame, and then copying data into the new row. All that allocation and copying makes calling df.append in a loop very inefficient. The time cost of copying grows quadratically with the number of rows. Not only is the call-DataFrame-once code easier to write, its performance will be much better — the time cost of copying grows linearly with the number of rows.

Method 2

A more compact and efficient way would be perhaps:

cols = ['frame', 'count']
N = 4
dat = pd.DataFrame(columns = cols)
for i in range(N):

    dat = dat.append({'frame': str(i), 'count':i},ignore_index=True)

output would be:

>>> dat
   frame count
0     0     0
1     1     1
2     2     2
3     3     3

Conclusion

It’s all About this issue. Hope all Methods helped you a lot. Comment below Your thoughts and your queries. Also, Comment below which Method worked for you? Thank You.

Also, Read