
[Solved] Pandas unstack: ValueError: Index contains duplicate entries, cannot reshape

Hello guys, how are you all? Hope you are all fine. Today I ran into the following error when calling unstack in pandas: ValueError: Index contains duplicate entries, cannot reshape. Here I explain the possible solutions.

Without wasting your time, let’s start this article and solve the error.

How Does the Pandas unstack ValueError: Index contains duplicate entries, cannot reshape Occur?

The error is raised when unstack is called on a DataFrame whose index contains the same combination of labels more than once.

How To Solve the Pandas unstack ValueError: Index contains duplicate entries, cannot reshape?

  1. How To Solve the Pandas unstack ValueError: Index contains duplicate entries, cannot reshape?

    You can avoid this error by retaining the default index column (row #): when setting the index using “id”, “date” and “location”, add them in “append” mode instead of the default overwrite mode.

Solution 1


Here’s an example DataFrame that shows this: rows 0 and 1 have the same values in columns 0, 1 and 2, so they become duplicate entries once those columns are set as the index. The question is, do you want to aggregate these values or keep them as multiple rows?

In [11]: df
Out[11]:
   0  1  2      3
0  1  2  a  16.86
1  1  2  a  17.18
2  1  4  a  17.03
3  2  5  b  17.28

In [12]: df.pivot_table(values=3, index=[0, 1], columns=2, aggfunc='mean')  # desired?
Out[12]:
2        a      b
0 1
1 2  17.02    NaN
  4  17.03    NaN
2 5    NaN  17.28

In [13]: df1 = df.set_index([0, 1, 2])

In [14]: df1
Out[14]:
           3
0 1 2
1 2 a  16.86
    a  17.18
  4 a  17.03
2 5 b  17.28

In [15]: df1.unstack(2)
ValueError: Index contains duplicate entries, cannot reshape
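
Before choosing between aggregating and keeping the rows, it can help to see which index entries are actually duplicated. The following is a minimal sketch that rebuilds the example DataFrame shown above and lists the offending rows; the variable names dupes and has_duplicates are only illustrative.

```python
import pandas as pd

# Rebuild the example DataFrame from the session above
df = pd.DataFrame({
    0: [1, 1, 1, 2],
    1: [2, 2, 4, 5],
    2: ["a", "a", "a", "b"],
    3: [16.86, 17.18, 17.03, 17.28],
})
df1 = df.set_index([0, 1, 2])

# True when at least one index entry occurs more than once
has_duplicates = not df1.index.is_unique

# keep=False marks every member of a duplicated group, not only the repeats
dupes = df1[df1.index.duplicated(keep=False)]
print(dupes)
```

If dupes is empty, unstack should not raise this error for that index.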

One solution is to reset_index (and get back to df) and use pivot_table.

In [16]: df1.reset_index().pivot_table(values=3, index=[0, 1], columns=2, aggfunc='mean')
Out[16]:
2        a      b
0 1
1 2  17.02    NaN
  4  17.03    NaN
2 5    NaN  17.28

Another option (if you don’t want to aggregate) is to append a dummy level, unstack it, then drop the dummy level…
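
The dummy-level idea is not spelled out in the text above, so the following is only one possible reading of it, sketched under that assumption: number the repeated rows with groupby(...).cumcount(), add that counter to the index as a level called "dummy" (a name chosen here for illustration), unstack the original level, and then drop the counter again.

```python
import pandas as pd

# Rebuild the example DataFrame from Solution 1
df = pd.DataFrame({
    0: [1, 1, 1, 2],
    1: [2, 2, 4, 5],
    2: ["a", "a", "a", "b"],
    3: [16.86, 17.18, 17.03, 17.28],
})

# Counter that numbers repeated (0, 1, 2) combinations: 0, 1, ... per group.
# "dummy" is a hypothetical column name, not something pandas requires.
df["dummy"] = df.groupby([0, 1, 2]).cumcount()

# With the counter in the index, every index entry is unique again
s = df.set_index([0, 1, "dummy", 2])[3]

# level=-1 is the last index level, i.e. the original column 2
wide = s.unstack(level=-1)

# Drop the helper level; the duplicated rows stay as separate rows
wide = wide.droplevel("dummy")
print(wide)
```

Unlike pivot_table, this keeps both rows for the (1, 2) entry instead of averaging them.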

Solution 2


There’s a far simpler way to tackle this.

The reason you get ValueError: Index contains duplicate entries, cannot reshape is that the same combination of “id”, “date” and “location” appears more than once, so after you unstack “location” several values would have to share a single cell.

You can avoid this by retaining the default index column (row #): when setting the index using “id”, “date” and “location”, add them in “append” mode instead of the default overwrite mode.

So use,

e.set_index(['id', 'date', 'location'], append=True)

Once this is done, your index columns will still have the default index along with the set indexes. And unstack will work.
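
As a rough end-to-end sketch of this approach: the DataFrame below is made-up sample data (the column "value" and all of its contents are assumptions, since the original e is not shown), and the first call only demonstrates that the error appears without append=True.

```python
import pandas as pd

# Hypothetical sample data; only the column names id, date and location
# come from the text above
e = pd.DataFrame({
    "id": [1, 1, 2],
    "date": ["2020-01-01", "2020-01-01", "2020-01-02"],
    "location": ["A", "A", "B"],
    "value": [10.0, 11.0, 12.0],
})

# Without append=True the first two rows collide on (1, "2020-01-01", "A")
failed = False
try:
    e.set_index(["id", "date", "location"]).unstack("location")
except ValueError as err:
    failed = True
    print(err)

# With append=True the row number stays in the index, so entries are unique
wide = e.set_index(["id", "date", "location"], append=True).unstack("location")
print(wide)
```

The row number remains as the first index level of the result; it can be removed afterwards with wide.droplevel(0) if it is not needed.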

Let me know how it works out.

Summary

That’s all about this issue. I hope these solutions helped you. Comment below with your thoughts and questions, and let me know which solution worked for you. Thank you.
