[Solved] UnicodeDecodeError: ‘utf8’ codec can’t decode byte 0xe9 in position 10: invalid continuation byte

Hello everyone. When I try to decode a byte string in Python, I get the following error: UnicodeDecodeError: ‘utf8’ codec can’t decode byte 0xe9 in position 10: invalid continuation byte. In this article I explain the possible solutions.

Let's get started.

How Does the UnicodeDecodeError: ‘utf8’ codec can’t decode byte 0xe9 in position 10: invalid continuation byte Error Occur?

I am trying to decode a byte string. Here is my code.

myString = b"Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been \xe9 the industry's standard dummy text ever since the 1500s"

decodedValue = myString.decode("utf-8")
print(decodedValue)

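This results in the following error. The position in the message is the index of the offending byte in the data, which is 96 in this example.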

Traceback (most recent call last):
  File ".\testfile.py", line 6, in <module>
    decodedValue = myString.decode("utf-8")
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 96: invalid continuation byte
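
The byte 0xe9 (binary 11101001) looks to a UTF-8 decoder like the start of a three-byte sequence, so the decoder expects continuation bytes after it. In the string above the next byte is a plain space, which is why the message says "invalid continuation byte". As a minimal sketch, the exception object itself can be inspected to locate the bad byte; the variable names below are only illustrative.

```python
# Sketch: inspect the UnicodeDecodeError to find the byte that UTF-8 rejected.
my_bytes = (
    b"Lorem Ipsum is simply dummy text of the printing and typesetting industry. "
    b"Lorem Ipsum has been \xe9 the industry's standard dummy text ever since the 1500s"
)

try:
    my_bytes.decode("utf-8")
except UnicodeDecodeError as err:
    bad_start = err.start                           # index of the first undecodable byte
    bad_byte = err.object[err.start:err.start + 1]  # the offending byte itself
    reason = err.reason                             # short explanation from the codec
    print(bad_start, bad_byte, reason)
```

Knowing the exact byte value is often the quickest hint about which encoding the data was really written in.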

How to Solve the UnicodeDecodeError: ‘utf8’ codec can’t decode byte 0xe9 in position 10: invalid continuation byte Error?

Question: How to solve the UnicodeDecodeError: ‘utf8’ codec can’t decode byte 0xe9 in position 10: invalid continuation byte error?
Answer: The bytes are most likely not UTF-8 at all, so decode them with the encoding they were actually written in, which here is probably latin-1. If you need UTF-8 output, decode the bytes as latin-1 first and then encode the resulting string as utf-8.

Solution 1

The string here is almost certainly encoded in Latin-1, not UTF-8. You can see how the two encodings represent the character é (U+00E9) differently: UTF-8 uses two bytes, while Latin-1 uses the single byte 0xe9.

>>> u'\xe9'.encode('utf-8')
b'\xc3\xa9'
>>> u'\xe9'.encode('latin-1')
b'\xe9'
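
Applied to the failing example, this means decoding with latin-1 instead of utf-8. A minimal sketch, assuming the data really is Latin-1; the shortened byte string and variable names are only for illustration.

```python
# Sketch: decode the bytes with the encoding they were (presumably) written in.
my_bytes = b"Lorem Ipsum has been \xe9 the industry's standard"

decoded = my_bytes.decode("latin-1")  # 0xe9 maps to the character U+00E9
print(decoded)
```

Keep in mind that latin-1 maps every possible byte to a character, so this call never raises. A successful decode therefore does not prove that the data was Latin-1; if the text looks wrong, another single-byte encoding such as cp1252 may be the real one.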

Solution 2

If the error appears while reading a file, the solution is to pass encoding='latin-1' to the reader.

import pandas as pd
pd.read_csv('ml-100k/u.item', encoding='latin-1')
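
The same idea works without pandas, because the built-in open() also accepts an encoding argument. The sketch below writes a small Latin-1 file into a temporary directory and reads it back; the file name and its contents are hypothetical and only there to make the example self-contained.

```python
import os
import tempfile

# Hypothetical sample file, written as raw bytes so it contains 0xe9 (Latin-1 "é").
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "sample.txt")
with open(path, "wb") as f:
    f.write(b"caf\xe9|1995\n")

# Reading this file as UTF-8 would raise UnicodeDecodeError;
# naming the actual encoding lets the read succeed.
with open(path, encoding="latin-1") as f:
    line = f.readline().rstrip("\n")

print(line)
```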

Solution 3

Use both encodings: decode the bytes as latin-1, then encode the result as utf-8.

>>> b"This is test \xe9 string".decode('latin-1').encode("utf-8")
b'This is test \xc3\xa9 string'
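
When the input may be either UTF-8 or Latin-1, one common pattern is to try UTF-8 first and fall back to Latin-1 only on failure. The helper below is a hypothetical sketch of that pattern, not a general encoding detector; the function name is made up for this example.

```python
def decode_with_fallback(data: bytes) -> str:
    """Try UTF-8 first; fall back to Latin-1 if the bytes are not valid UTF-8."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 accepts every byte, so this branch never raises.
        return data.decode("latin-1")


utf8_text = decode_with_fallback("caf\xe9".encode("utf-8"))  # valid UTF-8 input
latin1_text = decode_with_fallback(b"caf\xe9")               # Latin-1 input
print(utf8_text, latin1_text)
```

Trying UTF-8 first matters: valid UTF-8 would also decode as Latin-1 without an error, but the result would be garbled text.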

Summary

That is all about this issue. I hope one of these solutions helped you. Comment below with your thoughts and questions, and let me know which solution worked for you. Thank you.
