
[Solved] How can I fix “Error tokenizing data” on pandas csv reader?

Hello Guys, how are you all? Hope you all are fine. Today I got the following error in Python: “Error tokenizing data” on the pandas CSV reader. Here I explain all the possible solutions.

Without wasting your time, let’s start this article to solve this error.

How does the “Error tokenizing data” error on the pandas csv reader occur?

This error is raised when pandas.read_csv encounters a row with a different number of fields than it expects — usually because of an unexpected delimiter in the data or a first row that does not match the rest of the file.

How can I fix the “Error tokenizing data” error?

To solve it, try specifying the sep and/or header arguments when calling read_csv, or skip the offending lines. Both approaches are shown below.

Solution 1

You could also try skipping the bad lines:

data = pd.read_csv('file1.csv', error_bad_lines=False)

Do note that this will cause the offending lines to be skipped rather than parsed. Also note that error_bad_lines was deprecated in pandas 1.3 and removed in pandas 2.0; the replacement is on_bad_lines='skip'.
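As a sketch of the same idea with the newer on_bad_lines argument (pandas 1.3 and later), using an in-memory file purely for illustration:

```python
import io

import pandas as pd

# A small CSV where the third line has an extra field, which would
# normally raise "Error tokenizing data" (ParserError).
raw = "a,b\n1,2\n3,4,5\n6,7\n"

# on_bad_lines='skip' drops the malformed line instead of raising.
df = pd.read_csv(io.StringIO(raw), on_bad_lines="skip")
print(df)
```

Here the row `3,4,5` is silently dropped, leaving a two-row DataFrame.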

Solution 2

It might be an issue with

  • the delimiters in your data
  • the first row, as @TomAugspurger noted

To solve it, try specifying the sep and/or header arguments when calling read_csv. For instance,

df = pandas.read_csv(filepath, sep='delimiter', header=None)

Here 'delimiter' is a placeholder for your actual separator, e.g. ';' or '\t'. In the code above, sep defines your delimiter and header=None tells pandas that your source data has no row for headers / column titles. As the docs put it: “If file contains no header row, then you should explicitly pass header=None”. In this instance, pandas automatically creates whole-number column labels {0, 1, 2, …}.
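A minimal sketch of these two arguments together, using made-up in-memory data (the semicolon delimiter and the values are illustrative assumptions):

```python
import io

import pandas as pd

# Semicolon-delimited data with no header row.
raw = "1;alice;30\n2;bob;25\n"

# sep=';' names the real delimiter; header=None marks the first row as
# data, so pandas numbers the columns 0, 1, 2 automatically.
df = pd.read_csv(io.StringIO(raw), sep=";", header=None)
print(df)
```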

According to the docs, the delimiter should not be an issue: “if sep is None [not specified], will try to automatically determine this.” I, however, have not had good luck with this, including instances with obvious delimiters.
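For reference, the automatic detection the docs describe requires the slower Python parsing engine; a sketch with assumed semicolon-delimited sample data:

```python
import io

import pandas as pd

# sep=None asks pandas to sniff the delimiter itself; this only works
# with engine='python' (the C engine cannot sniff).
raw = "a;b\n1;2\n3;4\n"
df = pd.read_csv(io.StringIO(raw), sep=None, engine="python")
```

Your mileage may vary on messier files, which is why the explicit-sep approach above is usually safer.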

Another solution may be to auto-detect the delimiter with csv.Sniffer. This fragment assumes csv_file is an already-open file object and that the csv module has been imported:

# use the first 2 lines of the file to detect the separator
# (readline keeps the trailing newline, so no extra '\n' is needed)
temp_lines = csv_file.readline() + csv_file.readline()
dialect = csv.Sniffer().sniff(temp_lines, delimiters=';,')

# remember to go back to the start of the file for the next time it's read
csv_file.seek(0)

df = pd.read_csv(csv_file, sep=dialect.delimiter)
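Put together as a self-contained example, with an in-memory file standing in for a file on disk (the data and delimiter here are assumptions for illustration):

```python
import csv
import io

import pandas as pd

# Stand-in for an on-disk file whose delimiter is unknown in advance.
csv_file = io.StringIO("a;b;c\n1;2;3\n4;5;6\n")

# Sniff the delimiter from the first two lines, restricted to ';' or ','.
dialect = csv.Sniffer().sniff(csv_file.readline() + csv_file.readline(),
                              delimiters=";,")
csv_file.seek(0)  # rewind so read_csv sees the whole file

df = pd.read_csv(csv_file, sep=dialect.delimiter)
```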

Summary

That’s all about this issue. I hope one of the solutions helped you. Comment below with your thoughts and queries, and let me know which solution worked for you. Thank you.
