Data cleaning using Pandas is not going well.

Asked 2 years ago, Updated 2 years ago, 36 views

After loading a CSV file, I want to use pandas to remove the commas (,) from the values in a specific column, but it doesn't work.
I ran the following two snippets:

# First attempt: remove the commas with str.replace
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.read_csv('csv/gaku-mg1712Ver2.csv', encoding='shift_jis')
df["Gross Domestic Product"] = df["Gross Domestic Product"].str.replace(",", "")

# Second attempt: convert the column directly with to_numeric
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.read_csv('csv/gaku-mg1712Ver2.csv', encoding='shift_jis')
df["Gross Domestic Product"] = pd.to_numeric(df["Gross Domestic Product"])  # convert to numbers

The first snippet ran without errors, but when I checked the data in the editor it looked unchanged.
The second snippet raised the following error:

ValueError: Unable to parse string "120,801.2"
ValueError: Unable to parse string "120,801.2" at position 0

[Image: the CSV loaded as a DataFrame]

python python3 pandas

2022-09-29 20:25

1 Answer

You should be able to remove the digit separator at load time by passing thousands=',' to read_csv.

import pandas as pd
url='http://www.esri.cao.go.jp/jp/sna/data/data_list/sokuhou/files/2017/qe173_2/__icsFiles/afieldfile/2017/12/07/gaku-mk1732.csv'
df=pd.read_csv(url, encoding='shift_jis', skiprows=7, header=None, thousands=',')
print(df)
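
If the URL is not reachable, the same idea can be tried on a small in-memory sample. This is only a sketch: the column name and the value "120,801.2" come from the question, while the "Period" column and the second row are made up for illustration. It also shows the str.replace route for a column that has already been loaded as strings, assuming the pattern is a bare comma with no trailing space.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the real CSV. Only "120,801.2"
# comes from the question; everything else is invented for this sketch.
csv_text = (
    "Period,Gross Domestic Product\n"
    'Q1,"120,801.2"\n'
    'Q2,"122,345.6"\n'
)

# Option 1: let read_csv strip the digit separator while parsing.
df = pd.read_csv(io.StringIO(csv_text), thousands=",")
print(df["Gross Domestic Product"].dtype)  # float64

# Option 2: the column is already loaded as strings, so remove the commas
# first (bare "," as the pattern, no trailing space) and then convert.
raw = pd.read_csv(io.StringIO(csv_text))
cleaned = pd.to_numeric(
    raw["Gross Domestic Product"].str.replace(",", "", regex=False)
)
print(cleaned.dtype)  # float64
```

Either way, inspect the result with df.dtypes or df.head() in Python. Neither approach rewrites the CSV file on disk, so the file will still show the commas when opened in an editor.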


2022-09-29 20:25
