How do I change the dtype from object to int64?

Asked 1 year ago, Updated 1 year ago, 443 views

import numpy as np
import pandas as pd
df_maize=pd.read_csv("PSD online data maize.csv")
print(df_maize["Name"])
print(df_maize["Production"])
print(df_maize["Exports"])

# Looking at the printed results, the columns whose values have 4 digits or fewer are int64, while the column whose values have 5 digits or more is object.
0    2013
1    2014
2    2015
3    2016
4    2017
5    2018
6    2019
7    2020
8    2021
9    2022
Name: Name, dtype: int64
0    248,453
1    249,764
2    264,992
3    263,613
4    259,071
5    257,174
6    260,779
7    260,670
8    272,552
9    277,200
Name: Production, dtype: object
0    22
1    13
2     4
3    61
4    19
5    19
6    12
7     4
8     3
9    20
Name: Exports, dtype: int64

python pandas numpy

2023-02-06 00:11

2 Answers

Remove the "," characters from the strings, then convert the column to int64.

...
>>> df_maize.loc[:, "Production"].str.replace(",", "").astype("int64")
0    248453
1    249764
2    264992
3    263613
4    259071
5    257174
6    260779
7    260670
8    272552
9    277200
Name: Production, dtype: int64
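
As a minimal, self-contained sketch of the same idea: the snippet below assumes the CSV looks like the first three rows of the sample data shown elsewhere on this page and reads it from an in-memory string (csv_text is a hypothetical stand-in for the real file). It also assigns the converted values back to the column, since str.replace and astype return a new Series rather than modifying the DataFrame in place.

```python
import io

import pandas as pd

# Hypothetical stand-in for "PSD online data maize.csv" (first three rows only).
csv_text = '''Name,Production,Exports
2013,"248,453",22
2014,"249,764",13
2015,"264,992",4
'''

df_maize = pd.read_csv(io.StringIO(csv_text))

# "Production" is read as strings because of the embedded commas.
# Remove the commas literally (regex=False), convert to int64,
# and store the result back in the DataFrame.
df_maize["Production"] = (
    df_maize["Production"].str.replace(",", "", regex=False).astype("int64")
)

print(df_maize.dtypes)
```

If some rows might contain values that are not numbers at all, astype will raise an error; in that case pd.to_numeric with errors="coerce" may be a gentler option.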


2023-02-06 03:20

If you show the contents of the CSV file you are using, it will be easier to get advice and answers.
Extracting only the relevant columns, I assume the data looks something like this:

Name,Production,Exports
2013,"248,453",22
2014,"249,764",13
2015,"264,992",4
2016,"263,613",61
2017,"259,071",19
2018,"257,174",19
2019,"260,779",12
2020,"260,670",4
2021,"272,552",3
2022,"277,200",20

pandas.read_csv has a thousands parameter, which handles this when the file is loaded.

thousands : str, optional
Thousands separator.

[pandas] Read_csv Usage Summary

How to specify and change numbers and strings

argument     default value     meaning
thousands    None              Lets you specify the digit delimiter. For example, ','.

So this line:

df_maize=pd.read_csv("PSD online data maize.csv")

Why not change it to this?

df_maize=pd.read_csv("PSD online data maize.csv", thousands=',')
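
To check this without the original file, here is a small sketch that reads the first three rows of the sample data above from an in-memory string (csv_text is a hypothetical stand-in for the real CSV) and prints the resulting dtypes.

```python
import io

import pandas as pd

# Hypothetical stand-in for "PSD online data maize.csv" (first three rows only).
csv_text = '''Name,Production,Exports
2013,"248,453",22
2014,"249,764",13
2015,"264,992",4
'''

# thousands="," tells the parser to treat "," inside numeric fields
# as a digit-group separator, so "248,453" is parsed as 248453.
df_maize = pd.read_csv(io.StringIO(csv_text), thousands=",")

print(df_maize.dtypes)
print(df_maize["Production"])
```

Keep in mind that the thousands setting applies to every numeric-looking column in the file, so it is best suited to files where "," is used consistently as a digit separator.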


2023-02-06 06:47



© 2024 OneMinuteCode. All rights reserved.