Read a CSV file in Python and get the maximum and variance with numpy

import numpy as np
import matplotlib.pyplot as plt
import csv

data_file_name = "One year temperature in Seoul.csv"

# One list of temperatures per month (index 0 = January)
temp_month = [[] for _ in range(12)]
with open(data_file_name, newline='') as f:  # closes the file when done
    for row in csv.reader(f):
        if row[4] != '':  # skip rows with no temperature value
            m = int(row[0].split('-')[1]) - 1
            temp_month[m].append(float(row[4]))

I got this far, but I don't know what to do next. Please let me know.

python csv numpy

2022-09-20 22:18

1 Answer

I don't know what format the data in your CSV file is in, so I made up an example. Yours is probably similar.

Is there a reason you have to use numpy instead of pandas? Tabular data such as CSV is much more convenient to manipulate with pandas than with numpy.

import pandas as pd

"""
Assume that the Seoul_temperature.csv file looks like this:

    date   | celsius
---------------
2020-01-01 | -17
2020-01-02 | -15
.
.
.
2020-05-23 | 14.5
2020-05-24 | 15.3
"""

# Step 1. Read CSV
df = pd.read_csv("./Seoul_temperature.csv")  # TODO: replace with the path to your CSV file

# Step 2. Convert the date strings to datetime objects
df['date'] = pd.to_datetime(df['date'], format='%Y-%m-%d')

# Step 3. To find the highest monthly temperature, create a new "year-month" tag
df['ym'] = df['date'].dt.strftime("%Y-%m")

print("=== Daily Data ===")
print(df)

# Step 4. Find the variance of the daily temperatures
# (pandas computes the sample variance, ddof=1, by default)
variance = df['celsius'].var()
print("=== Daily data variance ===")
print(variance)

# Step 5. Create a new monthly_df with the highest temperature per month and print it
monthly_df = df.groupby('ym').agg({'celsius': 'max'})
print("=== Monthly maximum temperature ===")
print(monthly_df)  # The highest temperature for each month
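
If you do have to use numpy, here is a rough sketch along the lines of the code in the question. It assumes, as the question's code does, that column 0 holds a date like 2021-01-01 and column 4 holds the temperature. The function name monthly_stats, the column names in the sample and the sample values themselves are all made up for illustration, so adjust them to your file.

```python
import csv
import io

import numpy as np


def monthly_stats(csv_file, date_col=0, temp_col=4, ddof=1):
    """Return {month: (max, variance)} for every month that has data.

    ddof=1 gives the sample variance (what pandas' .var() reports);
    ddof=0 is numpy's own default (population variance).
    """
    temps_by_month = [[] for _ in range(12)]
    for row in csv.reader(csv_file):
        # Skip short rows and rows with no temperature value
        if len(row) <= max(date_col, temp_col) or row[temp_col] == '':
            continue
        try:
            month = int(row[date_col].split('-')[1])
            temp = float(row[temp_col])
        except (IndexError, ValueError):
            continue  # header lines or otherwise malformed rows
        temps_by_month[month - 1].append(temp)

    stats = {}
    for i, temps in enumerate(temps_by_month):
        if temps:
            arr = np.array(temps)
            # Note: with ddof=1 a month with a single value yields nan
            stats[i + 1] = (arr.max(), arr.var(ddof=ddof))
    return stats


# Made-up sample data standing in for the real file
sample = io.StringIO(
    "date,station,avg,min,max\n"
    "2021-01-01,108,-4.2,-9.8,1.6\n"
    "2021-01-02,108,-5.0,-8.4,-1.4\n"
    "2021-02-01,108,2.0,-1.0,6.0\n"
    "2021-02-02,108,3.0,,\n"
    "2021-02-03,108,4.0,0.5,10.0\n"
)

stats = monthly_stats(sample)
for month, (highest, variance) in stats.items():
    print(f"month {month}: max={highest}, variance={variance}")
```

For the real file, something like `with open(data_file_name, newline='') as f: stats = monthly_stats(f)` should work. You may also need to pass an encoding to open(), depending on how the file was saved.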

2022-09-20 22:18
