import json
import pandas as pd

with open("test.json", 'r', encoding="UTF8") as file:
    lines = file.readlines()
json_data = [json.loads(ln.replace("\n", ""))['data'] for ln in lines]
df = pd.DataFrame(json_data)
The structure of the df above is shown in the following picture.
To count how many rows each value of the itemid column appears in, I ran the following on this data frame:
item_count=df['itemid'].value_counts(normalize=False)
When I printed the result, it came out as shown in the picture below.
I want to create a data table that keeps the srv_id, itemid, clock, and value information, but with one row per itemid (no duplicate itemid values), holding the min, max, sum, and avg (average) of value together with num (the number of rows for that itemid).
srv_id, itemid, clock, min_value, max_value, sum_value, avg_value, num
I'd like to create a new data table with 8 columns.
python python3.7 pandas numpy dataframe
You can do it with groupby.
df.groupby('itemid')
This divides the df into groups based on the 'itemid' column. For each group, you then have to decide how to produce a representative value for each column. Considering only the aggregations that are clearly defined in the question, you can do the following.
df.groupby('itemid').agg(
    max_value=('value', 'max'),
    min_value=('value', 'min'),
    sum_value=('value', 'sum'),
    avg_value=('value', 'mean'))
ref : https://stackoverflow.com/a/57338929/100093
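The aggregation above covers only the four statistics of value. If you also want the num, srv_id, and clock columns from the question, one possible sketch is below. The question does not say which srv_id or clock should represent each itemid, so taking the first one in each group is purely an assumption here, and the sample data is made up for illustration. Named aggregation like this needs pandas 0.25 or later.

```python
import pandas as pd

# Hypothetical sample data; the real df is built from test.json.
df = pd.DataFrame({
    "srv_id": [1, 1, 1, 2, 2],
    "itemid": [100, 100, 100, 200, 200],
    "clock": [10, 20, 30, 10, 20],
    "value": [1.0, 3.0, 5.0, 2.0, 4.0],
})

result = (
    df.groupby("itemid")
    .agg(
        srv_id=("srv_id", "first"),    # assumption: keep the first srv_id per itemid
        clock=("clock", "first"),      # assumption: keep the first clock per itemid
        min_value=("value", "min"),
        max_value=("value", "max"),
        sum_value=("value", "sum"),
        avg_value=("value", "mean"),
        num=("value", "count"),        # number of non-null value rows per itemid
    )
    .reset_index()                     # turn itemid back into a regular column
)

# Put the columns in the order requested in the question.
result = result[["srv_id", "itemid", "clock", "min_value",
                 "max_value", "sum_value", "avg_value", "num"]]
print(result)
```

If another representative makes more sense for your data (for example the latest clock), swap 'first' for 'last' or 'max' on that column.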