How can I group the rows of a Python Pandas data frame by a condition and calculate specific values for each group?

import json
import pandas as pd

# Each line of test.json is one JSON object; keep only its 'data' field.
with open("test.json", 'r', encoding="UTF8") as file:
    json_data = [json.loads(ln)['data'] for ln in file]
df = pd.DataFrame(json_data)

The structure of the df above is shown in the following picture.

To count how many rows of this data frame contain each value of the itemid column, I used

item_count = df['itemid'].value_counts(normalize=False)

Printing item_count gave the result shown in the picture below.

I want to create a data table with one row per itemid (no duplicated itemid values). Each row should hold srv_id, itemid, and clock, plus the min, max, sum, and avg (average) of value and num, the number of rows for that itemid.

srv_id, itemid, clock, min_value, max_value, sum_value, avg_value, num

I'd like to create a new data table with these 8 columns.

python python3.7 pandas numpy dataframe

2022-09-20 21:55

1 Answer

You can do it with groupby.

df.groupby('itemid')

This splits df into groups based on the 'itemid' column. The remaining question is how to reduce each column to a single representative value per group. Using only the aggregations that are clearly defined in the question, you can do the following.

df.groupby('itemid').agg(
    max_value=('value', 'max'),
    min_value=('value', 'min'),
    sum_value=('value', 'sum'),
    avg_value=('value', 'mean'))

ref : https://stackoverflow.com/a/57338929/100093
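
The aggregation above covers only the four statistics of value. If you also need srv_id, clock, and num, something like the following sketch might work. It rests on assumptions the question does not confirm: the sample data is made up, each itemid is assumed to belong to a single srv_id (so 'first' is safe), and the latest clock is kept as the representative one. Adjust those choices to whatever your data actually means.

```python
import pandas as pd

# Hypothetical sample data; the real df is built from test.json.
df = pd.DataFrame({
    "srv_id": [1, 1, 1, 2, 2],
    "itemid": [100, 100, 100, 200, 200],
    "clock": [10, 20, 30, 15, 25],
    "value": [1.0, 3.0, 2.0, 10.0, 20.0],
})

result = df.groupby("itemid", as_index=False).agg(
    srv_id=("srv_id", "first"),    # assumption: one srv_id per itemid
    clock=("clock", "max"),        # assumption: keep the latest clock
    min_value=("value", "min"),
    max_value=("value", "max"),
    sum_value=("value", "sum"),
    avg_value=("value", "mean"),
    num=("value", "size"),         # number of rows per itemid
)

# Put the columns in the order requested in the question.
result = result[["srv_id", "itemid", "clock", "min_value",
                 "max_value", "sum_value", "avg_value", "num"]]
print(result)
```

Here num should match what df['itemid'].value_counts() reports for each itemid, since 'size' counts every row in the group.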

2022-09-20 21:55
