I want to create a DataFrame using json_normalize, but I get a KeyError.

When I try to read the JSON file into a DataFrame as shown below, I get a KeyError.
Why?

The original JSON file is as follows:

[
    {
        "creat_at": "2020-04-26T 12:55:58 + 0900",
        "pay_id": "E86F0CD0B346",
        "pay": {
            "a": -1.32,
            "b": [
                0,
                0
            ],
            "c": "xxxx",
            "d": "0709",
            "e": "sssss",
            "f": 290,
            "g" : -55,
            "h": -23.3
        },
        "timestamp": "2020-03-16T09:18:39.878Z",
        "updated_at": "2020-04-26T 12:55:58+0900"
    },
    {
        "creat_at": "2020-04-26T 12:55:58 + 0900",
        "pay_id": "E86F0CD0B346",
        "pay": {
            "a": -1.32,
            "b": [
                0,
                0
            ],
            "c": "xxxx",
            "d": "0809",
            "e": "sssss",
            "f": 290,
            "g" : -55,
            "h": -23.3
        },
        "timestamp": "2020-03-16T09:18:39.878Z",
        "updated_at": "2020-04-26T 12:55:58+0900"
    },
・・・
・・・

import pandas as pd

json_file = "./export.json"
df0 = pd.read_json(json_file)
df0["pay"].iloc[1]

Results

{'a': 1.4,
 'b': [29, 0],
 'c': 'xxxx',
 'd': '00070',
 'e': 'sssss',
 'f': 236,
 'g': -95,
 'h': 21.7}

I am trying to turn the above data into a DataFrame with the following code.

from pandas.io.json import json_normalize
df_items=json_normalize(df0.to_dict("records"), "pay", "pay_id")
df_items.sort_values("item_id")

The result I would like to get is shown below, but a KeyError occurs instead. Why?

|item_id|a|b|c|d|e|f|g|h|
|---    |---|--- |----|--- |--- |---|---|--- |
| 1 | 1.4 | 29, 0 | xxxx | 0070 | ssss | 236 | -95 | 21.7 |

python python3 pandas

2022-09-30 21:49

1 Answer

Neither the code nor the data you posted contains a key named item_id, and nothing in the code creates it later.

For example, the DataFrame produced by the json_normalize() call in the question has the column names ['0', 'pay_id']. item_id is not among them, so df_items.sort_values("item_id") raises a KeyError.

Try printing the result of each step to see what data it actually produces, for example print(df0.to_dict("records")) or print(df_items).
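As a rough illustration of that kind of check, here is a minimal sketch that inspects the column names before sorting. The inline `records` list is a hypothetical stand-in for the contents of ./export.json (only a few fields, with illustrative values), and `sort_key` and `has_key` are names made up for this example.

```python
import pandas as pd

# Hypothetical sample standing in for ./export.json (illustrative values only)
records = [
    {"pay_id": "E86F0CD0B346", "pay": {"a": 1.32, "b": [0, 0], "c": "xxxx"}},
    {"pay_id": "E86F0CD0B346", "pay": {"a": 1.4, "b": [29, 0], "c": "xxxx"}},
]

df0 = pd.DataFrame(records)
df_items = pd.json_normalize(df0["pay"].tolist())

# Look at what actually exists before sorting: sort_values raises KeyError
# for a name that is neither a column nor an index level name.
print(list(df_items.columns))

sort_key = "item_id"
has_key = sort_key in df_items.columns or sort_key in df_items.index.names
print(has_key)
```

If the check prints False, the key has to be created first (for example by naming the index) before it can be used for sorting.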

The simplest approach is to use the list's index number as item_id, as it is.

import pandas as pd
from pandas import json_normalize

json_file = "./export.json"
df0 = pd.read_json(json_file)
paylist = df0["pay"].tolist()  # list of the 'pay' dictionaries

df_items = json_normalize(paylist)  # flatten each 'pay' dictionary into one row
df_items.index.name = "item_id"  # name the DataFrame index 'item_id'
df_items = df_items.sort_values("item_id")  # sort by the index name and keep the result

print(df_items)
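If pay_id should also stay next to the flattened fields, one possible variant is to normalize the whole records and turn the index into a real item_id column. This is only a sketch under assumptions: the inline `records` list is a hypothetical stand-in for ./export.json with a few illustrative fields, and `flat` and `result` are names made up here.

```python
import pandas as pd

# Hypothetical records standing in for the contents of ./export.json
records = [
    {"pay_id": "E86F0CD0B346", "pay": {"a": 1.32, "b": [0, 0], "d": "0709"}},
    {"pay_id": "E86F0CD0B346", "pay": {"a": 1.4, "b": [29, 0], "d": "0809"}},
]

# Flatten each record; keys nested under "pay" become "pay.a", "pay.b", ...
flat = pd.json_normalize(records)

# Strip the "pay." prefix so the column names match the desired table
flat.columns = [c.split(".", 1)[-1] for c in flat.columns]

# Turn the positional index into an explicit item_id column, then sort by it
flat.index.name = "item_id"
result = flat.reset_index().sort_values("item_id")

print(result)
```

Here item_id starts at 0 because it is just the position of each record in the list.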



		
		
			

				

					
				

				
2022-09-30 21:49

			