Result I want to get:
I'd like the output to contain only the product ID and the number, with the numbers summed for each product ID.
22 6
24 10
31 3
raw csv data
clientID  product ID  product information  number
324       24          Clothes              4
531       22          Refrigerator         3
432       24          Clothes              3
433       24          Clothes              3
434       31          Refrigerator         3
435       22          Refrigerator         3
If it were me...
Let's use AWS.
Upload the CSV file to S3.
Set up Aurora (a MySQL-compatible DBMS) using RDS.
Create an AWS Lambda function in Python or Node.js that reads the CSV file stored in S3 and loads it into the Aurora DB.
Once the data is stored in an RDBMS, you can use SQL to query it however you want.
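The AWS steps above could look roughly like the sketch below. It is only an outline under several assumptions that are not in the original answer: the function is triggered by an S3 upload event, the third-party `pymysql` package is bundled with the function, the connection settings live in hypothetical environment variables (`DB_HOST`, `DB_USER`, `DB_PASSWORD`, `DB_NAME`), and a hypothetical table `orders` with four columns already exists in Aurora.

```python
import csv
import io
import os
import urllib.parse


def parse_rows(csv_text):
    """Turn CSV text (with a header row) into (client, product, info, number) tuples."""
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    next(reader)  # skip the header row
    return [(int(r[0]), int(r[1]), r[2], int(r[3])) for r in reader if r]


def lambda_handler(event, context):
    # Imported here so parse_rows can be tried locally without these packages.
    import boto3    # AWS SDK for Python
    import pymysql  # assumption: third-party MySQL client packaged with the function

    # Bucket and key come from the S3 event that triggered the function.
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    key = urllib.parse.unquote_plus(record["object"]["key"])  # keys arrive URL-encoded

    body = boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"].read()
    rows = parse_rows(body.decode("utf-8"))

    # Hypothetical environment variables holding the Aurora connection settings.
    conn = pymysql.connect(
        host=os.environ["DB_HOST"],
        user=os.environ["DB_USER"],
        password=os.environ["DB_PASSWORD"],
        database=os.environ["DB_NAME"],
    )
    try:
        with conn.cursor() as cur:
            # "orders" and its column names are hypothetical.
            cur.executemany(
                "INSERT INTO orders (client_id, product_id, product_info, number) "
                "VALUES (%s, %s, %s, %s)",
                rows,
            )
        conn.commit()
    finally:
        conn.close()
    return {"inserted": len(rows)}
```

With the rows loaded, a query such as `SELECT product_id, SUM(number) FROM orders GROUP BY product_id` should give the per-product totals.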
If you don't use AWS, I would use the SQLite engine built into Python.
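As a minimal sketch of the SQLite route, the following uses only the standard library (`sqlite3`, `csv`). The in-memory database, the table name `orders`, and its column names are assumptions made for illustration; the sample rows are the ones from the question, embedded as a string so the snippet runs on its own.

```python
import csv
import io
import sqlite3

# Sample rows from the question; for a real file, use open("data.csv", newline="").
CSV_TEXT = """clientID,product ID,product information,number
324,24,Clothes,4
531,22,Refrigerator,3
432,24,Clothes,3
433,24,Clothes,3
434,31,Refrigerator,3
435,22,Refrigerator,3
"""

conn = sqlite3.connect(":memory:")  # temporary in-memory database
conn.execute(
    "CREATE TABLE orders ("
    "client_id INTEGER, product_id INTEGER, product_info TEXT, number INTEGER)"
)

reader = csv.reader(io.StringIO(CSV_TEXT))
next(reader)  # skip the header row
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", reader)

# Sum the number column for each product ID.
totals = conn.execute(
    "SELECT product_id, SUM(number) FROM orders "
    "GROUP BY product_id ORDER BY product_id"
).fetchall()

for product_id, total in totals:
    print(product_id, total)

conn.close()
```

This prints one line per product ID with its summed number, matching the desired result in the question.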
If it's a one-time job, pandas is also convenient.
See the example below.
import pandas as pd

# Contents of data.csv
'''
clientID,product ID,product information,number
324,24,Clothes,4
531,22,Refrigerator,3
432,24,Clothes,3
433,24,Clothes,3
434,31,Refrigerator,3
435,22,Refrigerator,3
'''

# skipinitialspace=True tolerates a space after each comma in the file
df = pd.read_csv("data.csv", skipinitialspace=True)

# Column names must match the CSV header exactly (including case)
print(df.groupby("product ID")["number"].sum())
'''
product ID
22     6
24    10
31     3
Name: number, dtype: int64
'''