I want Python to read many csv files (1000 files), extract the rows that satisfy a condition, and write them out to a new csv file.
file1:
id, time, value
1, 3.5, 6
2, 2.0, 4
3, 2.6, 8
...
30, 15.5, 50
If I had only one file, the following script would do what I want, but how should I change it to handle 1000 files?
import pandas as pd
df = pd.read_csv("list1.csv")
df = df[df["time"] < 0.5]
df.to_csv("list1_0.5h.csv")
I apologize for the rudimentary question, but I would appreciate any guidance.
Thank you for your cooperation.
You can use pd.concat to combine DataFrames that share the same columns.
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.concat.html
Assuming the file names follow a uniform pattern and are named list1.csv, list2.csv, ..., list100.csv, you can build each file name from its number. This can be done with a list comprehension.
import pandas as pd

df = pd.concat(
    [pd.read_csv("list{}.csv".format(i + 1)) for i in range(100)])
df = df[df["time"] < 0.5]
df.to_csv("list1to100_0.5h.csv")
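If the files really run to 1000 as in the question, the same concat pattern extends by changing the range; a minimal sketch that also skips any missing files (the function name `combine_filtered` and its parameters are my own, not from the answer above):

```python
import os
import pandas as pd

def combine_filtered(pattern="list{}.csv", first=1, last=1000,
                     out="list1to1000_0.5h.csv"):
    """Concatenate list<first>.csv..list<last>.csv and keep rows with time < 0.5."""
    frames = [pd.read_csv(pattern.format(i))
              for i in range(first, last + 1)
              if os.path.exists(pattern.format(i))]  # skip missing files
    if not frames:
        return None  # nothing to combine
    df = pd.concat(frames, ignore_index=True)  # renumber rows 0..N-1
    df = df[df["time"] < 0.5]
    df.to_csv(out, index=False)
    return df
```

Calling `combine_filtered()` with the defaults would process list1.csv through list1000.csv in the current directory.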
import os
import pandas as pd

# Destination folders (input and output can be specified separately; both are the current folder here)
infolder = './'
outfolder = './'

# How the target file names are assembled (fixed prefix + number)
fprefix = 'list'        # string at the beginning of the file name
fsuffixFirst = 1        # first number in the file name
fsuffixMaxPlus1 = 1001  # last number in the file name, plus 1
fsuffixStep = 1         # step between the numbers in the file names

# Loop over the 1000 files
for fsuffix in range(fsuffixFirst, fsuffixMaxPlus1, fsuffixStep):
    basefname = fprefix + str(fsuffix)         # assemble the base file name
    inputfile = infolder + basefname + '.csv'  # build the path
    if os.path.exists(inputfile):              # process only files that exist
        df = pd.read_csv(inputfile)
        df = df[df["time"] < 0.5]
        df.to_csv(outfolder + basefname + '_0.5h.csv', index=False)
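The same skip-missing-files loop can also be written with pathlib, which handles path joining and name manipulation for you. This is a sketch under the same assumptions (file names list1.csv..list1000.csv); the helper `process_one` is my own naming, not from the answer above:

```python
from pathlib import Path

import pandas as pd

infolder = Path('.')
outfolder = Path('.')

def process_one(path, outdir):
    """Filter one csv to rows with time < 0.5 and write <name>_0.5h.csv."""
    df = pd.read_csv(path)
    df = df[df["time"] < 0.5]
    out = outdir / (path.stem + "_0.5h.csv")  # path.stem drops the ".csv"
    df.to_csv(out, index=False)
    return out

for i in range(1, 1001):
    p = infolder / f"list{i}.csv"
    if p.exists():  # process only files that exist
        process_one(p, outfolder)
```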
As others have pointed out, if the goal is to create one output file per input file (as in your example), why not simply use glob?
Suppose the 1000 csv files, all shaped like "list1.csv", are in the same directory.
Then I think you can do the following.
import pandas as pd
import glob

FNs = glob.glob("list*.csv")
for fn in FNs:
    df = pd.read_csv(fn)
    df = df[df["time"] < 0.5]
    out_fn = fn.split(".")
    df.to_csv(out_fn[0] + "_0.5h." + out_fn[1])
glob finds the files matching the pattern and returns them as a list.
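One caveat with splitting on "." is that it breaks if a file name contains extra dots; `os.path.splitext` splits off only the final extension. A sketch of the same glob approach with that change (the helper name `filter_file` is my own):

```python
import glob
import os

import pandas as pd

def filter_file(fn, suffix="_0.5h"):
    """Read one csv, keep rows with time < 0.5, write <root>_0.5h.csv alongside it."""
    df = pd.read_csv(fn)
    df = df[df["time"] < 0.5]
    root, ext = os.path.splitext(fn)  # splits off only the last ".csv"
    out = root + suffix + ext
    df.to_csv(out, index=False)
    return out

for fn in glob.glob("list*.csv"):
    filter_file(fn)
```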
© 2024 OneMinuteCode. All rights reserved.