I read r_file.txt one line at a time and write the result to w_file.csv (a CSV file).
The code does what I want, but I would like to shorten and improve it.
Please give me some advice.
r_file.txt↓
Tue Nov 11 00:00:00 JST 2022
12 aaa bbb cc:cc:cc:cc ddd
34 eee fff gg:gg:gg:gg hhh
Tue Nov 11 00:00:05 JST 2022
78 iii jjj kk:kk:kk:kk lll
99 mmm nnn oo:oo:oo:oo ppp
Tue Nov 11 00:00:10 JST 2022
12 qqq rrr ss:ss:ss:ss ttt
34 uuu vvv ww:ww:ww:ww zzz
w_file.csv↓
Tue,Nov,11,00:00:00,12,aaa,bbb,cc:cc:cc:cc,ddd
Tue,Nov,11,00:00:00,34,eee,fff,gg:gg:gg:gg,hhh
Tue,Nov,11,00:00:05,78,iii,jjj,kk:kk:kk:kk,lll
Tue,Nov,11,00:00:05,99,mmm,nnn,oo:oo:oo:oo,ppp
Tue,Nov,11,00:00:10,12,qqq,rrr,ss:ss:ss:ss,ttt
Tue,Nov,11,00:00:10,34,uuu,vvv,ww:ww:ww:ww,zzz
# test.py
import csv

read_file = "r_file.txt"
write_file = "w_file.csv"

with open(write_file, "w", encoding="utf_8", newline="") as w:
    writer = csv.writer(w)
    with open(read_file, "r", encoding="utf_8") as f:
        line = f.readline()
        while line:
            if line.startswith("Tue"):
                # Keep the first four fields, dropping "JST 2022"
                data = line.split()[:4]
            # Check whether the line starts with a digit (a data line)
            if line[0] >= '0' and line[0] <= '9':
                elem = line.split()
                # Prepend the date and time fields to the data fields
                csv_data = data + elem
                writer.writerow(csv_data)
            line = f.readline()
I don't think it is a good idea to hard-code the letters of the day of the week ("Tue"), so I decide based on whether a line starts with a digit or is a blank line.
import csv
import string
import io

text = '''
Tue Nov 11 00:00:00 JST 2022
12 aaa bbb cc:cc:cc:cc ddd
34 eee fff gg:gg:gg:gg hhh
Tue Nov 11 00:00:05 JST 2022
78 iii jjj kk:kk:kk:kk lll
99 mmm nnn oo:oo:oo:oo ppp
Tue Nov 11 00:00:10 JST 2022
12 qqq rrr ss:ss:ss:ss ttt
34 uuu vvv ww:ww:ww:ww zzz
'''

def func(fp):
    for ln in fp:
        if ln[0] in string.digits:
            # Data line: prepend the most recent date/time fields
            yield data + ln.split()
        elif ln[0] != '\n':
            # Timestamp line: keep the first four fields, dropping "JST 2022"
            data = ln.split()[:4]

# with io.StringIO(text) as fp, open('some.csv', 'w', newline='') as csvfile:
# (parenthesized context managers require Python 3.10 or later)
with (io.StringIO(text) as fp,
      io.StringIO() as csvfile):
    writer = csv.writer(csvfile)
    writer.writerows(func(fp))
    res = csvfile.getvalue()

print(res)
# Tue,Nov,11,00:00:00,12,aaa,bbb,cc:cc:cc:cc,ddd
# Tue,Nov,11,00:00:00,34,eee,fff,gg:gg:gg:gg,hhh
# Tue,Nov,11,00:00:05,78,iii,jjj,kk:kk:kk:kk,lll
# Tue,Nov,11,00:00:05,99,mmm,nnn,oo:oo:oo:oo,ppp
# Tue,Nov,11,00:00:10,12,qqq,rrr,ss:ss:ss:ss,ttt
# Tue,Nov,11,00:00:10,34,uuu,vvv,ww:ww:ww:ww,zzz
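The commented-out `with` line above hints at a file-based variant. The following is only a sketch of what that might look like, assuming the input layout shown in r_file.txt. The names `rows` and `convert` and the path parameters are hypothetical and not part of the original answer. Unlike `func`, this version starts with an empty timestamp, so a data line that appears before any timestamp line is written without a prefix instead of raising an error.

```python
import csv
import string


def rows(fp):
    """Yield one CSV row per data line, prefixed with the latest timestamp fields."""
    stamp = []  # first four fields of the most recent timestamp line
    for ln in fp:
        fields = ln.split()
        if not fields:
            continue  # skip blank (or whitespace-only) lines
        if fields[0][0] in string.digits:
            yield stamp + fields  # data line
        else:
            stamp = fields[:4]  # timestamp line: drop the trailing "JST 2022"


def convert(src, dst):
    """Hypothetical helper: stream src (text) into dst (CSV) row by row."""
    with open(src, encoding="utf_8") as fp, \
         open(dst, "w", encoding="utf_8", newline="") as csvfile:
        csv.writer(csvfile).writerows(rows(fp))
```

Calling `convert("r_file.txt", "w_file.csv")` would then replace the nested `with` blocks and the `while` loop in the question.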
About def func(fp)
For example, consider the following code.
If the file has 100 lines, a list of 100 lines (lines) and a list of 100 rows (lst)
are both built in memory before anything is written.
If it has a million lines, that is a million lines and a million-row list.
If it has 10 billion lines...
with open(fname) as fp:
    lines = fp.readlines()
lst = processlines(lines)
with open('some.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerows(lst)
The following code processes the data while reading the file,
so only the list lst becomes huge.
with open(fname) as fp:
    lst = []
    for line in fp:
        lst.append(process(line))
# (rest omitted)
writer.writerows(lst)
With the code at the top of this answer, you don't need a huge list at all.
Each time .writerows() is about to write the next row, control passes into func(),
which hands back the next row with yield,
and .writerows() then writes that value.
The processing proceeds like a bucket brigade.
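To make that hand-off visible, here is a small sketch that logs the order of events. The names `demo`, `gen_rows`, and `LoggingSink` are made up for illustration. It relies on `csv.writer` accepting any object with a `write()` method, and on CPython calling `write()` once per row.

```python
import csv


def demo():
    events = []  # records the order in which things happen

    def gen_rows(n):
        for i in range(n):
            events.append(f"yield {i}")  # the generator produces row i
            yield [i, f"row{i}"]

    class LoggingSink:
        """Hypothetical file-like object that logs writes instead of storing them."""

        def write(self, text):
            # text is one formatted CSV line such as "0,row0\r\n"
            events.append(f"write {text.split(',')[0]}")
            return len(text)

    csv.writer(LoggingSink()).writerows(gen_rows(3))
    return events


print(demo())
# ['yield 0', 'write 0', 'yield 1', 'write 1', 'yield 2', 'write 2']
```

The yields and writes alternate, so only one row exists at a time. If the rows were collected into a list first, all the yields would come before the first write.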
There is also a discussion about how far this kind of question is acceptable, so please refer to it.
(For the time being, I think this one is fine.)