Request for a review of Python code that creates a CSV file

Asked 1 year ago, Updated 1 year ago, 322 views

I read r_file.txt one line at a time and write the result to w_file.csv (a CSV file).
The code does what I need, but I would like to shorten and improve it. Please give me some advice.

r_file.txt↓

Tue Nov 11 00:00:00 JST 2022
12 aaa bbb cc:cc:cc:cc ddd
34 eee fff gg:gg:gg:gg hhh

Tue Nov 11 00:00:05 JST 2022
78 iii jjj kk:kk:kk:kk lll
99 mmm nnn oo:oo:oo:oo ppp

Tue Nov 11 00:00:10 JST 2022
12 qqq rrr ss:ss:ss:ss ttt
34 uuu vvv ww:ww:ww:ww zzz

w_file.csv↓

Tue,Nov,11,00:00:00,12,aaa,bbb,cc:cc:cc:cc,ddd
Tue,Nov,11,00:00:00,34,eee,fff,gg:gg:gg:gg,hhh
Tue,Nov,11,00:00:05,78,iii,jjj,kk:kk:kk:kk,lll
Tue,Nov,11,00:00:05,99,mmm,nnn,oo:oo:oo:oo,ppp
Tue,Nov,11,00:00:10,12,qqq,rrr,ss:ss:ss:ss,ttt
Tue,Nov,11,00:00:10,34,uuu,vvv,ww:ww:ww:ww,zzz

# test.py

import csv

read_file = "r_file.txt"
write_file = "w_file.csv"


with open(write_file, "w", encoding="utf_8", newline="") as w:
    writer = csv.writer(w)

    with open(read_file, "r", encoding="utf_8") as f:
        line = f.readline()

        while line:
            if line.startswith("Tue"):

                # Keep the first four fields, dropping "JST 2022"
                data = line.split()[:4]

            # Check whether the line starts with a digit
            if line[0] >= '0' and line[0] <= '9':
                elem = line.split()

                # Prepend the date and time fields on the left
                csv_data = data + elem

                writer.writerow(csv_data)

            line = f.readline()

python python3

2022-12-13 22:35

1 Answer

I don't think it is a good idea to hard-code the weekday letters ("Tue"), so instead I decide based on whether a line starts with a digit or is an empty line.

import csv
import string
import io

text = '''\
Tue Nov 11 00:00:00 JST 2022
12 aaa bbb cc:cc:cc:cc ddd
34 eee fff gg:gg:gg:gg hhh

Tue Nov 11 00:00:05 JST 2022
78 iii jjj kk:kk:kk:kk lll
99 mmm nnn oo:oo:oo:oo ppp

Tue Nov 11 00:00:10 JST 2022
12 qqq rrr ss:ss:ss:ss ttt
34 uuu vvv ww:ww:ww:ww zzz
'''

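def func(fp):
    for ln in fp:
        if ln[0] in string.digits:
            # Data line: prepend the most recent date/time fields
            yield data + ln.split()
        elif ln[0] != '\n':
            # Header line: keep the first four fields, dropping "JST 2022"
            data = ln.split()[:4]

# with io.StringIO(text) as fp, open('some.csv', 'w', newline='') as csvfile:
# (parenthesized context managers are officially supported from Python 3.10)
with (io.StringIO(text) as fp,
      io.StringIO() as csvfile):
    writer = csv.writer(csvfile)
    writer.writerows(func(fp))

    res = csvfile.getvalue()
print(res)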
# Tue,Nov,11,00:00:00,12,aaa,bbb,cc:cc:cc:cc,ddd
# Tue,Nov,11,00:00:00,34,eee,fff,gg:gg:gg:gg,hhh
# Tue,Nov,11,00:00:05,78,iii,jjj,kk:kk:kk:kk,lll
# Tue,Nov,11,00:00:05,99,mmm,nnn,oo:oo:oo:oo,ppp
# Tue,Nov,11,00:00:10,12,qqq,rrr,ss:ss:ss:ss,ttt
# Tue,Nov,11,00:00:10,34,uuu,vvv,ww:ww:ww:ww,zzz
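
To run the same idea against the files from the question instead of in-memory buffers, something like the following sketch might work. The names rows and convert are hypothetical and introduced only for illustration; r_file.txt and w_file.csv are the file names from the question. The checks ln[:1].isdigit() and ln.strip() are an assumed variation that also tolerates an empty string or a whitespace-only line.

```python
import csv


def rows(fp):
    """Yield one CSV row per data line, prefixed with the latest header fields."""
    header = []
    for ln in fp:
        if ln[:1].isdigit():
            # Data line: prepend the date fields saved from the header line.
            yield header + ln.split()
        elif ln.strip():
            # Any other non-blank line is treated as a date header;
            # keep the first four fields and drop "JST 2022".
            header = ln.split()[:4]


def convert(src, dst):
    """Read src line by line and write the combined rows to dst as CSV."""
    with open(src, encoding="utf_8") as fp:
        with open(dst, "w", encoding="utf_8", newline="") as csvfile:
            csv.writer(csvfile).writerows(rows(fp))
```

Calling convert("r_file.txt", "w_file.csv") would then replace the whole script from the question, under the assumption that every header line is followed only by its own data lines.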

About def func(fp)

For example, with the following code, a file of 100 lines produces a 100-element lines list and then a 100-element lst, one item per line, before anything is written. With a million lines, that is a million-element lines and a million-element lst.
And if it's 10 billion lines...

with open(fname) as fp:
    lines=fp.readlines()
lst=processlines(lines)
with open('some.csv', 'w', newline='') as csvfile:
    writer=csv.writer(csvfile)
    writer.writerows(lst)

With the following code, only the list lst becomes huge
(each line is processed while the file is being read).

with open(fname) as fp:
    lst = []
    for line in fp:
        lst.append(process(line))

    # (rest omitted)
    writer.writerows(lst)
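
For a rough sense of what building such a list costs compared with producing rows on demand, here is a small sketch. The process function and the generated lines are hypothetical stand-ins, not the real data, and sys.getsizeof measures only the container object itself.

```python
import sys


def process(line):
    """Hypothetical stand-in for the per-line work: split into fields."""
    return line.split()


# Pretend file contents; a real file object would be iterated directly.
lines = [f"{i} aaa bbb\n" for i in range(100_000)]

as_list = [process(line) for line in lines]  # every row exists at once
as_gen = (process(line) for line in lines)   # rows are made on demand

# getsizeof counts only the container, not the rows it refers to,
# so the real gap is larger than these two numbers suggest.
print(sys.getsizeof(as_list), sys.getsizeof(as_gen))
```

The list grows with the number of lines, while the generator object stays the same small size however long the input is.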

With the code at the top of this answer, you don't need a huge list.
When .writerows() needs the next row, control moves into func(), which hands back the next row with yield, and .writerows() then writes that row.

The processing proceeds like a bucket brigade.
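
This hand-off can be observed with a small sketch. Everything in it is hypothetical scaffolding rather than part of the code above: rows() stands in for func(), and LoggingFile is an in-memory file that records each write. It assumes CPython's csv writer, which passes one formatted row at a time to the file's write method.

```python
import csv
import io

events = []  # records the order in which rows are produced and written


def rows():
    """Hypothetical generator standing in for func(): yields three small rows."""
    for i in range(3):
        events.append(f"yield {i}")
        yield [i, i * i]


class LoggingFile(io.StringIO):
    """In-memory file that notes every write made by the csv writer."""

    def write(self, s):
        events.append(f"write {s.strip()}")
        return super().write(s)


out = LoggingFile()
csv.writer(out).writerows(rows())
print(events)
```

If all rows were built in advance, every yield entry would come before the first write entry; here the two kinds of entry should alternate, one row at a time.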

There is also a discussion about how far this kind of question is acceptable, so please refer to it.
(For the time being, I think it's OK.)


2022-12-14 02:55



© 2024 OneMinuteCode. All rights reserved.