Run Environment
Operating System: Windows 10
languages:Python 3.x
libraries:standard libraries only (can't use pandas, numpy, etc.)
We are creating a program to mold AWS CloudWatch Logs logs from csv.
When you retrieve csv data from CloudWatch, you get the following csv data:
timestamp, message
000000000000, "START RequestId: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
"
111111111, "[INFO] 2022-12-09T00:45:15.119ZAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
"
2222222222222, "[INFO] 2022-12-09T00:45:15.120ZAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
"
If you look at this log, you will find the log level date time, RequestId, and log together.
In this condition, if you attach them to Excel, they will stick together and reflect them.
Therefore, in order to prevent these data from sticking together, we are thinking of molding them into the following csv data:
timestamp, message
000000000000,START RequestId:AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
111111111, [INFO], 2022-12-09T00:45:15.119Z, AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
2222222222222, [INFO], 2022-12-09T00:45:15.120Z, AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
I want to mold csv data as above, but it doesn't work because it's lined up vertically.
I would appreciate it if you could tell me how to do it.
As shown in the code below, we did the following:
I thought about it in my own way and wrote it, but I couldn't get the csv data.
I would appreciate it if you could tell me the cause.
Also, I would appreciate it if you could let me know if there is a more efficient way.
import csv
import datetime
# Load open() file
file=open("log_before.csv", "r", encoding="utf-8")
# Loading the csv File
reader=csv.reader(file)
csvalldata=[ ]
for num in reader:
col_0 = num[0]
col_1 = num[1]
splitlist=col_1.split('\t')
splitlist.insert(0,col_0)# Add 0th element
csvalldata.append(splitlist)
print("A. Check List Status", csvalldata)
newdata='"
for i in csvalldata:
for jini:
newdata=newdata+'+j+'','
print("newdata", newdata)
# Specify the read encoding for the open()
now=datetime.datetime.now()#Getting the current time
out_filename = 'log_after_{0:%M%S}.csv'.format(now)
newfile=open(out_filename, "w", encoding="utf-8")
# Writing a csv file
writer=csv.writer(newfile)
writer.writerows (newdata)
print("Create csv file:", out_filename)
libraries:standard libraries only (can't use pandas, numpy, etc.)
Also, I would appreciate it if you could let me know if there is a more efficient way.
In a library-constrained environment, the code is likely to be as muddy as the VBA.If you're going to eventually load it into Excel, I recommend that you process it in Excel from the beginning.Excel has a feature called Power Query for data processing like CSV file shaping.
Choose Data - New Query - Empty Query to launch the Power Query Editor.Select "Advanced Editor" here to enter code.
let
csv=Csv.Document(File.Contents("C:\Path\To\log_before.csv"), [Delimiter=", ", Encoding=65001, QuoteStyle=QuoteStyle.Csv],
headered=Table.PromoteHeaders(csv, [PromoteAllScalars=true]),
split=Table.SplitColumn(headed, "message", Splitter.SplitTextByDelimiter("#(tab), QuoteStyle.None), {"col_0", "col_1", "col_2", "col_3" },
transformed=Table.TransformColumnTypes(splitted, {{"timestamp", Int64.Type}, {"col_0", type text}, {"col_1", type datetime}, {"col_2", type text}, {"col_3", type text}})
in
transformed
Excel maintains how it was loaded, so you can try again from Refresh menu.You can use Folder.Files
to search for files even if the file name changes dynamically.
If the AWSCLI and jq commands are available, you only need 1-3 lines, but for Python+CSV modules (as an example):
※ csv.QUOTE_NONNUMERIC
is specified just in case
import csv
import datetime
import re
now=datetime.datetime.now()
out_filename = 'log_after_{0:%M%S}.csv'.format(now)
with open('log_before.csv', 'r', encoding='utf-8') as file, \
open(out_filename, 'w', encoding='utf-8') as newfile:
reader=csv.reader(file)
writer=csv.writer(newfile,quoting=csv.QUOTE_NONNUMERIC)
for col_0, col_1 in reader:
splitlist = [col_1.trip()] if re.search(r'^START', col_1) else\
[c.trip() for cine.split(r'\s+', col_1, maxsplit=3)]
splitlist.insert(0,col_0)
writer.writerow (splitlist)
print('csv file created:', out_filename)
© 2024 OneMinuteCode. All rights reserved.