AWS CloudWatch Logs cannot be molded

Run Environment
Operating System: Windows 10 languages:Python 3.x
libraries:standard libraries only (can't use pandas, numpy, etc.)

What programs are you creating

We are creating a program to mold AWS CloudWatch Logs logs from csv.
When you retrieve csv data from CloudWatch, you get the following csv data:

timestamp, message
000000000000, "START RequestId: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
"
111111111, "[INFO] 2022-12-09T00:45:15.119ZAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
"
2222222222222, "[INFO] 2022-12-09T00:45:15.120ZAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
"

If you look at this log, you will find the log level date time, RequestId, and log together.
In this condition, if you attach them to Excel, they will stick together and reflect them.
Therefore, in order to prevent these data from sticking together, we are thinking of molding them into the following csv data:

timestamp, message
000000000000,START RequestId:AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
111111111, [INFO], 2022-12-09T00:45:15.119Z, AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
2222222222222, [INFO], 2022-12-09T00:45:15.120Z, AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

What do you want to do

I want to mold csv data as above, but it doesn't work because it's lined up vertically.
I would appreciate it if you could tell me how to do it.

Tried

As shown in the code below, we did the following:

I thought about it in my own way and wrote it, but I couldn't get the csv data.
I would appreciate it if you could tell me the cause.

Also, I would appreciate it if you could let me know if there is a more efficient way.

import csv
import datetime

# Load open() file
file=open("log_before.csv", "r", encoding="utf-8") 

# Loading the csv File
reader=csv.reader(file)

csvalldata=[ ]
for num in reader:
    col_0 = num[0] 
    col_1 = num[1]
    splitlist=col_1.split('\t')
    splitlist.insert(0,col_0)# Add 0th element
    csvalldata.append(splitlist)

print("A. Check List Status", csvalldata)          
newdata='"   
for i in csvalldata:
  for jini:
    newdata=newdata+'+j+'','
    print("newdata", newdata)

# Specify the read encoding for the open()
now=datetime.datetime.now()#Getting the current time
out_filename = 'log_after_{0:%M%S}.csv'.format(now) 
newfile=open(out_filename, "w", encoding="utf-8") 

# Writing a csv file
writer=csv.writer(newfile)
writer.writerows (newdata)
print("Create csv file:", out_filename)

python aws csv excel

2022-12-12 22:52

2 Answers

libraries:standard libraries only (can't use pandas, numpy, etc.)

Also, I would appreciate it if you could let me know if there is a more efficient way.

In a library-constrained environment, the code is likely to be as muddy as the VBA.If you're going to eventually load it into Excel, I recommend that you process it in Excel from the beginning.Excel has a feature called Power Query for data processing like CSV file shaping.
Choose Data - New Query - Empty Query to launch the Power Query Editor.Select "Advanced Editor" here to enter code.

let
    csv=Csv.Document(File.Contents("C:\Path\To\log_before.csv"), [Delimiter=", ", Encoding=65001, QuoteStyle=QuoteStyle.Csv],
    headered=Table.PromoteHeaders(csv, [PromoteAllScalars=true]),
    split=Table.SplitColumn(headed, "message", Splitter.SplitTextByDelimiter("#(tab), QuoteStyle.None), {"col_0", "col_1", "col_2", "col_3" },
    transformed=Table.TransformColumnTypes(splitted, {{"timestamp", Int64.Type}, {"col_0", type text}, {"col_1", type datetime}, {"col_2", type text}, {"col_3", type text}})
in
    transformed

Excel maintains how it was loaded, so you can try again from Refresh menu.You can use Folder.Files to search for files even if the file name changes dynamically.

2022-12-12 23:19

If the AWSCLI and jq commands are available, you only need 1-3 lines, but for Python+CSV modules (as an example):
※ csv.QUOTE_NONNUMERIC is specified just in case

import csv
import datetime
import re

now=datetime.datetime.now()
out_filename = 'log_after_{0:%M%S}.csv'.format(now) 

with open('log_before.csv', 'r', encoding='utf-8') as file, \
     open(out_filename, 'w', encoding='utf-8') as newfile:
    reader=csv.reader(file)
    writer=csv.writer(newfile,quoting=csv.QUOTE_NONNUMERIC)
    for col_0, col_1 in reader:
        splitlist = [col_1.trip()] if re.search(r'^START', col_1) else\
                    [c.trip() for cine.split(r'\s+', col_1, maxsplit=3)]
        splitlist.insert(0,col_0)
        writer.writerow (splitlist)

print('csv file created:', out_filename)

2022-12-12 23:29

If you have any answers or tips

Popular Tags

python x 4647

android x 1593

java x 1494

javascript x 1427

c x 927

c++ x 878

ruby-on-rails x 696

php x 692

python3 x 685

html x 656