How to Analyze Mail Logs, Aggregate Each Message onto One Line, and Output the Result

Asked 1 year ago, Updated 1 year ago, 87 views

Environment: CentOS 6.9

Thank you for your help.
Your earlier explanation of how to use awk to extract mail logs for a specified period was very helpful.

I have been trying to build on that script, but I could not get it to work on my own, so I would like to ask for your advice again.

Here is a sample of the mail log.
The first block shows a message that was delivered successfully, and the second shows one that bounced.
The details vary slightly depending on the destination, but the format is basically the same.
These examples were sent to SoftBank (i.softbank.jp).
As the host names show, the mail is sent via AWS.

May 10 00:00:00 ip-172-31-21-151 mail/smtpd[1439]: B670861CA0: client=ip-172-31-38-47.ap-northeast-1.compute.internal[172.31.38.47]
May 10 00:00:00 ip-172-31-21-151 mail/cleanup[1477]: B670861CA0: message-id=<[email protected]>
May 10 00:00:00 ip-172-31-21-151 mail/qmgr[1431]: B670861CA0: from=<[email protected]>, size=494, nrcpt=1 (queue active)
May 10 00:00:00 ip-172-31-21-151 mail/smtp[1459]: B670861CA0: to=<[email protected]>, relay=msv.softbank.jp[117.46.9.104]:25, delay=0.8, delays=0.06/3.1/0.6, dsn=2.0.0, status=sent (250 message 20180510111628949.MPQS.14641.ebmky105sc.i.softbank.jp@ebmky105sb.mailsv.softbank.jp)
May 10 00:00:00 ip-172-31-21-151 mail/qmgr[1431]: B670861CA0: removed

May 10 00:00:00 ip-172-31-21-151 mail/smtpd[8874]: 5564561CB8: client=ip-172-31-38-47.ap-northeast-1.compute.internal[172.31.38.47]
May 10 00:00:00 ip-172-31-21-151 mail/cleanup[8877]: 5564561CB8: message-id=<QUdsjiAFHF@OzemvwHngZ>
May 10 00:00:00 ip-172-31-21-151 mail/qmgr[8873]: 5564561CB8: from=<[email protected]>, size=442, nrcpt=1 (queue active)
May 10 00:00:00 ip-172-31-21-151 mail/smtp[8896]: 5564561CB8: to=<[email protected]>, relay=msv.softbank.jp[117.46.7.40]:25, delay=1.1, delays=0.01/1/0.06, dsn=5.0.0, status=bounced (host.40.7:550)
May 10 00:00:00 ip-172-31-21-151 mail/bounce[8906]: 5564561CB8: sender non-delivery notification: 6B19061CB3
May 10 00:00:00 ip-172-31-21-151 mail/qmgr[8873]: 5564561CB8: removed

Based on what you told me before, my script currently looks like this.
(Example: extracting messages addressed to SoftBank domains)

#!/bin/bash

filename='filename.csv'
nowtime=`date +%s`
# Modification time of the previous output file (the file must already exist).
updatetime=`date +%s -r "$filename"`
thisyear=`date +%Y`
thismonth=`date +%m`

# Offset between local time and UTC, in seconds.
localstr=`date +"%D %T"`
gmstr=`date +"%D %T" -u`
localtime=`date -d "$localstr" +%s`
gmtime=`date -d "$gmstr" +%s`
timediff=$(($localtime-$gmtime))

# The syslog timestamp ("Mon DD HH:MM:SS") is the first 15 characters of
# each line, so it is read from $0 instead of relying on a field separator.
cat /var/log/maillog | awk -v nowtime="$nowtime" -v updatetime="$updatetime" \
    -v thisyear="$thisyear" -v thismonth="$thismonth" \
    -v timediff="$timediff" '{
    m = substr($0,1,3)
    mon = (index("JanFebMarAprMayJunJulAugSepOctNovDec",m)+2)/3
    year = (mon <= thismonth) ? thisyear : thisyear-1
    day = substr($0,5,2)
    hh = substr($0,8,2)
    mm = substr($0,11,2)
    ss = substr($0,14,2)

    if (mon<3) {mon+=12; year--}
    epochtime = (365*year+int(year/4)-int(year/100)+int(year/400)\
        + int(306*(mon+1)/10)-428+day-719163)*86400\
        + (hh*3600) + (mm*60) + ss - timediff;
    } updatetime<epochtime && epochtime<=nowtime' \
    | grep -e 'to=<.*@i.softbank.jp>\|to=<.*@softbank.ne.jp>\|to=<.*vodafone.ne.jp>' \
    | grep -v 'discard' | sed -e 's/(250.*)//' \
    | awk '{ if($12=="status=deferred") print $1, $2, $3, $7, $12; else if($16==550) print $1, $2, $3, $7, $12, $16, $23; else print $1, $2, $3, $7, $12, $16}' \
    | sed -e 's/to=<//' | sed -e 's/>,//' \
    | sed -e 's/status=//' \
    | awk '{print $1, $2, $3 ", " $4 ", " $5 ", " $6 ", " $7}' > "$filename"

The output is a comma-separated CSV file with these columns: date and time, destination email address (to), status, error code, and error type.

May 8 00:00:00, [email protected], sent,
May 8 00:01:00, [email protected], bounced, 550, DATA
May 8 00:02:00, [email protected], bounced, 550, RCPT

In addition to this, I would like to add the sender email address (from) to each row.

My idea is to extract the queue ID from the to= lines under the same conditions, use it to find the matching from= line,
and finally output each row with the from address appended.

Here is a concrete example of the output I would like:

May 8 00:00:00, [email protected], sent, , , [email protected]
May 8 00:01:00, [email protected], bounced, 550, DATA, [email protected]
May 8 00:02:00, [email protected], bounced, 550, RCPT, [email protected]

Thank you for your cooperation.

bash sh awk

2022-09-30 21:30

1 Answer

Basically, you only need to aggregate the lines using the queue ID as the key, but there are some complications:

  • A queue ID has exactly one sender, but it can have one or more recipients (one-to-N).
  • If the receiving server refuses the message with a 4xx (temporary) error, the message stays in the queue for a long time and appears in the log repeatedly.
  • Queue IDs may be reused.

Because of these complications, the aggregation is fairly difficult. You will need to write a reasonably long program in the language of your choice.
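
As a rough starting point only, here is a sketch in awk of how the queue ID could be used as the key. It is not a complete solution and rests on several assumptions: the log uses the standard syslog layout (Mon DD HH:MM:SS host service[pid]: QUEUEID: ...), so the queue ID is field 6; the remote reply contains a phrase such as "(in reply to RCPT TO command)"; and the function name join_from is made up for this example. It prints one row per delivery attempt, so a message with several recipients or repeated deferrals gives several rows, and it forgets the sender when the queue entry is removed so that a reused queue ID does not pick up a stale address.

```shell
#!/bin/bash
# Hypothetical helper: reads maillog lines on stdin and prints
#   date time, to, status, code, stage, from
join_from() {
    awk '
    # Skip lines whose sixth field does not look like "QUEUEID:".
    $6 !~ /^[0-9A-Za-z]+:$/ { next }
    {
        qid = $6
        sub(/:$/, "", qid)              # "B670861CA0:" -> "B670861CA0"
    }
    # qmgr logs the envelope sender for a queue ID; remember it.
    $7 ~ /^from=</ {
        addr = $7
        sub(/^from=</, "", addr)
        sub(/>,?$/, "", addr)
        sender[qid] = addr
        next
    }
    # Queue entry is finished: drop it so a reused ID starts clean.
    $7 == "removed" {
        delete sender[qid]
        next
    }
    # Delivery attempt: one output row per to= line.
    $7 ~ /^to=</ {
        rcpt = $7
        sub(/^to=</, "", rcpt)
        sub(/>,?$/, "", rcpt)
        status = ""; code = ""; stage = ""; seen_reply = 0
        for (i = 8; i <= NF; i++) {
            if ($i !~ /^status=/) continue
            status = substr($i, 8)      # text after "status="
            for (j = i + 1; j <= NF; j++) {
                # First 4xx/5xx reply code after the status, if any.
                if (code == "" && ($j ~ /^[45][0-9][0-9]$/ || $j ~ /^[45][0-9][0-9]-/))
                    code = substr($j, 1, 3)
                # SMTP command named after "(in reply to ...".
                if ($j == "reply") seen_reply = 1
                else if (seen_reply && $j ~ /^(RCPT|DATA|MAIL|HELO|EHLO)$/) {
                    stage = $j
                    break
                }
            }
            break
        }
        from = (qid in sender) ? sender[qid] : ""
        printf "%s %s %s, %s, %s, %s, %s, %s\n", $1, $2, $3, rcpt, status, code, stage, from
    }'
}
```

It could be run as join_from < /var/log/maillog > filename.csv. In a pipeline like the one in the question, this step would have to come before the grep for to=, because that grep throws away the qmgr lines that carry from=; the recipient-domain filter can be applied to the joined rows afterwards. If the from= line is in an older, rotated log file, the last column is simply left empty.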


2022-09-30 21:30



© 2024 OneMinuteCode. All rights reserved.