Yesterday I asked a question here about a one-liner, and mjschultz gave me an answer that I instantly fell in love with 🙂 Awk just destroyed the task at hand, parsing a large logfile (500+ MB) in a matter of seconds. Now I’m trying to port my other one-liners to awk.
This is the one in question:
grep "pop3\[" maillog | grep "User logged in" |
egrep -o '([[:digit:]]{1,3}\.){3}[[:digit:]]{1,3}' | sort -u
I need the list of all unique IP addresses using pop3 to connect to the mail server.
This is an example log entry:
Nov 15 00:49:21 hostname pop3[19418]: login: [10.10.10.10] username plaintext User logged in
So I find all the lines containing “pop3” and filter them for the “User logged in” part. Next I use egrep with a regex to extract the IP addresses, and sort -u to drop the duplicates.
This is what I have so far for my awk version:
awk '/pop3\[.*.User logged in/ {ip[$7]=0} END {for (address in ip) {print address}}' maillog
This works perfectly, but as always, not all log entries are identical. For example, sometimes the IP gets moved to the 8th field, like here:
Nov 15 10:42:40 hostname pop3[2232]: login: hostname.domain.com [20.20.20.20] username plaintext User logged in
What would be the best way to catch those entries with awk as well?
As always, thanks in advance for all the great responses. You’ve taught me so much already 🙂
AWK code
Just match on the IP address format itself instead of relying on a field number. Be careful that the matching lines contain no other text in that format.
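One way this could look is sketched below. It is not the code originally posted with this answer, just a minimal illustration that assumes every successful login line carries the client address in square brackets, as in the two sample entries. The function name pop3_ips and the array name seen are made up for the example. The pattern uses [0-9]+ rather than {1,3} because some older awk implementations do not support interval expressions.

```shell
# Hypothetical helper: print each unique bracketed IPv4 address found on
# pop3 "User logged in" lines, regardless of which field it lands in.
pop3_ips() {
    awk '
        /pop3\[/ && /User logged in/ {
            # match() sets RSTART and RLENGTH for the leftmost match,
            # brackets included; "pop3[19418]" has no dots, so it is skipped.
            if (match($0, /\[[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+\]/))
                seen[substr($0, RSTART + 1, RLENGTH - 2)] = 1
        }
        # Array keys are unique, so each address is printed once.
        END { for (address in seen) print address }
    ' "$@"
}
```

Called as pop3_ips maillog | sort, it prints the addresses without the surrounding brackets. The order of a for (address in seen) loop is unspecified in awk, so pipe through sort if you need a stable listing.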