While hunting a particularly persistent memory leak in C++ code, I’ve resorted to writing every allocation and free to a log file in the following format:
<alloc|free> <address> <size> <UNIQUE-ID> <file> <line number>
This gives me, for example:
alloc 232108 60 405766 file1.cpp (3572)
free 232128 60 405766
alloc 232108 60 405767 file1.cpp (3572)
free 232128 60 405767
alloc 7a3620 12516 405768 file2.cpp (11435)
free 7a3640 12516 405768
alloc 2306c8 256 405769 file3.cpp (3646)
alloc 746160 6144 405770 file3.cpp (20462)
alloc 6f3528 2048 405771 file4.h (153)
alloc 6aca50 128 405772 file4.h (153)
alloc 632ec8 128 405773 file4.h (153)
alloc 732ff0 128 405774 file4.h (153)
free 746180 6144 405770
free 632ee8 128 405773
alloc 6a7610 2972 405778 this_alloc_has_no_counterpart.cpp (123)
free 6aca70 128 405772
free 733010 128 405774
free 6f3548 2048 405771
alloc 6a7610 2972 405775 file3.cpp (18043)
alloc 7a3620 12316 405776 file5.cpp (474)
alloc 631e00 256 405777 file3.cpp (18059)
free 7a3640 12316 405776
free 6a7630 2972 405775
free 631e20 256 405777
free 2306e8 256 405769
I’m trying to match every alloc to a free and leave just the allocs without a free counterpart, for example, allocation number 405778.
The best I could come up with is the following shell script:
#!/bin/sh
grep "^alloc" test.txt | while read line
do
    alloc_nr=`echo $line | awk '{ print $4 }'` # arg4 = allocation number
    echo "Processing $alloc_nr"
    sed -i "/ ${alloc_nr}/{//d}" test.txt
done
As you may have guessed, this is painfully slow (i.e. about 2 loop iterations per second) on a 25MB file with about 144000 allocs, because every iteration makes sed rewrite the whole file.
It’d be very much appreciated if someone could give me a nudge in the right direction on how to achieve this without it taking three hours.
Seems you want only the IDs and not the whole line:
awk '{print $4}' file | sort | uniq -u
awk '{print $4}': print only the ID column.
sort: sort the column.
uniq -u: display only the unique IDs.
Edit:
Pipe to grep -f - file to match the whole line, no need to loop. grep -f matches patterns from a file, and - means use stdin.
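Putting the two steps together, here is a minimal sketch run against a shortened excerpt of the log from the question. The temporary file and the variable names are made up for illustration. The -w and -F flags are an addition not in the answer above (both are available in GNU and BSD grep): they make grep treat each ID as a fixed string and match it only as a whole word, so an ID is less likely to match inside a longer address or size.

```shell
#!/bin/sh
# Shortened sample log in the question's format (illustrative excerpt only).
log=$(mktemp)
cat > "$log" <<'EOF'
alloc 232108 60 405766 file1.cpp (3572)
free 232128 60 405766
alloc 2306c8 256 405769 file3.cpp (3646)
alloc 6a7610 2972 405778 this_alloc_has_no_counterpart.cpp (123)
free 2306e8 256 405769
EOF

# Step 1: IDs that occur exactly once, i.e. lines with no counterpart.
leaked_ids=$(awk '{ print $4 }' "$log" | sort | uniq -u)

# Step 2: feed the unique IDs to grep as patterns to recover the full lines.
# -f - reads the patterns from stdin; -w and -F are assumptions added here
# to avoid partial matches and regex interpretation.
leaked_lines=$(awk '{ print $4 }' "$log" | sort | uniq -u | grep -wFf - "$log")

printf 'unmatched IDs:\n%s\n' "$leaked_ids"
printf 'unmatched lines:\n%s\n' "$leaked_lines"

rm -f "$log"
```

Note that a free without a matching alloc would also be reported, since uniq -u only looks at how often an ID occurs. A whole-word match can still hit a size or line number that happens to equal an ID, so comparing field 4 explicitly would be stricter.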