While going through the "Intermediate Perl" book I noticed a section on Schwartzian Transforms and tried the example in the exercise (9.9.2), but found that over multiple runs the transform took more time than the normal sort. The code here performs a simple sort of the files in the Windows\System32 directory based on file size:
    #!/usr/bin/perl
    use strict;
    use warnings;
    use Benchmark;

    my $time = timethese( 10, {
        testA => sub {
            map $_->[0],
            sort { $a->[1] <=> $b->[1] }
            map [$_, -s $_],
            glob 'C:\\Windows\\System32\\*';
        },
        testB => sub {
            sort { -s $a <=> -s $b } glob 'C:\\Windows\\System32\\*';
        },
    } );
The output is:

    Benchmark: timing 10 iterations of testA, testB...
    testA: 11 wallclock secs ( 1.89 usr + 8.38 sys = 10.27 CPU) @ 0.97/s (n=10)
    testB: 5 wallclock secs ( 0.89 usr + 4.11 sys = 5.00 CPU) @ 2.00/s (n=10)
My understanding was that since the file operation (-s) has to be repeated over and over in the testB case, it should run a lot slower than testA. The output, though, contradicts that expectation. What am I missing here?
For me, the output looks a bit different:
Benchmarking this with a decent number of iterations (I chose 100,000), I get this:
A look at the code tells me that those two subs probably spend most of their time globbing the files, so I did this:
And get:
Something smells fishy here, doesn’t it?
So, let’s take a look at the docs:
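perldoc -f sort

The documentation says that sort returns the sorted list in list context, and that its behaviour in scalar context is undefined. The benchmarked subs are not called in list context, so the sort that ends testB never has to produce a sorted list, while the sort in testA still runs because it feeds the outer map.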
Aha! So let’s try it again:
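A minimal sketch of a benchmark along these lines, assuming a temporary directory of small files in place of C:\Windows\System32 (the directory, file names, sizes, and iteration count are made up for illustration). The file list is built once outside the timed code, and each result is assigned to an array so that sort runs in list context:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(timethese);
use File::Temp qw(tempdir);

# Assumption: a throwaway directory with 50 files of distinct sizes
# stands in for C:\Windows\System32 so the sketch runs anywhere.
my $dir = tempdir( CLEANUP => 1 );
for my $n ( 1 .. 50 ) {
    open my $fh, '>', "$dir/file$n.txt" or die "Cannot create file: $!";
    print {$fh} 'x' x ( ( $n * 37 ) % 101 + 1 );
    close $fh or die "Cannot close file: $!";
}

# Build the file list once, outside the timed code.
my @files = glob "$dir/*";

my ( @by_transform, @by_plain_sort );

timethese( 200, {
    # Assigning to an array forces list context, so the sort really runs.
    testA => sub {
        @by_transform =
            map  { $_->[0] }
            sort { $a->[1] <=> $b->[1] }
            map  { [ $_, -s $_ ] } @files;
    },
    # Plain sort: -s is called for both operands of every comparison.
    testB => sub {
        @by_plain_sort = sort { -s $a <=> -s $b } @files;
    },
} );
```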
This gives me:
So. To answer your questions: A Schwartzian transform will help you whenever you use it in a meaningful way. Benchmarking will show this when you benchmark in a meaningful way.
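To see where the saving comes from without any timing noise, here is a small sketch that simply counts key computations. The function expensive_key is a hypothetical stand-in for a costly lookup such as -s, and the test data is invented for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for an expensive key lookup such as -s;
# it counts how often it is called.
my $calls = 0;
sub expensive_key {
    my ($item) = @_;
    $calls++;
    return length $item;
}

# 100 strings with distinct lengths, in scrambled order.
my @items = map { 'x' x ( ( $_ * 37 ) % 101 + 1 ) } 1 .. 100;

# Plain sort: the key is recomputed for both operands of every comparison.
$calls = 0;
my @plain = sort { expensive_key($a) <=> expensive_key($b) } @items;
my $plain_calls = $calls;

# Schwartzian transform: the key is computed once per item, then reused.
$calls = 0;
my @transformed =
    map  { $_->[0] }
    sort { $a->[1] <=> $b->[1] }
    map  { [ $_, expensive_key($_) ] } @items;
my $transform_calls = $calls;

print "plain sort:            $plain_calls key computations\n";
print "Schwartzian transform: $transform_calls key computations\n";
```

The transform calls the key function once per item (100 calls here), while the plain sort calls it twice per comparison, so the gap widens as the list grows or the key gets more expensive.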