What language should I use for file and string manipulation?
This might seem objective, but I don’t think it really is. There’s a lot to say about this. For example, I can see clearly that for most uses Perl would be a more obvious candidate than Java. I need to do this quite often, and at the moment I use C# for it, but I would like a more script-like language for the job.
I can imagine Perl would be a candidate, but I would like to do it in PowerShell, since PowerShell can easily access the .NET library. Or is Python a better candidate? If I have to learn a new language, Python is certainly on my list, rather than Perl.
What I want to do, for example, is read a file, make some changes, and save it again. E.g.: open it, number all lines (say with 3 digits), and close it.
Any example, in any language, would be welcome, but the shorter the better. It is utility scripting I’m after here, not OO, TDDeveloped, unit-tested stuff of course.
What I would very much like to see is something like this (pseudocode here):
open foobar.as f
foreach line in f.lines
    line.addBefore(currentIteratorCounter.format('ddd') + '. ')
close f
So:
bar.txt
Frank Zappa
Cowboy Henk
Tom Waits
numberLines bar.txt
bar.txt
001. Frank Zappa
002. Cowboy Henk
003. Tom Waits
UPDATE:
The Perl and Python examples here are great, and definitely along the lines of what I was hoping for and expecting. But aren’t there any PowerShell guys out there?
This is actually pretty easy in PowerShell:
What I’m doing here is getting the contents of the file; this will return a String[], over which I iterate with ForEach-Object and apply a format string using the -f operator. The result just drops out of the pipeline as another String[], which can be redirected to a file if needed.
You can shorten it a little by using aliases:
but I wouldn’t recommend that for a function definition.
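For comparison with the other language the question mentions, here is a minimal Python sketch of the same read, format, and write-back steps. It is only an illustration, not a translation of the PowerShell code; the function names number_lines and number_file are made up for this example, and the three-digit width is hard-coded just as in the question.

```python
from pathlib import Path


def number_lines(text: str) -> str:
    """Prefix every line with a zero-padded, three-digit counter."""
    lines = text.splitlines()
    # enumerate(..., start=1) plays the role of the iterator counter;
    # the :03d format spec pads the number with zeros to three digits.
    return "".join(f"{i:03d}. {line}\n" for i, line in enumerate(lines, start=1))


def number_file(path: str) -> None:
    """Read the file, number its lines, and write the result back in place."""
    p = Path(path)
    p.write_text(number_lines(p.read_text()))
```

Calling number_file("bar.txt") would rewrite bar.txt in place, so it is worth trying it on a copy first.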
You may want to consider using two passes, though, and constructing the format string on the fly to accommodate larger numbers of lines. If there are 1500 lines, {0:000} won’t be sufficient anymore to get neatly aligned output.
As for which language is best for such tasks, you might look at factors such as
In light of the last point, you might even be better off using cmd for this task. The code is similarly pretty simple:
That assumes, of course, that it has to run somewhere other than your own machine. If not, then use whatever fits your needs 🙂
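Coming back to the two-pass idea from earlier: a rough Python sketch of building the padding width from the line count might look like the following. The name number_lines_auto and the minimum width of three digits are assumptions made for this illustration only.

```python
def number_lines_auto(text: str) -> str:
    """Number lines, padding the counter to fit the total line count."""
    # First pass: collect the lines so we know how many there are.
    lines = text.splitlines()
    # Width is the number of digits in the line count, but never below 3.
    width = max(3, len(str(len(lines))))
    # Second pass: apply the format, with the width filled in on the fly.
    return "".join(
        f"{i:0{width}d}. {line}\n" for i, line in enumerate(lines, start=1)
    )
```

With 1500 input lines the counter is padded to four digits, so the output stays aligned; with a handful of lines it falls back to three.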