For a given set of text files, I need to find every “\” character and replace it with “\\”. This is a Windows system, and my scripting language options are JavaScript, VBScript, or Perl.
These files are largish (~10 MB apiece), and there are a good number of them (~15,000). I’ve already come up with the following JavaScript:
function EscapeSlashes(inFilePath)
{
    var readOnly = 1;
    var fso = WScript.CreateObject("Scripting.FileSystemObject");
    var outFile = fso.CreateTextFile(inFilePath + "escaped.js", true);
    var inFile = fso.OpenTextFile(inFilePath, readOnly);
    var currChar;
    while(!inFile.AtEndOfStream)
    {
        currChar = inFile.Read(1);
        //check for single backslash
        if(currChar != "\\")
        {
            outFile.Write(currChar);
        }
        else
        {
            //write out a double backslash
            outFile.Write("\\\\");
        }
    }
    outFile.Close();
    inFile.Close();
}
I’m worried that the above might be a bit slow. Is there any way to improve the algorithm? Since I’m replacing one character with two, I don’t think this can be done in-place.
Is there any performance advantage to reading line by line, rather than character by character?
Do Perl or VBScript have any advantages over JavaScript in this case?
You can’t do it in place, but generally it’s a good idea to read data in chunks rather than reading a single value at a time. Read a chunk, and then iterate through it. Read another chunk, etc – until the “chunk” is of length 0, or however the call to Read indicates the end of the stream. (On most platforms the call to Read can indicate that rather than you having to call a separate AtEndOfStream function.)
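As a rough sketch of the chunked approach, something like the following might work under Windows Script Host. The names escapeBackslashes and escapeFileInChunks, the output path, and the chunk size are my own assumptions rather than anything from the question; the fso argument is expected to be a Scripting.FileSystemObject. Because each backslash is replaced independently of its neighbours, it doesn't matter where the chunk boundaries fall.

```javascript
// Double every backslash in a string (hypothetical helper name).
function escapeBackslashes(text) {
    return text.replace(/\\/g, "\\\\");
}

// Copy inFilePath to outFilePath, doubling backslashes one chunk at a time.
// fso is assumed to be a Scripting.FileSystemObject; chunkSize is a number
// of characters and is an arbitrary tuning choice, not a measured optimum.
function escapeFileInChunks(fso, inFilePath, outFilePath, chunkSize) {
    var forReading = 1;
    var inFile = fso.OpenTextFile(inFilePath, forReading);
    var outFile = fso.CreateTextFile(outFilePath, true);
    while (!inFile.AtEndOfStream) {
        // Read returns up to chunkSize characters, fewer at the end of the file.
        outFile.Write(escapeBackslashes(inFile.Read(chunkSize)));
    }
    outFile.Close();
    inFile.Close();
}
```

Under cscript you would call it along the lines of escapeFileInChunks(WScript.CreateObject("Scripting.FileSystemObject"), path, path + "escaped.js", 65536). I haven't timed it against the character-by-character version, so measure on a few real files before running it over all of them.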
Also, I wouldn’t be surprised if Perl could do this in a single line. Or use sed if you can 🙂