Question: What PowerShell regex pattern will return output like Bash’s strings command?
I found an article on gc and Select-String: Episode #137: Free-base64-ing. http://blog.commandlinekungfu.com/2011/03/episode-137-free-base64-ing.html
I tried a number of regex patterns from a previous question: Regular Expression for alphanumeric and underscores.
If I run in Bash: strings --all myfile.bin
Results: 52939 lines of character strings.
gc .\myfile.bin | Select-String -AllMatches "^[a-zA-Z0-9_]*$"
Results: a number of blank lines.
gc .\myfile.bin | Select-String -AllMatches "^\w*$"
Results: 9 lines of characters and a number of blank lines.
gc .\myfile.bin | Select-String -AllMatches "^\w+$"
Results: 9 lines of characters.
gc .\myfile.bin | Select-String -AllMatches "[A-Za-z0-9_]"
Results: Pretty much the entire file, unprintable characters and all.
gc .\myfile.bin | Select-String -AllMatches "^[\p{L} \p{Nd}_]+$"
Results: 20 lines of characters.
So what’s the regex trick that I am missing?
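As mentioned, the lack of line breaks will prevent the regex from working: gc splits the binary wherever a newline byte happens to fall, so Select-String sees a few arbitrary "lines", and a pattern anchored with ^ and $ only matches when an entire line consists of the allowed characters. Microsoft Sysinternals’ strings utility is a good solution.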
If you need a native PowerShell solution, ping me. I wrote a Get-Strings cmdlet in C# that extracts ASCII (UTF8) and Unicode (UTF16) strings from binaries. It is not as fast as the Sysinternals tool, but it does have the advantage of putting the output into the PowerShell pipeline.
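For anyone who wants to try the idea in plain PowerShell first, here is a minimal sketch. It is not the Get-Strings cmdlet described above; the function name Get-AsciiStrings and its parameters are hypothetical. It assumes the file fits in memory, handles single-byte ASCII only (no UTF16), and assumes a default minimum run length of 4 characters, which is what GNU strings uses. The point is to skip gc's line splitting entirely and run an unanchored regex over the raw bytes.

```powershell
# Hypothetical helper for illustration; not the Get-Strings cmdlet mentioned above.
function Get-AsciiStrings {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [int] $MinLength = 4   # assumed default, matching GNU strings
    )

    # Read the file as raw bytes so nothing is split on newlines or decoded as text.
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath
    $bytes = [System.IO.File]::ReadAllBytes($fullPath)

    # ISO-8859-1 (code page 28591) maps every byte to exactly one character,
    # so byte values survive the conversion to a .NET string unchanged.
    $text = [System.Text.Encoding]::GetEncoding(28591).GetString($bytes)

    # No ^ or $ anchors: emit every run of printable ASCII (space through tilde)
    # or tab that is at least $MinLength characters long.
    [regex]::Matches($text, "[\x20-\x7E\t]{$MinLength,}") |
        ForEach-Object { $_.Value }
}
```

Usage would look like `Get-AsciiStrings -Path .\myfile.bin`, with each extracted string going into the pipeline as a separate object. The output will not necessarily match strings --all line for line, since the two tools may differ in which characters they count as printable.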