What is the best practice (interface and implementation) for a command line tool
that processes selected files in a directory tree?
Here is an example that comes to mind, but I am looking for a ‘best practice’:
flipcase foo.txt foo2.txt
could process foo.txt and save the result as foo2.txt.
flipcase -rv *.txt
could process all text files in the current directory.
-r or --recursive will include all subdirectories.
-v will print some information to stdout while processing.
One problem that I see with this example is that the *.txt argument is
sometimes expanded by the shell (Unix and Vista), so I can’t apply this pattern
when walking subdirectories.
I guess the reason is that on Unix such tools are combined with a call to find,
but this seems not to be common on Windows. It also makes it hard to print a
summary at the end.
Requirements:
- MUST run on Unix, Windows XP, Windows 7 and Mac
- SHOULD follow common conventions on these platforms.
(Yes, I know. But I am looking for a reasonable compromise.
For example it’s OK to use - instead of / on Windows.)
- SHOULD not rely on a separate find command, like grep does.
- MUST work for single files, file patterns and patterns in directory
hierarchies.
- SHOULD be built with standard Python libs, e.g. OptionParser and os.walk.
- COULD handle multiple patterns, e.g. *.txt,*.html.
Other questions on design decisions:
- What should this tool return (status code)?
- Which ctrl-keys should this tool handle, and in what way?
- Should stdin be supported instead of a single file? Configurable or
auto-detect?
- Should output redirection be supported? Configurable or auto-detect?
How to deal with debug output in this case?
- Should the pattern be glob syntax, or a regular expression?
- Is there a common pattern syntax that supports recursion?
Maybe recursive:*.txt
In this case the -r option would not be necessary.
- What is best practice to create backups of modified files?
Option -b, or rather have backups by default and add a --no-backup option?
- For single files it should be possible to specify a target file name. How?
- What status info should be printed, and how to configure this?
Should it be verbose by default and we allow -q for quiet?
Or always print a little bit and allow -v (or -vv) to boost this or -q to
shut up completely?
I don’t really expect to get one single right answer, but maybe a handful of
thoughts and pointers to good sample projects.
In my experience, the best starting point is to build a tool that follows basic Unix principles, namely to read from standard input and write to standard output. This allows people to use your tool in a flexible way, for example in pipelines or with shell redirection.
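As a rough illustration of the filter style, here is a minimal Python sketch. The names flipcase and run_filter are only placeholders I made up for this example, and swapping the case stands in for whatever processing the real tool does.

```python
import sys


def flipcase(text):
    """Swap upper and lower case; a stand-in for the tool's real processing."""
    return text.swapcase()


def run_filter(infile, outfile):
    """Read lines from infile and write the transformed lines to outfile."""
    for line in infile:
        outfile.write(flipcase(line))


# A script entry point would simply call:
#     run_filter(sys.stdin, sys.stdout)
```

Assuming this were saved as flipcase.py with that entry point added, something like `python flipcase.py < foo.txt > foo2.txt` would cover the single-file case from the question without any extra options.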
The next feature might be in-place editing, where the tool rewrites the named file instead of writing to standard output.
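One possible way to do in-place editing is sketched below. The function name edit_in_place and the .bak suffix are my own assumptions, not an established convention of any particular tool. The idea is to write the result to a temporary file next to the original, keep a backup copy, and only then swap the new file into place.

```python
import os
import shutil
import tempfile


def edit_in_place(path, transform, backup_suffix=".bak"):
    """Rewrite path with transform applied to its text, keeping a backup copy.

    Pass backup_suffix=None to skip the backup (hypothetical --no-backup).
    """
    # newline="" keeps the file's original line endings untouched.
    with open(path, "r", newline="") as src:
        original = src.read()
    # Create the temporary file in the same directory so the final
    # replace does not have to cross file systems.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w", newline="") as tmp:
            tmp.write(transform(original))
        shutil.copymode(path, tmp_path)  # keep the original permissions
        if backup_suffix:
            shutil.copy2(path, path + backup_suffix)
        os.replace(tmp_path, path)  # overwrites the target on Unix and Windows
    except Exception:
        os.remove(tmp_path)
        raise
```

Whether backups should be on by default is a judgment call; this sketch makes them the default and lets the caller opt out.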
In verbose mode, the tool should not write to standard output, because that would conflict with the core principles above. It should write to standard error or a user-defined log file.
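A small sketch of how that could look with the standard logging module follows. The logger name "flipcase" and the mapping from a -v count to log levels are assumptions made for the example.

```python
import logging
import sys


def configure_logging(verbosity=0, log_file=None):
    """Send status messages to stderr (or to log_file), never to stdout."""
    if verbosity >= 2:      # e.g. -vv
        level = logging.DEBUG
    elif verbosity == 1:    # e.g. -v
        level = logging.INFO
    else:                   # default: warnings and errors only
        level = logging.WARNING

    logger = logging.getLogger("flipcase")
    logger.setLevel(level)
    # Drop handlers from any earlier call so messages are not duplicated.
    for old in list(logger.handlers):
        logger.removeHandler(old)

    if log_file:
        handler = logging.FileHandler(log_file)
    else:
        handler = logging.StreamHandler(sys.stderr)
    handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False
    return logger
```

With this in place, calls such as `log.info("processing %s", path)` show up on the terminal even when standard output is redirected to a file or piped into another program.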
After that, you add recursive behavior. The direction is less clear-cut here, but I’ll toss out a few ideas. In the typical recursive case, the program’s arguments are probably directories, and the user would need to supply additional options to define various types of filtering behavior (that is, which types of files to process).
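To make the recursive case a bit more concrete, here is one possible sketch using os.walk and fnmatch from the standard library. The function name find_files and its parameters are hypothetical; the patterns use glob syntax and are matched against file names only, not full paths.

```python
import fnmatch
import os


def find_files(roots, patterns=("*",), recursive=True):
    """Yield paths under the given roots whose names match any glob pattern."""
    for root in roots:
        if os.path.isfile(root):
            # A file named explicitly by the user is passed through unfiltered.
            yield root
            continue
        for dirpath, dirnames, filenames in os.walk(root):
            dirnames.sort()  # visit subdirectories in a predictable order
            for name in sorted(filenames):
                if any(fnmatch.fnmatch(name, p) for p in patterns):
                    yield os.path.join(dirpath, name)
            if not recursive:
                del dirnames[:]  # emptying the list stops os.walk from descending
```

Because the tool does the matching itself, the user has to quote the pattern on Unix (for example a hypothetical `--include '*.txt'` option) so the shell does not expand it first. Collecting the yielded paths in one place also makes it straightforward to print a summary at the end.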