Imagine that you are preparing for an in-depth technical interview and you are asked to rate your expertise in shell scripting (hypothetically on a scale of one to ten). Then look at the following shell command line example and answer the questions: What does this do? and Why?
unset foo; echo bar | read foo; echo "$foo"
What level of expertise would you say is needed to answer this question correctly for the general case (not merely for one specific version of the shell or another)?
Now imagine that you’re given the following example:
cat "$SOMELIST_OF_HOSTS" | while read host; do ssh $host "$some_cmd"; done
… and the interviewer explains that this command “doesn’t work” and that it seems to only execute the ssh command on a few of the hosts listed in the (large) file (something like one in every few hundred hostnames, seemingly scattered from among the list). Naturally he or she asks: Why is it doing that? and How might you fix it?
Then rate the level of expertise to which you would map someone who can answer those questions correctly.
Note that the first example is dependent on the shell. The pipe is an IPC (inter-process communications) operator and the shell can implement that by creating a subshell on either side of the pipe. (Technically, I suppose some shell could even evaluate both sides in separate sub-processes.)
The read command is a built-in (it must inherently be so). So, in shells such as bash and the classic Bourne shell derivatives, the subprocess (subshell) is on the right of the pipe (reading from the current shell), and that process ends after its read (at the semicolon in this example). Korn shell (at least as far back as ’93) and zsh put their subshell on the other side of the pipe and read the data from it into the current process.
That’s the point of the interview question.
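To make the portability point concrete, here is a minimal sketch (my own illustration, with made-up variable names such as probe) that first checks which behaviour the running shell has and then shows a few ways of getting the value into foo that should not depend on which side of the pipe gets the subshell:

```shell
# Probe: does the last stage of a pipeline run in the current shell?
unset probe
echo x | read -r probe
if [ "${probe:-}" = x ]; then
    echo "read ran in the current shell (ksh/zsh style)"
else
    echo "read ran in a subshell (bash/Bourne style)"
fi

# 1. Command substitution: the assignment happens in the current shell.
unset foo
foo=$(echo bar)
echo "substitution: $foo"

# 2. Here-document: read runs in the current shell, no pipeline involved.
unset foo
read -r foo <<EOF
bar
EOF
echo "here-document: $foo"

# 3. Keep everything that needs the variable inside the same pipeline stage.
echo bar | { read -r foo; echo "grouped: $foo"; }
```

As far as I know, bash 4.2 and later can also be switched to the ksh-style behaviour in scripts with shopt -s lastpipe, but relying on that gives up exactly the portability the question is about.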
The point of my question here is to look for some consensus or metric for how highly to rate this level of question. It’s not a matter of trivia because it does affect real world scripts and portability for shell scripting and it relies upon fundamental understanding of the underlying UNIX and shell semantics (IPC, pipes, and subprocesses/subshells).
The second example is similar but more subtle. I will point out that the following change “works” (the ssh will execute on each of the hosts in the file):
Here the issue is that the ssh command buffers up input even if the command on the remote never reads from its stdin. Because the shell/subshell (reading from the pipe) and the ssh are sharing the same input stream, the ssh is “stealing” most of the input from the pipeline, leaving only the occasional line for the read command.
This is not an artificial question. I actually encountered it in my work and had to figure it out. I know from experience that understanding this second example is at least a notch or two above the first. I also know, from years of experience, that fewer than 10% of the candidates (for sysadmin and programming positions) that I’ve interviewed can get the first question right away.
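The stolen-input effect can be reproduced without any remote hosts. The sketch below is only an illustration under stated assumptions: fake_ssh is a hypothetical stand-in that, like ssh, drains whatever is on its stdin, and the host names are made up. With only three short lines the stand-in swallows everything after the first one, whereas with a large file fed through a pipe a real ssh grabs whatever happens to be buffered at the time.

```shell
# Hypothetical stand-in for ssh: it drains its stdin, then "runs" on the host.
fake_ssh() { cat >/dev/null; echo "ran on $1"; }

# Made-up host list, one name per line.
hosts=$(printf '%s\n' alpha beta gamma)

# Broken: the command shares the loop's stdin and swallows the remaining lines.
broken=$(printf '%s\n' "$hosts" |
    while read -r host; do fake_ssh "$host"; done)

# Fixed: give the command its own stdin so read keeps the pipeline's input.
fixed=$(printf '%s\n' "$hosts" |
    while read -r host; do fake_ssh "$host" </dev/null; done)

printf 'broken:\n%s\n' "$broken"
printf 'fixed:\n%s\n' "$fixed"
```

With the real command, the same idea would be to redirect its input (ssh "$host" "$some_cmd" </dev/null) or to use ssh -n, which redirects stdin from /dev/null.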
I’ve never used the second question in a live interview and I’ve been discouraged from doing so.