A document several hundred lines long has header/title lines, each followed by about 10 listings. I need to put the header/title info in with each of the listings so that they can be properly uploaded into a website (using comma and pipe delimiters). It looks like this:
SectionName1 and TitleName1
1111 - The SubSectionName A
222 - The SubSectionName B
3333 - The SubSectionName C
SectionName2 and TitleName2
444 - The SubSectionName D
55555 - The SubSectionName E
66 - The SubSectionName F
Repeating several hundred times. What I need is to produce something like:
SectionName1,TitleName1,1111,SubSectionNameA
SectionName1,TitleName1,222,SubSectionNameB
SectionName1,TitleName1,3333,SubSectionNameC
SectionName2,TitleName2,444,SubSectionNameD
SectionName2,TitleName2,55555,SubSectionNameE
SectionName2,TitleName2,66,SubSectionNameF
I realize there can be multiple approaches to this problem, but I’m having a difficult time pulling the trigger on any one method. I understand submatches, joins and getline, but I am not good at putting them to practical use in this scenario.
Any help to get me mentally started would be greatly appreciated.
Let me propose the following quite general Ex command solving the issue.[1]
At the top level, this is the :global command that enumerates the lines
starting with zero or more whitespace characters followed by a Latin letter or
an underscore (see :help /\h). The lines matching this pattern are supposed
to be the header lines containing section and title names. The rest of the
command, after the pattern describing the header lines, consists of the
instructions to be executed for each of those lines.
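For readers who want to try the header test outside Vim, here is a minimal
Python sketch of it. It is only an illustration, not part of the Ex command:
it assumes that Vim's \h corresponds to the character class [A-Za-z_], and
the names HEADER_RE and is_header as well as the sample lines are made up
for this example.

```python
import re

# Assumed Python counterpart of the Vim pattern ^\s*\h:
# optional leading whitespace, then a Latin letter or an underscore.
HEADER_RE = re.compile(r"^\s*[A-Za-z_]")


def is_header(line):
    """Return True if the line looks like a header line (hypothetical helper)."""
    return HEADER_RE.match(line) is not None


sample = [
    "SectionName1 and TitleName1",
    "1111 - The SubSectionName A",
    "  SectionName2 and TitleName2",
]
flags = [is_header(line) for line in sample]
```

Listing lines start with a digit, so they fail the test, while header lines
pass it even when they are indented.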
The actions to be performed on the headers can be divided into three steps.
Delete the current header line, at the same time extracting section
and title names from it.
First, remove the current line, saving it into the unnamed register,
using the :delete command. Then, update the contents of that register
(referred to as @"; see :help @r and :help "") to be the result of the
substitution changing the word and, surrounded by whitespace characters,
to a single comma. The actual replacement is carried out by the
substitute() function.
However, the input is not the exact string containing the whole header
line, but its prefix leaving out the last character, which is
a newline symbol. The [:-2] notation is a short form of the [0:-2]
subscript expression that designates the substring from the very first
byte to the second one counting from the end (see :help expr-[:]). This
way, the unnamed register holds the section and the title names
separated by a comma.
Determine the range of dependent subsection lines.
After the first step, the subsection records belonging to the just
parsed header line are located starting from the current line (the one
that followed the header) until the next header line or, if there is no
such line below, the end of the buffer. The numbers of these lines are
stored in the marks i and j, respectively. (See :help mark for
a description of marks.)
The marks are placed using the :k command that sets a specified mark
at the last line of a given range which is the current line, by
default. So, unlike the first line of the considered block, the last
one requires a specific line range to point out its location.
A particular form of range, denoting the next line where a given
pattern matches, is used in this case (see :help :range). The pattern
defining the location of the line to be found is composed in
such a way that it matches a line immediately preceding a header (a
line starting with possible whitespace followed by an alphabetical
character), or the very last line. (See :help pattern for details
about the syntax of Vim regular expressions.)
Transform the delineated subsection lines according to desired format,
prepending section and title names found in the corresponding header
line.
This step consists of two :substitute commands that are run over the
range of lines delimited by the locations labelled by the marks i and j
(see :help [range]).
The first substitution command matches the beginning of a subsection
line (an identifier followed by a hyphen and the word The, all
surrounded by whitespace) and replaces it with the contents of the
unnamed register, holding the section and title names concatenated
with a comma, the matched identifier, and another comma. The second
substitution finalizes the transformation by squeezing all whitespace
characters on the line to gum the subsection name and the following
letter together.
To construct the replacement string in the first :substitute command,
the substitute-with-an-expression feature is used (see
:help sub-replace-\=). The substitution part of the command should start
with \= for Vim to interpret the remaining text not in a regular
way, but as an expression (see :help expression). The result of
that expression’s evaluation becomes the substitution string. Note
the use of the submatch() function in the substitute expression to
retrieve the text of a submatch by its number.
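To make the three steps easier to follow, here is the same transformation
sketched in Python. It is an illustration of the logic only, not the Ex
command itself: the names flatten, HEADER_RE and PREFIX_RE are hypothetical,
and the regular expressions are assumed equivalents of the patterns described
above. Since the loop carries the current header along, the marks from the
second step have no direct counterpart here.

```python
import re

# Header line: optional whitespace, then a Latin letter or an underscore.
HEADER_RE = re.compile(r"^\s*[A-Za-z_]")
# Start of a listing: identifier, hyphen, and the word "The".
PREFIX_RE = re.compile(r"^\s*(\d+)\s+-\s+The\s*")


def flatten(lines):
    """Prepend each header's section and title names to the listings below it."""
    out = []
    header = ""
    for line in lines:
        if not line.strip():
            continue  # skip blank lines (an assumption of this sketch)
        if HEADER_RE.match(line):
            # Step 1: drop the header line, keeping "Section,Title".
            # The lines carry no trailing newline here, so nothing like
            # Vim's [:-2] is needed.
            header = re.sub(r"\s+and\s+", ",", line.strip(), count=1)
            continue
        # Step 3, first substitution: replace the "id - The " prefix with
        # "Section,Title,id,"; m.group(1) plays the role of submatch(1).
        line = PREFIX_RE.sub(
            lambda m: header + "," + m.group(1) + ",", line, count=1
        )
        # Step 3, second substitution: squeeze out all remaining whitespace.
        out.append(re.sub(r"\s+", "", line))
    return out


text = """SectionName1 and TitleName1
1111 - The SubSectionName A
222 - The SubSectionName B
SectionName2 and TitleName2
444 - The SubSectionName D
"""
result = flatten(text.splitlines())
```

A function is used for the replacement instead of a template string so that
backslashes in a header cannot be misread as escapes, much as \= makes Vim
evaluate an expression instead of reading the replacement literally.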
[1] The command is wrapped for better readability; its one-line
version is listed below for ease of copy-pasting into the Vim command line.
Note that the wrapped command can be used in a Vim script without any change.