I am looking for a way to split a string in bash over a delimiter string, and place the parts in an array.
Simple case:
#!/bin/bash
b="aaaaa/bbbbb/ddd/ffffff"
echo "simple string: $b"
IFS='/' b_split=($b)
echo ;
echo "split"
for i in ${b_split[@]}
do
echo "------ new part ------"
echo "$i"
done
Gives output
simple string: aaaaa/bbbbb/ddd/ffffff
split
------ new part ------
aaaaa
------ new part ------
bbbbb
------ new part ------
ddd
------ new part ------
ffffff
More complex case:
#!/bin/bash
c=$(echo "AA=A"; echo "B=BB"; echo "======="; echo "C==CC"; echo "DD=D"; echo "======="; echo "EEE"; echo "FF";)
echo "more complex string"
echo "$c";
echo ;
echo "split";
IFS='=======' c_split=($c) ;# <---- LINE TO BE CHANGED
for i in ${c_split[@]}
do
echo "------ new part ------"
echo "$i"
done
Gives output:
more complex string
AA=A
B=BB
=======
C==CC
DD=D
=======
EEE
FF
split
------ new part ------
AA
------ new part ------
A
B
------ new part ------
BB
------ new part ------
------ new part ------
------ new part ------
------ new part ------
------ new part ------
------ new part ------
------ new part ------
C
------ new part ------
------ new part ------
CC
DD
------ new part ------
D
------ new part ------
------ new part ------
------ new part ------
------ new part ------
------ new part ------
------ new part ------
------ new part ------
EEE
FF
I would like the second output to be like
------ new part ------
AA=A
B=BB
------ new part ------
C==CC
DD=D
------ new part ------
EEE
FF
I.e. to split the string on a sequence of characters, instead of one. How can I do this?
I am looking for an answer that would only modify this line in the second script:
IFS='=======' c_split=($c) ;# <---- LINE TO BE CHANGED
Introduction
At the bottom of this answer, you will find a function that transforms a string into an array, split on a delimiter string.
IFS disambiguation
IFS means Input Field Separators, as a list of characters that could each be used as a separator.
By default, this is set to space, tab and newline ( \t\n), meaning that any number (greater than zero) of spaces, tabs and/or newlines counts as one separator.
So with the string $' blah foo=bar \nbaz ', leading and trailing separators are ignored and the string contains only 3 parts: blah, foo=bar and baz.
But except for whitespace, IFS considers each separator on its own: two adjacent non-whitespace separators delimit an empty field.
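A minimal sketch of the two behaviours described above. The variable names (s, t, parts, fields) and the comma separator are only chosen for illustration, and the sample strings contain no glob characters, so the unquoted expansions are safe here.

```bash
#!/bin/bash
# Default IFS: any run of spaces, tabs and newlines is one separator.
IFS=$' \t\n'
s=$' blah foo=bar \nbaz '
parts=($s)
echo "default IFS: ${#parts[@]} parts"
printf '[%s]\n' "${parts[@]}"

# Non-whitespace IFS: every comma is a separator of its own,
# so two commas in a row produce an empty field.
t='a,,b'
IFS=,
fields=($t)
IFS=$' \t\n'          # restore the default before going on
echo "IFS=',': ${#fields[@]} fields"
printf '[%s]\n' "${fields[@]}"
```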
Splitting a string using IFS is possible if you know a valid field separator not used in your string, so you could replace your pattern by this character (by using the ${var//<pattern>/<separator>} syntax).
With § as that character, this works only while the string does not contain any §.
You could use another character, like IFS=$'\026';c_split=(${c//=======/$'\026'}) but anyway this may involve further bugs.
You could browse character maps to find one that is not in your string, but I find this solution a little overkill.
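The replace-then-split idea could be sketched as follows. The names sep and oldIFS are assumptions made for this example, and the guard simply refuses to continue if the chosen control character already occurs in the string.

```bash
#!/bin/bash
c=$'AA=A\nB=BB\n=======\nC==CC\nDD=D\n=======\nEEE\nFF'

sep=$'\026'                      # assumed to be absent from $c
case $c in
    *"$sep"*) echo "separator already used in string" >&2; exit 1 ;;
esac

oldIFS=$IFS
IFS=$sep
c_split=(${c//=======/$sep})     # replace the pattern, then split on one character
IFS=$oldIFS

for part in "${c_split[@]}"; do
    echo "------ new part ------"
    echo "$part"
done
```

Each part still carries the newlines that surrounded the ======= line, and the unquoted expansion remains subject to pathname expansion if the string contains glob characters.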
Splitting on spaces (or without modifying IFS)
Under bash, we could use a bashism: a pattern substitution inside the array assignment.
In fact, the ${varname// syntax initiates a substitution (delimited by /) replacing all occurrences of / by a space, before assigning the result to the array b_split.
Of course, this still uses IFS and splits the array on spaces.
This is not the best way, but it could work in specific cases.
You could even drop unwanted spaces before splitting, or swap them for another character…
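A sketch of the three variants mentioned above, assuming the default IFS. The second sample string d and the underscore placeholder are hypothetical choices for the example; the placeholder must not occur in the data.

```bash
#!/bin/bash
# 1. Replace every / by a space, then let word splitting build the array.
b="aaaaa/bbbbb/ddd/ffffff"
b_split=(${b//\// })
printf '[%s]\n' "${b_split[@]}"

# 2. Drop unwanted spaces before splitting.
d="12 34 / 1 3 5 7 / ab"
d_nospace=${d// /}
d_split=(${d_nospace//\// })
printf '[%s]\n' "${d_split[@]}"

# 3. Swap spaces for a placeholder, split, then swap them back.
e=${d// /_}
e_split=(${e//\// })
e_split=("${e_split[@]//_/ }")
printf '[%s]\n' "${e_split[@]}"
```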
Splitting a line on delimiter strings
So you cannot use IFS for your purpose, but bash does have nice features for this: parameter expansions that cut a string at a pattern.
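One way this could look, as a sketch rather than a definitive implementation: loop while the separator is still present, print what precedes it, then drop that part. The counter n is an addition for the example only.

```bash
#!/bin/bash
c=$'AA=A\nB=BB\n=======\nC==CC\nDD=D\n=======\nEEE\nFF'
mySep='======='
n=0

# "${c#*"$mySep"}" differs from $c only while $c still contains the separator.
while [ "$c" != "${c#*"$mySep"}" ]; do
    echo "------ new part ------"
    echo "${c%%"$mySep"*}"       # everything before the first separator
    c=${c#*"$mySep"}             # everything after the first separator
    n=$((n + 1))
done
echo "------ last part ------"
echo "$c"
```

Since the separator here does not include the neighbouring newlines, the printed parts keep them.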
About Leading newline
Leading and trailing newlines are not deleted in the previous samples. For this, you could simply use a separator that includes the surrounding newlines instead of =======.
Or you could rewrite the split loop to keep them out explicitly.
In any case, this matches what the SO question asked for, and its sample 🙂
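A sketch of the first option, assuming the delimiter line is always surrounded by exactly one newline on each side, so that the separator can be written as $'\n=======\n':

```bash
#!/bin/bash
c=$'AA=A\nB=BB\n=======\nC==CC\nDD=D\n=======\nEEE\nFF'
mySep=$'\n=======\n'             # the newlines are now part of the separator

while [ "$c" != "${c#*"$mySep"}" ]; do
    echo "------ new part ------"
    echo "${c%%"$mySep"*}"
    c=${c#*"$mySep"}
done
echo "------ new part ------"
echo "$c"
```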
Finally, creating an array
The same loop can fill an array instead of printing each part.
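A possible version of the array-building loop, given as a sketch. It uses declare -a where the explanations mention export -a, and it strips at most one newline on each side of a part, which is enough for the sample string.

```bash
#!/bin/bash
c=$'AA=A\nB=BB\n=======\nC==CC\nDD=D\n=======\nEEE\nFF'
mySep='======='
declare -a c_split=()

while [ "$c" != "${c#*"$mySep"}" ]; do
    part=${c%%"$mySep"*}         # text before the first separator
    part=${part#$'\n'}           # remove one leading newline
    c_split+=("${part%$'\n'}")   # remove one trailing newline, then store
    c=${c#*"$mySep"}             # continue with the rest of the string
done
part=${c#$'\n'}                  # last part, after the final separator
c_split+=("${part%$'\n'}")

for i in "${c_split[@]}"; do
    echo "------ new part ------"
    echo "$i"
done
```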
Some explanations:
export -a var defines var as an array and marks it for export (bash does not actually pass array contents to child processes, though).
${variablename%string*} and ${variablename%%string*} result in the left part of variablename, up to but without string. One % cuts at the last occurrence of string and %% at the first one. The full variablename is returned if string is not found.
${variablename#*string} does the same the other way round: it returns the last part of variablename, from but without string. One # cuts at the first occurrence and ## at the last one.
Note that in these patterns, the character * is a wildcard meaning any number of any characters.
The command echo "${c%%$'\n'}" would echo variable c without its trailing newline, if there is one.
So a variable containing Hello WorldZorGluBHello youZorGluBI'm happy can be cut at either ZorGluB, from either side.
All this is explained in the bash manpage, under Parameter Expansion.
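To illustrate the four expansions on the ZorGluB sample, here is a short sketch. The variable names v, left_short, left_long, right_short and right_long are invented for the example.

```bash
#!/bin/bash
v="Hello WorldZorGluBHello youZorGluBI'm happy"

left_short=${v%ZorGluB*}     # cut at the last ZorGluB, keep the left side
left_long=${v%%ZorGluB*}     # cut at the first ZorGluB, keep the left side
right_short=${v#*ZorGluB}    # cut at the first ZorGluB, keep the right side
right_long=${v##*ZorGluB}    # cut at the last ZorGluB, keep the right side

echo "$left_short"           # Hello WorldZorGluBHello you
echo "$left_long"            # Hello World
echo "$right_short"          # Hello youZorGluBI'm happy
echo "$right_long"           # I'm happy
```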
Step by step, the splitting loop:
The separator: mySep.
Declaring c_split as an array (export -a also marks it for export).
While variable c contains at least one occurrence of mySep:
Truncate c from the first mySep to the end of the string and assign the result to part.
Remove leading newlines.
Remove trailing newlines and add the result as a new array element to c_split.
Reassign c with the rest of the string, once the left part up to mySep is removed.
Done 😉
After the loop, for the last remaining part:
Remove leading newlines.
Remove trailing newlines and add the result as a new array element to c_split.
Into a function:
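A sketch of how such a function might look. The name ssplit, the argument order (string, array name, delimiter) and the use of a nameref are assumptions of this example, not taken from the text above; namerefs need bash 4.3 or later, and the array name must not be _arr.

```bash
#!/bin/bash
# ssplit STRING [ARRAY_NAME [DELIMITER]]   (hypothetical name and signature)
# Splits STRING on the DELIMITER string and stores the parts in ARRAY_NAME.
ssplit() {
    local str="$1" sep="${3:- }"              # delimiter defaults to one space
    local -n _arr="${2:-splitted_array}"      # nameref to the target array
    _arr=()
    while [ "$str" != "${str#*"$sep"}" ]; do
        _arr+=("${str%%"$sep"*}")             # part before the first delimiter
        str=${str#*"$sep"}                    # keep the remainder
    done
    _arr+=("$str")                            # last part
}

c=$'AA=A\nB=BB\n=======\nC==CC\nDD=D\n=======\nEEE\nFF'
ssplit "$c" c_split $'\n=======\n'
for i in "${c_split[@]}"; do
    echo "------ new part ------"
    echo "$i"
done

ssplit "one two three"                        # defaults: splitted_array, space
echo "${#splitted_array[@]}"
```

Passing the delimiter with its surrounding newlines keeps the function itself free of any newline trimming.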
Usage:
The array name is splitted_array by default and the delimiter is one single space, so only the string itself is required.