The following instance method takes a file path and returns the file’s prefix (the part before the separator):
@separator = "@"
def table_name(path)
  regex = Regexp.new("\/[^\/]+#{@separator}")
  path.match(regex)[0].gsub(/^.|.$/, '').downcase.to_sym
end
table_name "bla/bla/bla/Prefix@invoice.csv"
# => :prefix
So far, this method only works on Unix. To make it work on Windows, I also need to capture the backslash (\). Unfortunately, that’s when I got stuck:
@separator = "@"
def table_name(path)
  regex = Regexp.new("(\/|\\)[^\/\\]+#{@separator}")
  path.match(regex)[0].gsub(/^.|.$/, '').downcase.to_sym
end
table_name("bla/bla/bla/Prefix@invoice.csv")
# RegexpError: premature end of char-class: /(\/|\)[^\/\]+@/
# Target result:
table_name("bla/bla/bla/Prefix@invoice.csv")
# => :prefix
table_name("bla\bla\bla\Prefix@invoice.csv")
# => :prefix
I suspect Ruby’s string interpolation and escaping are what confuse me here.
How could I change the Regex to make it work on both Unix and Windows?
I don’t actually know what bla/bla/bla/Prefix@invoice.csv refers to; is bla/bla/bla all directories, and the filename Prefix@invoice.csv?
With the assumption that I’ve correctly understood your filenames, I suggest using File.split(). Not only is it platform-agnostic, it is more legible too.
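
A minimal sketch of the File.split approach, assuming the prefix is everything in the filename before the first separator. The separator parameter and its default are assumptions added for illustration; the question keeps the separator in @separator instead.

```ruby
# Hypothetical rewrite of table_name built on File.split.
# The separator parameter (default "@") is an assumption for illustration.
def table_name(path, separator = "@")
  # File.split returns [directory, filename]
  _dir, filename = File.split(path)
  # Keep only the part of the filename before the first separator
  filename.split(separator, 2).first.downcase.to_sym
end

table_name("bla/bla/bla/Prefix@invoice.csv")
# => :prefix
```

One caveat: File.split follows the separators of the platform Ruby is running on. As far as I know, a backslash is only treated as a path separator where File::ALT_SEPARATOR is set (that is, on Windows), so a Windows-style path is not split when this runs on Unix.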
Update
You piqued my curiosity:
You’re right, the \ must be double-escaped for it to work in a regular expression: once to get past the interpreter, again to get past the regex engine. (Definitely feels awkward.) The regex is:
The string is:
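
A minimal sketch of that double escaping, adapted from the regex in the question rather than taken from this answer. The separator parameter, the capture group, and the single-quoted Windows path are illustrative assumptions.

```ruby
# Sketch of the double escaping; the separator parameter is an assumption.
def table_name(path, separator = "@")
  # "\\\\" in a double-quoted string is two backslash characters, which the
  # regex engine then reads as one literal backslash.
  regex = Regexp.new("[/\\\\]([^/\\\\]+)#{Regexp.escape(separator)}")
  match = path.match(regex)
  # Return nil instead of raising when the path does not match
  match && match[1].downcase.to_sym
end

table_name("bla/bla/bla/Prefix@invoice.csv")   # => :prefix
# Single quotes keep each backslash literal; in double quotes this path
# would have to be written "bla\\bla\\bla\\Prefix@invoice.csv".
table_name('bla\bla\bla\Prefix@invoice.csv')   # => :prefix
```

The capture group replaces the gsub(/^.|.$/, '') trimming from the question, so the leading path separator and the trailing @ never end up in the result.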
The regex, which might be too brittle for real use (how would it handle a path without / or \ path separators, or a pathname without @ or with too many @?), looks for any number of characters, a single path separator, any amount of non-@, an @, then any amount of any characters. I’m assuming that the first .+ will greedily consume as many characters as possible to make the match as far to the right as possible. But depending upon malformed input data, it might do the very wrong thing.
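
The pattern described above can be sketched as follows. This is a reconstruction from the prose, not necessarily the exact regex from the answer; the names GREEDY_PREFIX and greedy_table_name, the capture group, and the nil result on a failed match are assumptions.

```ruby
# Hypothetical pattern following the description: any characters, one path
# separator, a run of non-@ characters (captured), an @, then any characters.
# In a regex literal a single level of escaping (\\) is enough.
GREEDY_PREFIX = %r{.+[/\\]([^@]+)@.+}

def greedy_table_name(path)
  # String#[] with a regex and a group index returns the capture or nil
  prefix = path[GREEDY_PREFIX, 1]
  prefix && prefix.downcase.to_sym
end

greedy_table_name("bla/bla/bla/Prefix@invoice.csv")  # => :prefix
greedy_table_name('bla\bla\bla\Prefix@invoice.csv')  # => :prefix
greedy_table_name("Prefix@invoice.csv")              # => nil (no path separator)
greedy_table_name("bla/Prefix@2024@invoice.csv")     # => :prefix
greedy_table_name("bla/y@z/invoice.csv")             # => :y (a directory name, not a file prefix)
```

The last call illustrates the brittleness: when the filename has no @ but a directory name does, the match backtracks to an earlier separator and returns part of the directory name.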