I need to define a bunch of vector sequences, each of which is a series of L, D, R, U (for left, down, right, up) or x for a break. There are optional parts and either/or parts. I have been using my own invented notation for writing them down, but I want to document this for other people, potentially non-programmers, to read.
I now want to use a subset of regex (I don’t plan on using wildcards or infinite repetition, for example) to define each vector sequence, and a script to produce all possible matching strings:
/LDR/ produces ['LDR']
/LDU?R/ produces ['LDR','LDUR']
/R(LD|DR)U/ produces ['RLDU','RDRU']
/DxR[DL]U?RDRU?/ produces ['DxRDRDR','DxRDRDRU','DxRDURDR','DxRDURDRU','DxRLRDR','DxRLRDRU','DxRLURDR','DxRLURDRU']
Is there an existing library I can use to generate all matches?
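To make the expected output concrete, here is a minimal sketch (not a library, and not a regex parser) that treats the last pattern as a hand-written list of slots, each holding its allowed alternatives, and takes the Cartesian product. The names product and slots are made up for illustration, and an optional step is written as a slot containing the empty string.

```javascript
// Combine every prefix built so far with every option of the next slot.
function product(slots) {
  return slots.reduce(
    (acc, options) => acc.flatMap(prefix => options.map(o => prefix + o)),
    ['']
  );
}

// /DxR[DL]U?RDRU?/ written out by hand: [DL] is a two-way choice,
// and each U? is a choice between nothing and 'U'.
const slots = [
  ['D'], ['x'], ['R'],
  ['D', 'L'],
  ['', 'U'],
  ['R'], ['D'], ['R'],
  ['', 'U'],
];

const matches = product(slots);
console.log(matches.length); // 8 (2 * 2 * 2 choices)
```

Each slot with two alternatives doubles the number of results, which is why this pattern yields eight strings.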
EDIT
I realised I will only need "or" statements, since an optional part can be written as "thing or nothing". For example, "maybe a, or b, or neither" could be written as (a|b|). Is there another language I could use to define what I am trying to do?
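As an illustration of how small that "or-only" notation is, here is a rough sketch of a recursive-descent expander for it. It assumes the only special characters are (, ) and |, that an empty alternative means "nothing", and that every other character is a literal. The function name expand is hypothetical; this is not the code from the linked answer.

```javascript
// Expand an "or-only" pattern into every string it describes.
function expand(pattern) {
  let pos = 0;

  // alternation := sequence ('|' sequence)*
  function parseAlternation() {
    let results = parseSequence();
    while (pattern[pos] === '|') {
      pos++; // skip '|'
      results = results.concat(parseSequence());
    }
    return results;
  }

  // sequence := (literal | '(' alternation ')')*   (may be empty)
  function parseSequence() {
    let results = [''];
    while (pos < pattern.length && pattern[pos] !== '|' && pattern[pos] !== ')') {
      let choices;
      if (pattern[pos] === '(') {
        pos++; // skip '('
        choices = parseAlternation();
        if (pattern[pos] !== ')') throw new Error('Expected ) at index ' + pos);
        pos++; // skip ')'
      } else {
        choices = [pattern[pos]];
        pos++;
      }
      // Append every choice to every string built so far.
      const next = [];
      for (const prefix of results) {
        for (const choice of choices) next.push(prefix + choice);
      }
      results = next;
    }
    return results;
  }

  const all = parseAlternation();
  if (pos !== pattern.length) {
    throw new Error('Unexpected ' + pattern[pos] + ' at index ' + pos);
  }
  return all;
}

console.log(expand('R(LD|DR)U')); // [ 'RLDU', 'RDRU' ]
console.log(expand('LD(U|)R'));   // [ 'LDUR', 'LDR' ]
```

Results come out in the order the alternatives are written, so (U|) lists the U variant first. Duplicates are not removed; a pattern such as (a|a) would return the same string twice.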
By translating the Java code from the link provided by @Dukeling into JavaScript, I think I have solved my problem…
I only changed one line, to allow more than two consecutive brackets to be handled, and left the original translation in the comments.
I also added a function to return an array of results, instead of printing them…
The last part allows for empty strings too, so (ab|c|) means ab or c or nothing, and adds a convenience shortcut so that ab?c is translated into a(b|)c.
The last part is a little hack-ish, as it relies on µ not being in the string (not an issue for me). It also works around one bug, where a closing ) at the end of the input string was causing incorrect output, by inserting a µ at the end of each string and then stripping it from the results. I would be happy for someone to suggest a better way to handle these issues, so it can work as a more general solution.
This code as it stands does everything I need. Thanks for all your help!
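For anyone wanting the shortcut without a sentinel character, one possible approach is to rewrite the pattern before expanding it. The sketch below is an assumption about how that could look, not the code described above: a hypothetical desugar function turns x? into (x|), (group)? into ((group)|), and a character class such as [DL] into (D|L), using a stack to track bracket nesting.

```javascript
// Rewrite '?' and '[...]' into plain "or" groups.
function desugar(pattern) {
  // Each stack frame collects the output pieces (atoms) for one nesting level.
  const stack = [[]];
  for (let i = 0; i < pattern.length; i++) {
    const ch = pattern[i];
    const top = stack[stack.length - 1];
    if (ch === '(') {
      stack.push([]);
    } else if (ch === ')') {
      if (stack.length === 1) throw new Error('Unmatched ) at index ' + i);
      const inner = stack.pop().join('');
      // The whole group becomes a single atom on the enclosing level.
      stack[stack.length - 1].push('(' + inner + ')');
    } else if (ch === '[') {
      const end = pattern.indexOf(']', i);
      if (end === -1) throw new Error('Unmatched [ at index ' + i);
      // [DL] becomes (D|L); ranges such as A-Z are not supported here.
      top.push('(' + pattern.slice(i + 1, end).split('').join('|') + ')');
      i = end; // skip past the closing ]
    } else if (ch === '?') {
      const atom = top.pop();
      if (atom === undefined || atom === '|') {
        throw new Error('Nothing to make optional at index ' + i);
      }
      top.push('(' + atom + '|)');
    } else {
      top.push(ch); // literal character or '|'
    }
  }
  if (stack.length !== 1) throw new Error('Unmatched (');
  return stack[0].join('');
}

console.log(desugar('ab?c'));           // a(b|)c
console.log(desugar('DxR[DL]U?RDRU?')); // DxR(D|L)(U|)RDR(U|)
```

Because the rewrite only depends on bracket nesting, a ) at the very end of the input needs no special handling, and no character has to be reserved.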