I want to parse a line of text from a Wavefront OBJ file. Currently I am interested in the “v” and “f” line types only.
My algorithm is as follows:
1. check that the line is not nil (otherwise step 2 would fail)
2. drop the comment after “#” and trim spaces
3. drop the prefix “v ” or “f ”
4. split the string into a list of elements, where each element
   - is split into a list if it is a symbol like |34/76/23|
   - is converted from the list: I take one element only, the first by default
   - or is coerced to the given type if it is already an atomic number.
Here is the code:
(defun parse-line (line prefix &key (type 'single-float))
  (declare (optimize (debug 3)))
  (labels ((rfs (what)
             (read-from-string (concatenate 'string "(" what ")")))
           (unpack (str &key (char #\/) (n 0))
             (let ((*readtable* (copy-readtable)))
               (when char ;; we make the given char a delimiter (space)
                 (set-syntax-from-char char #\Space))
               (typecase str
                 ;; string -> list of possibly symbols.
                 ;; all elements are preserved by (map). nil's are dropped
                 (string (delete-if #'null
                                    (map 'list
                                         #'unpack
                                         (rfs str))))
                 ;; symbol -> list of values
                 (symbol (unpack (rfs (symbol-name str))))
                 ;; list -> value (only the requested one)
                 (list (unpack (nth n str)))
                 ;; value -> just coerce to type
                 (number (coerce str type))))))
    (and line
         (setf line (string-trim '(#\Space #\Tab)
                                 (subseq line 0 (position #\# line))))
         (< (length prefix) (length line))
         (string= line prefix :end1 (length prefix) :end2 (length prefix))
         (setf line (subseq line (length prefix)))
         (let ((value (unpack line :char nil)))
           (case (length value)
             (3 value)
             (4 (values (subseq value 0 3) ;; split quad 0-1-2-3 on tri 0-1-2 + tri 0-2-3
                        (list (nth 0 value)
                              (nth 2 value)
                              (nth 3 value)))))))))
Step four (the local function “unpack”) is recursive: it is a single function that can call itself from three places.
Anyway, this solution seems clunky.
My question is: how should one solve this task with shorter and clearer code?
I would approach this in a more structured manner.
You want to parse an OBJ file into some sort of data structure.
You need to think about how the data structure returned should look. For now, let us return a list of two lists, one of the vertices, one of the faces. The parser will go through each line, determine whether it is either a vertex or a face, and then collect it into the appropriate list:
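A minimal sketch of such a loop might look like the following. It assumes Quicklisp is available to load cl-ppcre. The name parse-obj-stream is introduced here only for illustration, and parse-vertex and parse-face are placeholders that simply return the raw line until real parsers are written.

```lisp
;; Assumption: Quicklisp is installed; it is used only to load cl-ppcre.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (ql:quickload :cl-ppcre :silent t))

;; Placeholder parsers (hypothetical): they return the line unchanged.
(defun parse-vertex (line) line)
(defun parse-face (line) line)

(defun parse-obj-stream (stream)
  "Read STREAM line by line and return (list vertices faces)."
  (loop for line = (read-line stream nil)   ; NIL at end of file
        while line
        when (cl-ppcre:scan "^v " line)     ; vertex line, but not vn/vt
          collect (parse-vertex line) into vertices
        when (cl-ppcre:scan "^f " line)     ; face line
          collect (parse-face line) into faces
        finally (return (list vertices faces))))

(defun parse-obj-file (filespec)
  "Open FILESPEC and parse it with PARSE-OBJ-STREAM."
  (with-open-file (in filespec :direction :input)
    (parse-obj-stream in)))
```

Splitting the stream handling from the file handling is only a convenience: it lets the loop be tried on a string with with-input-from-string before pointing it at a real file.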
I used the cl-ppcre library here, but you could also use mismatch or search. You will then need to define parse-vertex and parse-face, for which cl-ppcre:split should come in quite handy.
It would perhaps also be useful to define classes for vertices and faces.
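For the mismatch alternative mentioned above, a regex-free prefix test could be sketched like this. It uses only standard Common Lisp; starts-with-p is a hypothetical helper name.

```lisp
(defun starts-with-p (prefix line)
  "Return true if LINE begins with PREFIX."
  (let ((diff (mismatch prefix line)))
    ;; MISMATCH returns NIL when both sequences are equal.  Otherwise it
    ;; returns the index in PREFIX where they first differ; when that
    ;; index equals the length of PREFIX, all of PREFIX was matched.
    (or (null diff)
        (= diff (length prefix)))))
```

With this, (starts-with-p "v " line) can stand in for a regex such as "^v " when dispatching on the line type.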
Update: This is how I would approach vertices:
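A sketch along these lines, assuming Quicklisp is available to load cl-ppcre and parse-number. The vertex class and its slot names are illustrative, and the line is assumed to be free of comments and leading whitespace. The optional w coordinate defaults to 1.0, as in the OBJ format.

```lisp
;; Assumption: Quicklisp is installed; it loads the two libraries used below.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (ql:quickload '(:cl-ppcre :parse-number) :silent t))

;; Illustrative class; the slot and reader names are not prescribed anywhere.
(defclass vertex ()
  ((x :initarg :x :reader vertex-x)
   (y :initarg :y :reader vertex-y)
   (z :initarg :z :reader vertex-z)
   (w :initarg :w :reader vertex-w)))

(defun parse-vertex (line)
  "Parse a line such as \"v 1.0 2.5 -3\" into a VERTEX instance."
  (destructuring-bind (label x y z &optional w)
      (cl-ppcre:split "\\s+" line)          ; split on runs of whitespace
    (declare (ignore label))                ; the leading \"v\" is not needed
    (make-instance 'vertex
                   :x (parse-number:parse-number x)
                   :y (parse-number:parse-number y)
                   :z (parse-number:parse-number z)
                   ;; w is optional in OBJ and defaults to 1.0
                   :w (if w (parse-number:parse-number w) 1.0))))
```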
Parse-number is from the parse-number library. It is better than using read.
Update 2: (Sorry for making this a run-on story; I have to interlace some work.) A face consists of a list of face-points.
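One way to sketch this, again assuming Quicklisp is available to load cl-ppcre. The face-point class, its slot names, and the helper parse-index are illustrative; a face is returned simply as a list of face-points, and the line is assumed to be free of comments and leading whitespace.

```lisp
;; Assumption: Quicklisp is installed; it is used only to load cl-ppcre.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (ql:quickload :cl-ppcre :silent t))

;; Illustrative class: one corner of a face.
(defclass face-point ()
  ((vertex-index  :initarg :vertex-index  :reader vertex-index)
   (texture-index :initarg :texture-index :reader texture-index)
   (normal-index  :initarg :normal-index  :reader normal-index)))

(defun parse-index (string)
  "Return STRING as an integer, or NIL if it is missing or empty."
  (when (and string (plusp (length string)))
    (parse-integer string)))

(defun parse-face-point (string)
  "Parse one face-point such as 34/76/23, 34//23, 34/76 or 34."
  (destructuring-bind (v &optional vt vn)
      (cl-ppcre:split "/" string)           ; "34//23" gives ("34" "" "23")
    (make-instance 'face-point
                   :vertex-index (parse-integer v)
                   :texture-index (parse-index vt)
                   :normal-index (parse-index vn))))

(defun parse-face (line)
  "Parse a line such as \"f 1/2/3 4//6 7\" into a list of face-points."
  (mapcar #'parse-face-point
          (rest (cl-ppcre:split "\\s+" line))))  ; REST drops the \"f\" label
```

Keeping all the points means a quad stays a four-element list; splitting it into triangles can then be a separate step on the parsed data.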
Remove-comment simply throws away everything after the first #:
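A minimal sketch of such a function, using only standard Common Lisp:

```lisp
(defun remove-comment (line)
  "Return LINE up to, but not including, the first # character."
  ;; POSITION returns NIL when there is no #, and SUBSEQ treats a NIL
  ;; end as the end of the string, so such lines come back whole.
  (subseq line 0 (position #\# line)))
```

Whitespace in front of the # is left in place, so the result may still need trimming with string-trim before further parsing.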