Using encoding/xml.Decoder I’m attempting to manually parse an XML file loaded from http://www.khronos.org/files/collada_schema_1_4
For test purposes, I’m just iterating over the document printing out whatever token type is encountered:
    func Test (r io.Reader) {
        var t xml.Token
        var err error
        var pa *xml.Attr
        var a xml.Attr
        var co xml.Comment
        var cd xml.CharData
        var se xml.StartElement
        var pi xml.ProcInst
        var ee xml.EndElement
        var is bool
        var xd = xml.NewDecoder(r)
        for i := 0; i < 24; i++ {
            if t, err = xd.Token(); (err == nil) && (t != nil) {
                if a, is = t.(xml.Attr); is { print("ATTR\t"); println(a.Name.Local) }
                if pa, is = t.(*xml.Attr); is { print("*ATTR\t"); println(pa) }
                if co, is = t.(xml.Comment); is { print("COMNT\t"); println(co) }
                if cd, is = t.(xml.CharData); is { print("CDATA\t"); println(cd) }
                if pi, is = t.(xml.ProcInst); is { print("PROCI\t"); println(pi.Target) }
                if se, is = t.(xml.StartElement); is { print("START\t"); println(se.Name.Local) }
                if ee, is = t.(xml.EndElement); is { print("END\t\t"); println(ee.Name.Local) }
            }
        }
    }
Now here’s the output:
PROCI xml
CDATA [1/64]0xf84004e050
START schema
CDATA [2/129]0xf84004d090
COMNT [29/129]0xf84004d090
CDATA [2/129]0xf84004d090
START annotation
CDATA [3/129]0xf84004d090
START documentation
CDATA [641/1039]0xf840061000
END documentation
CDATA [2/1039]0xf840061000
END annotation
CDATA [2/1039]0xf840061000
COMNT [37/1039]0xf840061000
CDATA [2/1039]0xf840061000
START import
END import
CDATA [2/1039]0xf840061000
COMNT [14/1039]0xf840061000
CDATA [2/1039]0xf840061000
START element
CDATA [3/1039]0xf840061000
START annotation
Notice that no ATTR or *ATTR lines are output, even though by the last (24th) line many attributes have been passed, both in the root xs:schema element and in the xs:import and xs:element elements.
This is in Go 1.0.3 64-bit under Windows 7 64-bit. Am I doing something wrong or should I file a Go package bug report?
[Side note: when doing a normal xml.Unmarshal into properly prepared structs, known-named-and-mapped attributes are captured and mapped by the xml package just fine. But I also need to collect “unknown” attributes in the root element (to collect namespace information for this use-case, the use-case being http://github.com/metaleap/go-xsd ), hence my attempts to use Decoder.Token().]
Yes, this behavior is expected. The attributes are parsed, but
not returned as an xml.Token. Attributes simply aren’t Tokens.
See: http://golang.org/pkg/encoding/xml/#Token
The attributes are accessible through the Attr field of
the StartElement token.
See: http://golang.org/pkg/encoding/xml/#StartElement
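For your namespace use case, a minimal sketch of reading the root element's attributes could look like the following. The sample document and the helper name rootAttrs are made up for illustration, and the helper takes a string only to keep the example short; with an io.Reader you would pass it straight to xml.NewDecoder. As far as I know, a declaration such as xmlns:xs comes back as an ordinary attribute with Name.Space set to "xmlns".

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// sample is a made-up stand-in for the real schema document.
const sample = `<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" version="1.4.1">
  <xs:import namespace="http://www.w3.org/XML/1998/namespace"/>
</xs:schema>`

// rootAttrs (hypothetical helper) returns the attributes of the first
// start element found in doc. Attributes never arrive as tokens of
// their own; they hang off the StartElement token.
func rootAttrs(doc string) ([]xml.Attr, error) {
	d := xml.NewDecoder(strings.NewReader(doc))
	for {
		tok, err := d.Token()
		if err != nil {
			// Also reached with io.EOF if doc contains no element.
			return nil, err
		}
		if se, ok := tok.(xml.StartElement); ok {
			return se.Attr, nil
		}
	}
}

func main() {
	attrs, err := rootAttrs(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, a := range attrs {
		fmt.Printf("space=%q local=%q value=%q\n", a.Name.Space, a.Name.Local, a.Value)
	}
}
```

The same loop can keep going past the root element if you also want the attributes of xs:import or xs:element.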
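(( Some general hints:
a) Do not use print or println. They are bootstrapping builtins that write to standard error; use the fmt package instead.
b) The a, ok := t.(SomeType) idiom is called “comma ok”, because the boolean is normally named “ok”, not “is”. Please stick to these conventions.
c) Idiomatic would be something like a type switch, switch t := t.(type) { case xml.StartElement: ... }, instead of your list of “if a, is = t.(xml.Attr) …”
d) All this “var se xml.StartElement” is noise (clutter). Use short variable declarations such as se, ok := t.(xml.StartElement) where you need the value.
This would make your code much more readable. ))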
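To illustrate hints a) to d) together, here is a rough sketch of the token dump written with a type switch and fmt. The names describe and dump, the labels, and the tiny inline document are my own choices for the example, not anything prescribed by the xml package.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// describe returns a one-line description of a token. The type switch
// binds t to the concrete type in each case, so no separate variable
// declarations are needed.
func describe(tok xml.Token) string {
	switch t := tok.(type) {
	case xml.StartElement:
		return fmt.Sprintf("START\t%s (%d attrs)", t.Name.Local, len(t.Attr))
	case xml.EndElement:
		return "END\t" + t.Name.Local
	case xml.CharData:
		return fmt.Sprintf("CDATA\t%q", string(t))
	case xml.Comment:
		return fmt.Sprintf("COMNT\t%q", string(t))
	case xml.ProcInst:
		return "PROCI\t" + t.Target
	case xml.Directive:
		return fmt.Sprintf("DIREC\t%q", string(t))
	}
	return fmt.Sprintf("unknown token %T", tok)
}

// dump describes at most max tokens of doc, stopping early at end of input.
func dump(doc string, max int) ([]string, error) {
	d := xml.NewDecoder(strings.NewReader(doc))
	var lines []string
	for len(lines) < max {
		tok, err := d.Token()
		if err == io.EOF {
			break
		}
		if err != nil {
			return lines, err
		}
		lines = append(lines, describe(tok))
	}
	return lines, nil
}

func main() {
	lines, err := dump(`<a x="1"><!-- hi --><b/>text</a>`, 24)
	if err != nil {
		fmt.Println("error:", err)
	}
	for _, l := range lines {
		fmt.Println(l)
	}
}
```

Note there is no case for xml.Attr: since attributes are never tokens, that branch could never match.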