I need to iteratively extend a Weka ARFF file with SparseInstance objects. Each time a new SparseInstance is added, the header might change, since the new instance might introduce additional attributes. I thought the mergeInstances method would solve my problem, but it does not: it requires that the two datasets have no shared attributes.
If this is not absolutely clear, look at the following example:
Dataset1
a b c
1 2 3
4 5 6
Dataset2
c d
7 8
Merged result:
a b c d
1 2 3 ?
4 5 6 ?
? ? 7 8
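To pin down the merge semantics shown above, here is a minimal sketch in plain Java. It does not use Weka at all; the class HeaderMerge and its methods are hypothetical names made up for illustration, and it assumes numeric values only, with null standing in for the missing value "?".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Weka: merges two small tables by attribute name.
final class HeaderMerge {

    /** Returns the union of both headers, keeping first-seen order. */
    static List<String> mergeHeaders(List<String> first, List<String> second) {
        List<String> merged = new ArrayList<>(first);
        for (String name : second) {
            if (!merged.contains(name)) {
                merged.add(name); // new attribute goes to the end of the header
            }
        }
        return merged;
    }

    /** Re-aligns one row to the merged header; attributes the row lacks stay null ("?"). */
    static Double[] align(List<String> header, double[] row, List<String> merged) {
        Double[] out = new Double[merged.size()]; // every slot starts as null
        for (int i = 0; i < header.size(); i++) {
            out[merged.indexOf(header.get(i))] = row[i];
        }
        return out;
    }
}
```

With the headers (a, b, c) and (c, d), mergeHeaders yields (a, b, c, d), and aligning the row (7, 8) from Dataset2 gives (null, null, 7, 8), which corresponds to the last line of the merged result.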
The only solution I see at the moment is parsing the ARFF file by hand and merging it using string processing. Does anyone know of a better solution?
OK, I found the solution myself. The central part of the solution is the method Instances#insertAttributeAt, which inserts a new attribute as the last one if the second parameter is model.numAttributes(). Here is some example code for numerical attributes. It is easy to adapt to other types of attributes as well: