What’s the best way to handle missing feature attribute values with Weka’s C4.5 (J48) decision tree? The problem of missing values occurs during both training and classification.
- If values are missing from training instances, am I correct in assuming that I place a ‘?’ value for the feature?
- Suppose that I am able to successfully build the decision tree and then create my own tree code in C++ or Java from Weka’s tree structure. During classification time, if I am trying to classify a new instance, what value do I put for features that have missing values? How would I descend the tree past a decision node for which I have an unknown value?
Would using Naive Bayes be better for handling missing values? I would just assign a very small non-zero probability for them, right?
From Pedro Domingos’ ML course at the University of Washington:
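Here are the three approaches Pedro suggests when an example is missing its value for attribute A (where node n tests A):
- Assign the most common value of A among the other examples sorted to node n.
- Assign the most common value of A among the other examples with the same target value.
- Assign probability p_i to each possible value v_i of A, then assign fraction p_i of the example to each descendant in the tree.
The slides and video are now viewable here.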
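For the question about descending past a node with an unknown value, the fractional approach (sending a fraction p_i of the example down each branch) can be sketched roughly as follows. This is only an illustration under stated assumptions: the nested-dict tree layout, the `classify` function, and the toy `TREE` are hypothetical and are not Weka's export format, and each branch is assumed to carry the fraction p_i of training examples that followed it.

```python
from collections import defaultdict

# Hypothetical tree encoding (not Weka's format):
#   leaf:     {"dist": {class_label: probability}}
#   internal: {"attr": name, "branches": {value: (p_i, subtree)}}
# where p_i is assumed to be the fraction of training examples at that
# node that took the branch for value v_i.
TREE = {
    "attr": "outlook",
    "branches": {
        "sunny": (0.5, {"dist": {"no": 1.0}}),
        "rainy": (0.5, {
            "attr": "windy",
            "branches": {
                "true": (0.25, {"dist": {"no": 1.0}}),
                "false": (0.75, {"dist": {"yes": 1.0}}),
            },
        }),
    },
}


def classify(node, instance):
    """Return a class -> weight distribution for one instance.

    `instance` maps attribute names to values; an attribute that is
    absent or set to None is treated as missing.
    """
    if "dist" in node:  # leaf: return its class distribution
        return dict(node["dist"])

    branches = node["branches"]
    value = instance.get(node["attr"])

    if value in branches:  # known value: follow the single matching branch
        _, child = branches[value]
        return classify(child, instance)

    # Missing value: descend every branch and weight each result by p_i.
    combined = defaultdict(float)
    for p_i, child in branches.values():
        for label, weight in classify(child, instance).items():
            combined[label] += p_i * weight
    return dict(combined)


if __name__ == "__main__":
    print(classify(TREE, {"outlook": "rainy", "windy": "true"}))
    print(classify(TREE, {"windy": "false"}))  # outlook is missing
    print(classify(TREE, {}))                  # both attributes missing
```

The predicted class is then the label with the largest combined weight. The same recursion should carry over to hand-written C++ or Java tree code, provided the branch fractions are saved alongside the tree structure.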