I am new to sklearn, and to Python in general. Can you help me figure out whether this script is heading toward a solution? Basically, I am using a hue extractor on an ImageSet: load the image set for training, extract features, define a classifier, and then classify.
#load beach for training
iset = ImageSet('/Users/Arenzky/Desktop/img_dirs/supervised/beach/') #load Image database
hue = HueHistogramFeatureExtractor() # define extractor
edge = EdgeHistogramFeatureExtractor()
x = []
y = []
for b in iset:
    x.append(hue.extract(b))
hset = ImageSet('/dir/.../h01/')
hue = HueHistogramFeatureExtractor() # define extractor
edge = EdgeHistogramFeatureExtractor()
for h01 in hset:
    y.append(hue.extract(h01))
dataset = np.array(x)
targets = np.array(y)
print 'Training Machine Learning'
clf = LinearSVC()
clf = clf.fit(x, y)
clf2 = LogisticRegression().fit(x, y)
#predict
…
After fitting clf I get:
ValueError:
X and Y have incompatible shapes. X has 20 samples, but Y has 286.
The error message is pretty explicit: you have 20 samples (rows) in your input dataset and 286 labels, hence the mismatch. Each sample should be labeled exactly once, so y.shape[0] should be equal to x.shape[0]. In your script, y is filled with feature vectors extracted from a second image set rather than with one label per row of x. I don't know how your feature extractors work (you did not include the import lines, but from googling they appear to come from SimpleCV). Please refer to the documentation of that module to understand the nature of their output and how to transform it into something that satisfies sklearn's shape assumptions.
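To make the shape requirement concrete, here is a minimal sketch of one way to lay out the data. It assumes the second image set is meant to be a second class, which is only a guess on my part. The random arrays stand in for the output of hue.extract(), and the 10 features per image, the names beach_features and other_features, and the 0/1 labels are all hypothetical.

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.RandomState(0)

# Hypothetical stand-ins for the extracted histograms: one row per image.
# 20 and 286 match the sample counts reported in the error message;
# the 10 features per image is an arbitrary choice for this sketch.
beach_features = rng.rand(20, 10)
other_features = rng.rand(286, 10) + 1.0  # shifted so the classes differ

# X stacks the feature vectors of BOTH image sets, one row per image.
X = np.vstack([beach_features, other_features])

# y holds exactly one class label per row of X (0 = beach, 1 = other).
y = np.concatenate([
    np.zeros(len(beach_features), dtype=int),
    np.ones(len(other_features), dtype=int),
])

# This is the condition sklearn checks before fitting.
assert X.shape[0] == y.shape[0]

clf = LinearSVC().fit(X, y)
predictions = clf.predict(X[:3])  # one predicted label per input row
```

The point is that the features of every image go into X, and y carries only the class of each row. Whether your real extractor output can be stacked like this depends on what hue.extract() returns, so check that against the SimpleCV documentation first.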