I’ve started using OpenCV to detect features. A sequence like:
//-- Step 1: Detect keypoints using the SURF detector
cv::SurfFeatureDetector detector( 40 );
std::vector<cv::KeyPoint> keypoints_object;
detector.detect( img_object, keypoints_object );
//-- Step 2: Calculate descriptors (feature vectors)
cv::SurfDescriptorExtractor extractor;
cv::Mat descriptors_object, descriptors_scene;
extractor.compute( img_object, keypoints_object, descriptors_object );
//-- Step 3: Matching descriptor vectors using FLANN matcher
will extract features from the image, which can then be matched against features extracted from other images. What does the term ‘training image’ mean in this context?
Do I have to rotate and/or scale the image multiple times?
If so, can the features be merged into a single descriptor?
‘Training image’ in the context of feature extraction makes me think of classification. There you have a set of training images from different classes, from which you extract features. You then use these features to learn some kind of classifier. In other words, the images are called training images because they are used to train the classifier.
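To make the idea of training images concrete, here is a minimal sketch in plain C++ with no OpenCV dependency. It assumes each training image has already been reduced to one fixed-length feature vector, and it uses a simple nearest-neighbour rule as the classifier. The names `TrainingSample` and `NearestNeighbourClassifier` are hypothetical and only serve as illustration; a real system would likely use a stronger classifier.

```cpp
#include <cstddef>
#include <limits>
#include <string>
#include <vector>

// Hypothetical type: one training image, reduced to a feature vector plus its class label.
struct TrainingSample {
    std::vector<float> features;
    std::string label;
};

// Hypothetical 1-nearest-neighbour classifier: "training" just stores the samples.
class NearestNeighbourClassifier {
public:
    void train(const std::vector<TrainingSample>& samples) { samples_ = samples; }

    // Returns the label of the closest stored sample (empty string if none fits).
    std::string predict(const std::vector<float>& query) const {
        std::string bestLabel;
        float bestDist = std::numeric_limits<float>::max();
        for (const TrainingSample& s : samples_) {
            if (s.features.size() != query.size()) continue;  // skip mismatched lengths
            float dist = 0.0f;  // squared Euclidean distance
            for (std::size_t i = 0; i < query.size(); ++i) {
                const float d = s.features[i] - query[i];
                dist += d * d;
            }
            if (dist < bestDist) {
                bestDist = dist;
                bestLabel = s.label;
            }
        }
        return bestLabel;
    }

private:
    std::vector<TrainingSample> samples_;
};
```

The training images only matter through the feature vectors and labels passed to `train`; a new image is classified by extracting the same kind of feature vector and calling `predict`.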
SURF features are scale- and rotation-invariant, so there is no need to scale or rotate the image yourself.
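To get a single feature vector per image, you could use a bag-of-words model: the local descriptors are assigned to a vocabulary of "visual words", and the image is described by a histogram of how often each word occurs.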
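As a rough sketch of the bag-of-words step, the following plain C++ code turns a set of local descriptors into one normalized histogram. It assumes the vocabulary (the cluster centres) has already been learned, for example with k-means over descriptors from many images. The function `bagOfWordsHistogram` and the `Descriptor` alias are hypothetical names for illustration; OpenCV ships its own classes for this (`cv::BOWKMeansTrainer` and `cv::BOWImgDescriptorExtractor`), which are not used here.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// One local descriptor, e.g. a 64-dimensional SURF vector.
using Descriptor = std::vector<float>;

// Squared Euclidean distance between two descriptors of equal length.
float squaredDistance(const Descriptor& a, const Descriptor& b) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Hypothetical helper: builds one normalized visual-word histogram per image.
// `vocabulary` holds the visual words (cluster centres), assumed learned beforehand.
std::vector<float> bagOfWordsHistogram(const std::vector<Descriptor>& descriptors,
                                       const std::vector<Descriptor>& vocabulary) {
    std::vector<float> histogram(vocabulary.size(), 0.0f);
    if (descriptors.empty() || vocabulary.empty()) return histogram;

    for (const Descriptor& d : descriptors) {
        // Assign the descriptor to its nearest visual word.
        std::size_t best = 0;
        float bestDist = std::numeric_limits<float>::max();
        for (std::size_t w = 0; w < vocabulary.size(); ++w) {
            const float dist = squaredDistance(d, vocabulary[w]);
            if (dist < bestDist) {
                bestDist = dist;
                best = w;
            }
        }
        histogram[best] += 1.0f;
    }

    // Normalize so images with different keypoint counts stay comparable.
    for (float& bin : histogram) bin /= static_cast<float>(descriptors.size());
    return histogram;
}
```

The resulting histogram always has as many entries as the vocabulary has words, regardless of how many keypoints were detected, which is what makes it usable as a single per-image feature vector.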