Scikit-learn has fairly user-friendly python modules for machine learning. I am trying to train

Question

Asked: June 13, 2026 (2026-06-13T05:07:57+00:00)

Scikit-learn has fairly user-friendly Python modules for machine learning.

I am trying to train an SVM tagger for Natural Language Processing (NLP) where my labels and input data are words and annotations. For example, in part-of-speech tagging, rather than using double/integer data as input tuples such as [[1,2], [2,0]], my tuples will look like this: [['word', 'NOUN'], ['young', 'adjective']].

Can anyone give an example of how I can use the SVM with string tuples? The tutorial/documentation given here is for integer/double inputs: http://scikit-learn.org/stable/modules/svm.html

1 Answer

Editorial Team · Answer 1 · 2026-06-13T05:08:01+00:00

Most machine learning algorithms process input samples that are vectors of floats, such that a small (often Euclidean) distance between a pair of samples means that the two samples are similar in a way that is relevant to the problem at hand.
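
To make the distance argument concrete, here is a minimal sketch in plain Python. The four-word vocabulary and the helper names (`word_ids`, `as_id_vector`, `as_one_hot`) are made up for illustration. It compares two ways of turning a word into floats: a single arbitrary integer id, and a one-hot vector.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length float vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical vocabulary: the integer ids are arbitrary and carry no meaning.
word_ids = {"cat": 0, "dog": 1, "run": 2, "zebra": 3}


def as_id_vector(word):
    """Encode a word as a single float holding its arbitrary id."""
    return [float(word_ids[word])]


def as_one_hot(word):
    """Encode a word as a vector with a 1.0 at the word's index, 0.0 elsewhere."""
    vec = [0.0] * len(word_ids)
    vec[word_ids[word]] = 1.0
    return vec


# With raw ids, "cat" looks closer to "dog" than to "zebra" purely by accident
# of how the ids were assigned.
id_cat_dog = euclidean(as_id_vector("cat"), as_id_vector("dog"))
id_cat_zebra = euclidean(as_id_vector("cat"), as_id_vector("zebra"))

# With one-hot vectors, every pair of distinct words is equally far apart,
# so no spurious ordering is imposed on the categories.
oh_cat_dog = euclidean(as_one_hot("cat"), as_one_hot("dog"))
oh_cat_zebra = euclidean(as_one_hot("cat"), as_one_hot("zebra"))

print(id_cat_dog, id_cat_zebra)
print(oh_cat_dog, oh_cat_zebra)
```

The one-hot form is what a dictionary-based vectorizer produces for string-valued features, which is why it is a reasonable default for categorical inputs such as words or tags.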

It is the responsibility of the machine learning practitioner to find a good set of float features to encode. This encoding is domain-specific, hence there is no general way to build that representation out of the raw data that would work across all application domains (various NLP tasks, computer vision, transaction log analysis…). This part of the machine learning modeling work is called feature extraction. When it involves a lot of manual work, it is often referred to as feature engineering.

Now for your specific problem, the POS tags of a window of words around a word of interest in a sentence (e.g. for sequence tagging such as named entity detection) can be encoded appropriately using the DictVectorizer feature extraction helper class of scikit-learn.
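
As a rough sketch of how this could look for the POS tagging case in the question: each token is described by a dict of string features taken from a small window around it, `DictVectorizer` one-hot encodes those dicts into a sparse float matrix, and a linear SVM is trained on the result. The toy sentences, the `window_features` helper, and the choice of features are assumptions made for illustration, and `LinearSVC` is just one of the SVM classes in scikit-learn that could be used here.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.svm import LinearSVC


def window_features(words, i):
    """Hypothetical feature function: describe token i by itself and its neighbours."""
    return {
        "word": words[i].lower(),
        "prev_word": words[i - 1].lower() if i > 0 else "<START>",
        "next_word": words[i + 1].lower() if i < len(words) - 1 else "<END>",
        "suffix2": words[i][-2:].lower(),
    }


# Toy, made-up training data: lists of (word, tag) pairs.
tagged_sentences = [
    [("the", "DET"), ("young", "ADJ"), ("dog", "NOUN"), ("runs", "VERB")],
    [("a", "DET"), ("small", "ADJ"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("the", "DET"), ("old", "ADJ"), ("bird", "NOUN"), ("sings", "VERB")],
]

# Build one feature dict and one label per token.
X_dicts, y = [], []
for sentence in tagged_sentences:
    words = [word for word, _ in sentence]
    for i, (_, tag) in enumerate(sentence):
        X_dicts.append(window_features(words, i))
        y.append(tag)

# String-valued features become one-hot columns in a sparse float matrix.
vectorizer = DictVectorizer(sparse=True)
X = vectorizer.fit_transform(X_dicts)

# String labels can be passed to the classifier directly.
clf = LinearSVC()
clf.fit(X, y)

# Tag a new sentence: reuse the fitted vectorizer with transform(), not fit_transform().
test_words = ["the", "young", "cat", "sings"]
X_test = vectorizer.transform(
    [window_features(test_words, i) for i in range(len(test_words))]
)
predicted = clf.predict(X_test)
print(list(zip(test_words, predicted)))
```

Feature values that were never seen during fitting are silently ignored by `transform`, so unknown words simply contribute no active column. With this little data the predictions are not meaningful; the point is only the shape of the pipeline.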
