Is there any general format for label-inputs in scikit-learn datasets? I see it have

Question

Asked: June 17, 2026 (2026-06-17T21:05:35+00:00)

Is there a general format for labeled inputs in scikit-learn datasets? I see that a dataset keeps the list of output labels in target_names. I want to follow scikit-learn conventions and keep similar label data for input variables (e.g. sex). Is there already a convention for this? Something like this:

>>> data_set.inputs["sex"]
{'male': 1, 'female': 0}

1 Answer

Editorial Team · Answer 1 · 2026-06-17T21:05:36+00:00

There is no convention for storing categorical feature name information. You are free to do as you wish.
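As one illustration of "do as you wish", here is a minimal sketch that attaches the mapping to a sklearn.utils.Bunch, the container the built-in dataset loaders return. The `inputs` key, the toy values, and the `feature_names` layout are assumptions made for this example, not a scikit-learn convention.

```python
import numpy as np
from sklearn.utils import Bunch

# Toy data: column 0 is the encoded "sex" feature, column 1 is age.
data_set = Bunch(
    data=np.array([[1, 34], [0, 28]]),
    target=np.array([1, 0]),
    feature_names=["sex", "age"],
    target_names=["no", "yes"],
    # Hypothetical extra key: per-feature mapping from category to code.
    inputs={"sex": {"male": 1, "female": 0}},
)

# Bunch exposes its keys as attributes, so this matches the access
# pattern from the question.
print(data_set.inputs["sex"])
```

Since Bunch is a dict subclass, any extra key works; nothing in scikit-learn reads `inputs`, so it is purely bookkeeping for your own code.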

Alternatively, you can store the data in its original format and apply DictVectorizer or FeatureHasher (for the input features) and LabelBinarizer (for the target labels) on the fly when you need to build a model from the data.
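A minimal sketch of the on-the-fly approach, assuming the original data is a list of dicts with string categories. The records, labels, and variable names are made up for illustration.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.preprocessing import LabelBinarizer

# Hypothetical records kept in their original, human-readable form.
records = [
    {"sex": "male", "age": 34},
    {"sex": "female", "age": 28},
    {"sex": "female", "age": 45},
]
labels = ["yes", "no", "yes"]

# DictVectorizer one-hot encodes string values and passes numbers through.
vectorizer = DictVectorizer(sparse=False)
X = vectorizer.fit_transform(records)

# LabelBinarizer encodes the class labels; with two classes it yields
# a single 0/1 column, flattened here to a 1-D vector.
binarizer = LabelBinarizer()
y = binarizer.fit_transform(labels).ravel()

# The fitted objects remember the mapping, so no separate lookup
# table has to be stored alongside the data.
print(vectorizer.feature_names_)
print(binarizer.classes_)
```

With this approach the category-to-column mapping lives in the fitted transformers (`feature_names_`, `classes_`) rather than in the dataset itself.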
