I’m trying to take a long list of objects (in this case, applications from the iTunes App Store) and classify them more specifically. For instance, there are a bunch of applications currently classified as “Education,” but I’d like to label them as Biology, English, Math, etc.
Is this an AI/Machine Learning problem? I have no background in that area whatsoever but would like some resources or ideas on where to start for this sort of thing.
Yes, you are correct. Classification is a machine learning problem, and classifying items based on their text (for example, app names and descriptions) involves natural language processing.
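The canonical classification problem is spam detection using a Naive Bayes classifier, which is very simple. The idea is as follows: starting from examples that are already labeled, count how often each word appears in each class, then assign a new piece of text to the class under which its words are most probable, treating the words as independent of one another.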
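To make that concrete for the app-labeling case, here is a minimal sketch of a Naive Bayes text classifier in plain Python. The training descriptions, the labels, and the function names (`tokenize`, `train`, `classify`) are all made up for illustration; it assumes whitespace tokenization and add-one (Laplace) smoothing, and real data would need far more examples per category.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    # Assumption: lowercase + whitespace split is good enough for a demo.
    return text.lower().split()

def train(labeled_docs):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labeled_docs:
        class_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab

def classify(text, model):
    """Return the class with the highest log-probability for the text."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label in class_counts:
        # Log prior: how common this class is in the training data.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen during training
            # Add-one smoothing so an unseen word/class pair is not zero.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical hand-labeled app descriptions (not real App Store data).
TRAINING = [
    ("learn cell biology and genetics with flashcards", "Biology"),
    ("dna plants and animal cells quiz", "Biology"),
    ("practice algebra and geometry equations", "Math"),
    ("fractions multiplication and algebra drills", "Math"),
    ("improve grammar vocabulary and spelling", "English"),
    ("reading comprehension and grammar exercises", "English"),
]

if __name__ == "__main__":
    model = train(TRAINING)
    print(classify("algebra equations practice", model))
    print(classify("cells and genetics", model))
```

In practice you would hand-label a few hundred "Education" apps, train on those, and let the classifier label the rest. Working with log-probabilities instead of multiplying raw probabilities avoids numeric underflow on longer descriptions.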
I’d highly recommend playing around with NLTK, a Python NLP library that also includes simple machine learning classifiers such as Naive Bayes. It’s very user friendly, has good docs and tutorials, and is a good way to get acquainted with the field.
EDIT: Here’s an explanation of how to build a simple NB classifier with code.