Both users and pages on my website have IDs. When a user goes on a certain page, their userID and the pageID will be written to a MySQL table as such:
userID | pageID
3 | 1
2 | 1
3 | 2
etc...
In this table, called user_pages, I would end up with a bunch of raw data that can be turned into a recommendation engine. What I mean by recommendation engine – I want to analyze historical data, and be able to predict, based on a set of viewed pages, the next pages that a user may like. Let’s say there is a strong correlation between visiting page with ID 3 after going to pages with IDs 4, 9, 15. If a user goes on pages 4, 9, and 15, then the engine should recommend page 3.
I think I have all of the data input code necessary for creating this. How would I write something that analyzes the data for correlation of pages (i.e. almost everyone who visited page 5 visited page 1 also), and somehow use that to predict in the future the pages that a user may end up liking?
Recommendation systems are a big part of A.I research. I believe you are interested in a collection of algorithms called collaborative filtering. Since the netflix prize in 2007 this field has developed greatly. I would recommend going here and having a read. It explains the basic concepts of recommender systems in a succinct and clear way and also provides a link to Java source code for an approach to the Netflix project, MemReader. You could examine this source code and extrapolate the basic algorithms for building a recommendation engine.
Alternatively if you want a more mathematical explanation of the algorithms employed go here.
It shouldn’t take too long to implement at all.