Alright SO users. I am trying to learn and use CouchDB. I have the StackExchange data export loaded as one document per row from the XML file, so the documents in Couch look basically like this:
//This is a representation of a question:
{
"Id" : "1",
"PostTypeId" : "1",
"Body" : "..."
}
//This is a representation of an answer
{
"Id" : "1234",
"ParentId" : "1",
"PostTypeId" : "2",
"Body" : "..."
}
(Please ignore the fact that the import of these documents basically treated all the attributes as text, I understand that using real numbers, bools, etc. could yield better space/processing efficiency.)
What I’d like to do is map each question and its answers into a single aggregate document.
Here’s my map:
function(doc) {
if(doc.PostTypeId === "2"){
emit(doc.ParentId, doc);
}
else{
emit(doc.Id, doc);
}
}
And here’s the reduce:
function(keys, values, rereduce){
var retval = {question: null, answers : []};
if(rereduce){
for(var i in values){
var current = values[i];
retval.answers = retval.answers.concat(current.answers);
if(retval.question === null && current.question !== null){
retval.question = current.question;
}
}
}
else{
for(var i in values){
var current = values[i];
if(current.PostTypeId === "2"){
retval.answers.push(current);
}
else{
retval.question = current;
}
}
}
return retval;
}
Theoretically, this would yield a document like this:
{
"question" : {...},
"answers" : [answer1, answer2, answer3]
}
But instead I am getting the standard “does not reduce fast enough” error.
Am I using map-reduce incorrectly, or is there a well-established pattern for accomplishing this in CouchDB?
(Please also note that I would like a response with the complete documents, where the question is the “parent” and the answers are the “children”, not just the Ids.)
So, the “right” way to accomplish what I’m trying to do above is to add a “list” function to my design document. (What I am trying to achieve appears to be referred to as “collating documents”.)
At any rate, you can configure your map however you like and combine it with a “list” function in the same design document.
To solve the above question, I eliminated my reduce (only have a map function), and then added a function like the following:
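A minimal sketch of such a list function, assuming the map above, so that the rows for a question and its answers share a key and arrive next to each other. The name `collate` is a placeholder, and the `start`, `getRow` and `send` stubs at the bottom only stand in for what CouchDB's query server provides, so the sketch can be run outside CouchDB:

```javascript
// Hypothetical list function. In a design document it would be stored as a
// string under "lists": { "collate": "function(head, req) { ... }" }.
function collate(head, req) {
  start({ headers: { "Content-Type": "application/json" } });

  var currentKey;     // key of the aggregate currently being built
  var current = null; // { question, answers } for currentKey
  var first = true;   // tracks whether a comma separator is needed
  var row;

  // Write out the aggregate collected so far, if there is one.
  function flush() {
    if (current === null) { return; }
    send((first ? "" : ",") + JSON.stringify(current));
    first = false;
  }

  send("[");
  while ((row = getRow())) {
    // Rows arrive sorted by key, so a new key means a new question group.
    if (current === null || row.key !== currentKey) {
      flush();
      currentKey = row.key;
      current = { question: null, answers: [] };
    }
    if (row.value.PostTypeId === "2") {
      current.answers.push(row.value);
    } else {
      current.question = row.value;
    }
  }
  flush();
  send("]");
}

// --- Local stand-ins for the CouchDB query server (not part of the design doc) ---
var rows = [
  { key: "1", value: { Id: "1", PostTypeId: "1", Body: "..." } },
  { key: "1", value: { Id: "1234", ParentId: "1", PostTypeId: "2", Body: "..." } },
  { key: "2", value: { Id: "2", PostTypeId: "1", Body: "..." } }
];
var output = "";
function start(options) {}
function getRow() { return rows.shift(); }
function send(chunk) { output += chunk; }

collate({}, {});
var result = JSON.parse(output);
```

Because the list function streams rows and only keeps one group in memory at a time, it avoids building an ever-growing value the way the reduce did.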
So, after you have some elements loaded up, you can access them like so:
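As a rough illustration, a list function is requested through the design document's `_list` path, followed by the list name and the view name. The host, database, design document, list and view names below are all placeholders, not the ones I actually used:

```javascript
// Build the URL for running a list function over a view.
// Every name passed in below is hypothetical.
function listUrl(base, db, ddoc, list, view, key) {
  var url = base + "/" + encodeURIComponent(db) +
    "/_design/" + encodeURIComponent(ddoc) +
    "/_list/" + encodeURIComponent(list) +
    "/" + encodeURIComponent(view);
  if (key !== undefined) {
    // View keys are passed as JSON, so a string key keeps its quotes.
    url += "?key=" + encodeURIComponent(JSON.stringify(key));
  }
  return url;
}

var url = listUrl("http://localhost:5984", "stackexchange", "posts",
                  "collate", "by_question", "1");
```

Requesting that URL with a plain GET would return the collated question for key "1"; leaving the key off would run the list over the whole view.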
I hope this saves people some time.