I wrote a program in C# to calculate TF-IDF to rank documents. I used

Question

Asked: May 31, 2026 (2026-05-31T01:39:48+00:00)

I wrote a program in C# to calculate TF-IDF to rank documents.

I used the following XML to store the word frequencies within documents, and I was heavily criticised for this structure. Although I use the text of the word itself as the tag name, in my opinion it is efficient and takes less space. I can also search it easily with XDocument, since it is a clean tree structure. Can you help me understand why I was criticised?

Criticism: How can you put information inside the metadata? (To me, it is innovative.)

<word>
   <siddhartha>
      <doc1> 4 </doc1>
      <doc2> 5 </doc2>
   </siddhartha>
   <inspiration>
      <doc1> 4 </doc1>
      <doc6> 5 </doc6>
   </inspiration>
   ....
</word>

Something like this was suggested to me instead:

<word>
   <text> siddhartha </text>
   <doc1> 4 </doc1>
   <text> inspiration </text>
   <doc1> 4 </doc1>
   ...
</word>

1 Answer

Editorial Team · Answer 1 · 2026-05-31T01:39:49+00:00

Your structure, with the word itself as the element name, will be hard to parse with generic parsers. There is no fixed schema: a reader has to scan the whole document to learn which element names it contains.
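One concrete consequence can be sketched in C#. Element names must be valid XML names, so a token such as "c#" or "3d" cannot be stored as a tag at all, while any string fits in an attribute value. The sample tokens and the helper name IsValidElementName are assumptions made for illustration, not part of the original program.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

static class ElementNameCheck
{
    // Hypothetical helper: true when `word` can be used as an XML element name.
    public static bool IsValidElementName(string word)
    {
        try
        {
            // Converting the string to an XName validates it as an XML name.
            _ = new XElement(word);
            return true;
        }
        catch (XmlException)
        {
            // Thrown for characters such as '#' or for a leading digit.
            return false;
        }
    }

    static void Main()
    {
        foreach (var word in new[] { "siddhartha", "c#", "3d" })
        {
            Console.WriteLine($"{word}: usable as tag = {IsValidElementName(word)}");
        }

        // The same token is accepted as an attribute value.
        var safe = new XElement("word", new XAttribute("id", "c#"));
        Console.WriteLine(safe);
    }
}
```

If the vocabulary comes from free text, tokens like these are likely to appear sooner or later, which is one practical argument for keeping the word in an attribute or in element text.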

I would probably have done something like this (I tried to stay close to your idea):

<words>
   <word id="siddhartha">
      <freq id="doc1"> 4 </freq>
      <freq id="doc2"> 5 </freq>
   </word>
   ....
</words>
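With this layout, a lookup in C# is the same for every word, because the element names are fixed and only the attribute values vary. Here is a minimal sketch using LINQ to XML. The inline sample document and the method name GetFrequency are assumptions for illustration, not the asker's actual code.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class FrequencyLookup
{
    // Hypothetical helper: frequency of `word` in `docId`, or 0 when absent.
    public static int GetFrequency(XDocument index, string word, string docId)
    {
        var freq = index.Root
            .Elements("word")
            .Where(w => (string)w.Attribute("id") == word)
            .Elements("freq")
            .FirstOrDefault(f => (string)f.Attribute("id") == docId);

        // Values in the sample are padded with spaces, so trim before parsing.
        return freq == null ? 0 : int.Parse(freq.Value.Trim());
    }

    static void Main()
    {
        var index = XDocument.Parse(@"<words>
   <word id=""siddhartha"">
      <freq id=""doc1""> 4 </freq>
      <freq id=""doc2""> 5 </freq>
   </word>
</words>");

        Console.WriteLine(GetFrequency(index, "siddhartha", "doc2")); // 5
        Console.WriteLine(GetFrequency(index, "siddhartha", "doc9")); // 0
    }
}
```

The same query works for any word or document id without knowing the vocabulary in advance, which is what a fixed set of element names buys you.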
