I am using MATLAB to implement the Rocchio classification method. I have 160 txt documents.
I have calculated the term frequency of each word in each document, so I now have a 1×160 cell array “Set”, which consists of 160 cells with a number of integers in each cell (the term frequencies of each word in a document).
I am trying to take each integer i and apply the formula 1+log10(i) to calculate the term frequency weighting. I came up with the following code:
function [tfw]=TFWeighting(Set)
    size(Set);
    TFW=cell(0);
    for i=1:size(Set)
        for j=1:size(Set{1,i})
            TFW{1,i}(j,1) = 1+log10(Set{1,i}(j,1));
        end
    end
    tfw=TFW;
end
Well, it works but only for the first cell. All other 159 cells are untouched.
What might be the problem?
This line:
for i=1:size(Set)
is your culprit.
size(Set) is [1 160], and the colon operator only uses the first element of a non-scalar operand, so MATLAB runs for i = 1:1.
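A quick check at the prompt should show this. The sketch below uses an empty stand-in cell array of the same shape, since I don't have your data; the variable names sz, idx and n are just for the demonstration.

```matlab
% Stand-in for the real data: a 1x160 cell array (contents are placeholders).
Set = cell(1, 160);

sz  = size(Set);   % sz is the vector [1 160], not a scalar
idx = 1:sz;        % colon uses only sz(1), so this is the same as 1:1
n   = numel(idx);  % number of iterations the original outer loop performs

disp(sz)
disp(n)
```

Here n comes out as 1, which matches what you are seeing: only the first cell is processed.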
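You want the upper bound of the loop to be the number of cells:
for i=1:numel(Set)
(size(Set,2) or length(Set) would also work here.)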
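The same potential bug happens a few lines later:
for j=1:size(Set{1,i})
This one happens to work when each cell holds a column vector, because the first element of size is then the number of rows, but numel(Set{1,i}) is safer.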
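Putting both loop fixes together, a corrected version of the function might look like the sketch below. It assumes every term frequency is a positive integer (a zero would give -Inf from log10) and keeps the column-vector output layout of the original code.

```matlab
function tfw = TFWeighting(Set)
% Apply 1 + log10(tf) to every term frequency in every cell of Set.
tfw = cell(size(Set));              % one output cell per document
for i = 1:numel(Set)                % numel returns a scalar (160 for a 1x160 cell)
    for j = 1:numel(Set{i})         % works for both row and column vectors
        tfw{i}(j,1) = 1 + log10(Set{i}(j));
    end
end
end
```

Calling tfw = TFWeighting(Set) should then fill all 160 cells rather than just the first.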
Without knowing what’s in your Set, it’s hard to say, but I bet you can speed this whole thing up by removing the inner loop and using MATLAB’s ability to process whole vectors or matrices at once. log10 works element-wise, so a single assignment per cell is enough:
TFW{1,i} = 1+log10(Set{1,i});
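As a rough sketch of the vectorized loop, here it is on a small made-up Set (two "documents" with invented counts, standing in for your real 1×160 array):

```matlab
% Made-up example data: two documents with a few term counts each.
Set = {[1; 10; 100], [2; 5]};

TFW = cell(size(Set));            % preallocate the output cell array
for i = 1:numel(Set)
    % log10 is applied element-wise to the whole vector in one call,
    % so no inner loop over j is needed.
    TFW{i} = 1 + log10(Set{i});
end
```

Each output cell keeps the same shape as the matching input cell, so this works for row or column vectors alike.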
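If you want to be SUPER fancy, a one-line solution is possible with cellfun, which applies a function to every cell and, with 'UniformOutput' set to false, returns the results as a cell array.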
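One way to write that one-liner is with cellfun and an anonymous function. This is a sketch on made-up data, and it assumes each cell holds a numeric vector of positive counts:

```matlab
% Made-up example data standing in for the real 1x160 cell array.
Set = {[1; 10; 100], [2; 5]};

% Apply 1 + log10(x) to the contents of every cell. 'UniformOutput', false
% makes cellfun return a cell array, since each result is a vector.
tfw = cellfun(@(x) 1 + log10(x), Set, 'UniformOutput', false);
```

It is compact, but not necessarily faster than the plain loop over cells, so pick whichever reads better to you.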