I'm fairly new to vectorization. I have tried it myself but couldn't get it to work. Can somebody help me vectorize this code and give a short explanation of how you do it, so that I can adopt the thought process too? Thanks.
function [result] = newHitTest(point, Polygon, r, tol, stepSize)
    % This function calculates whether a point is allowed.
    % First a quick test is done by calculating the distance from point to
    % each point of the polygon. If that distance is smaller than range "r",
    % the point is not allowed. This will slow down the algorithm at some
    % points, but will greatly speed it up in others because fewer calls to
    % the circleTest routine are needed.
    polySize = size(Polygon, 1);
    testCounter = 0;
    for i = 1:polySize
        d = sqrt(sum((Polygon(i,:) - point).^2));
        if d < tol*r
            testCounter = 1;
            break
        end
    end
    if testCounter == 0
        circleTestResult = circleTest(point, Polygon, r, tol, stepSize);
        testCounter = circleTestResult;
    end
    result = testCounter;
Given that `Polygon` is two-dimensional, `point` is a row vector, and the other variables are scalars, a first vectorized version of your function can be built up as described below (scroll down to see that there are lots of ways to skin this cat).

The thought process for vectorization in Matlab involves trying to operate on as much data as possible with a single command. Most of the basic built-in Matlab functions operate very efficiently on multi-dimensional data. Using a `for` loop is the reverse of this: you break your data down into smaller segments for processing, each of which must be interpreted individually. By resorting to data decomposition with `for` loops, you potentially lose some of the massive performance benefits of the highly optimised code behind the Matlab built-in functions.

The first thing to think about in your example is the conditional break in your main loop. You cannot break out of a vectorized process. Instead, calculate all possibilities, build an array holding the outcome for each row of your data, then use the `any` function to see whether any row met the condition, which tells you whether the `circleTest` function needs to be called.

NOTE: It is not easy to conditionally break out of a calculation efficiently in Matlab. However, as you are just computing a form of Euclidean distance in the loop, you'll probably see a performance boost by using the vectorized version and calculating all possibilities. If the computation in your loop were more expensive, the input data were large, and you wanted to break out as soon as you hit a certain condition, then a Matlab extension written in a compiled language could potentially be much faster than a vectorized version that performs needless calculation. However, this assumes that you know how to write code that matches the performance of the Matlab built-ins in a language that compiles to native code.
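To make the "compute everything, then reduce with `any`" idea concrete, here is a minimal sketch comparing the two styles. The sample data and the names `d`, `tooClose`, `hitLoop` and `hitVec` are made up for illustration, and the column-by-column subtraction is just one simple way to get all the distances without a loop.

```matlab
% Hypothetical sample data: a 4-vertex polygon and one query point.
Polygon = [0 0; 4 0; 4 3; 0 3];
point   = [3.5 2.5];
r = 1; tol = 1;

% Loop version: stops at the first vertex that is too close.
hitLoop = false;
for i = 1:size(Polygon, 1)
    if sqrt(sum((Polygon(i,:) - point).^2)) < tol*r
        hitLoop = true;
        break
    end
end

% Vectorized version: compute every distance at once, then reduce.
d        = sqrt((Polygon(:,1) - point(1)).^2 + (Polygon(:,2) - point(2)).^2);
tooClose = d < tol*r;       % logical column, one entry per vertex
hitVec   = any(tooClose);   % true if at least one vertex is within range
```

Both versions give the same answer; the vectorized one simply does not stop early, which is usually a good trade for a computation this cheap.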
Back on topic …
The first thing to do is to take the linear difference (`linDiff` in the code example) between `Polygon` and your row vector `point`. To do this in a vectorized manner, the dimensions of the two variables must be identical. One way to achieve this is to use `repmat` to replicate the row vector `point` until it is the same size as `Polygon`. However, `bsxfun` is usually a superior alternative to `repmat` (as described in this recent SO question), making the code …

I turned your scalar `d` value into a column vector `d` by summing across the 2nd axis (note the removal of the array index from `Polygon` and the addition of `,2` in the `sum` command). I then went further and evaluated the logical array `testLogicals` inline with the calculation of the distance measure. You will quickly see that a downside of heavy vectorisation is that it can make the code less readable to those not familiar with Matlab, but the performance gains are worth it. Comments are pretty much essential.

Now, if you want to go completely crazy, you could argue that the test function is so simple now that it warrants an 'anonymous function' or 'lambda' rather than a complete function definition. The test for whether or not it is worth doing the `circleTest` does not require the `stepSize` argument either, which is another reason for perhaps using an anonymous function. You can roll your test into an anonymous function and then just use `circleTest` in your calling script, making the code self-documenting to some extent . . .

Now that everything is vectorised, the use of function handles gives me another idea . . .
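The steps described above can be sketched roughly as follows. This is only an illustration under some assumptions: the sample data is invented, `circleTest` is replaced by a stand-in stub because the real routine is not shown in the question, and `doCircleTest` is written here so that it returns true when no vertex is within `tol*r` (which is when the question's code falls through to `circleTest`).

```matlab
% Hypothetical sample data; circleTest is a stand-in stub because the
% real function is not shown in the question.
Polygon    = [0 0; 4 0; 4 3; 0 3];
point      = [2 1.5];
r = 1; tol = 1; stepSize = 0.1;
circleTest = @(point, Polygon, r, tol, stepSize) 0;

% Step 1: subtract the row vector point from every row of Polygon.
linDiff = bsxfun(@minus, Polygon, point);

% Step 2: row-wise distance (sum over dimension 2), tested inline.
testLogicals = sqrt(sum(linDiff.^2, 2)) < tol*r;

% Step 3: decide whether the expensive test is needed.
if any(testLogicals)
    result = 1;          % a vertex is too close, as in the early break
else
    result = circleTest(point, Polygon, r, tol, stepSize);
end

% The same cheap test as an anonymous function; stepSize is not needed.
doCircleTest = @(point, Polygon, r, tol) ...
    ~any(sqrt(sum(bsxfun(@minus, Polygon, point).^2, 2)) < tol*r);

if doCircleTest(point, Polygon, r, tol)
    result2 = circleTest(point, Polygon, r, tol, stepSize);
else
    result2 = 1;
end
```

In MATLAB R2016b and later, implicit expansion means `Polygon - point` gives the same result as the `bsxfun` call, so either spelling should work there.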
If you plan on performing this at multiple points in the code, the repetition of the `if` statements would get a bit ugly. To stay DRY, it seems sensible to put the test and the conditional function into a single function, just as you did in your original post. However, the utility of that function would be very narrow: it would only test whether the `circleTest` function should be executed, and then execute it if need be.

Now imagine that after a while, you have some other conditional functions, just like `circleTest`, each with its own equivalent of `doCircleTest`. It would be nice to reuse the conditional switching code. For this, make a function like your original that takes a default value, the boolean result of the computationally cheap test function, and the function handle of the expensive conditional function with its associated arguments …

You could call this function from your main script with the following . . .
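As a rough sketch of that idea, something along these lines might do. The function name `conditionalCall`, its argument order, the sample data and the `circleTest` stub are all assumptions made for illustration, not part of the question's code.

```matlab
% Calling script. Sample data and the circleTest stub are placeholders.
Polygon = [0 0; 4 0; 4 3; 0 3];
r = 1; tol = 1; stepSize = 0.1;
circleTest = @(point, Polygon, r, tol, stepSize) 0;   % stand-in stub

% Cheap test: true when no vertex is closer than tol*r to the point.
doCircleTest = @(point, Polygon, r, tol) ...
    ~any(sqrt(sum(bsxfun(@minus, Polygon, point).^2, 2)) < tol*r);

pointFar  = [2 1.5];     % no vertex nearby, so circleTest runs
pointNear = [3.5 2.5];   % close to vertex (4,3), so the default is used

resultFar  = conditionalCall(1, doCircleTest(pointFar,  Polygon, r, tol), ...
                             circleTest, pointFar,  Polygon, r, tol, stepSize);
resultNear = conditionalCall(1, doCircleTest(pointNear, Polygon, r, tol), ...
                             circleTest, pointNear, Polygon, r, tol, stepSize);

% Local function (sits at the end of the script file, R2016b or later;
% alternatively save it on its own as conditionalCall.m).
function result = conditionalCall(default, doTest, expensiveFun, varargin)
    % Return default unless the cheap test says the expensive call is needed.
    result = default;
    if doTest
        result = expensiveFun(varargin{:});
    end
end
```

Passing the extra arguments through `varargin` keeps `conditionalCall` independent of whatever signature the expensive function happens to have.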
…and the beauty of it is you can use any test, default value, and expensive function. Perhaps a little overkill for this simple example, but it is where my mind wandered when I brought up the idea of using function handles.