I have 2 input variables: a vector of p-values ( p ) with N

Editorial Team · Asked · 2026-05-29T08:05:11+00:00

I have 2 input variables:

a vector of p-values (p) with N elements (unsorted), and
an N x M matrix of p-values (pr) obtained from random permutations, with M iterations (one column per iteration).

N is quite large, 10K to 100K or more. M is around 100.

I’m estimating the False Discovery Rate (FDR) for each element of p. Taking the current p-value (from p) as the threshold, the estimate is the average number of permutation p-values per iteration that pass the threshold, divided by the number of observed p-values in p that pass it.
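To make the definition concrete, here is a minimal sketch on a tiny made-up example, written as a plain loop. The numbers in p and pr are illustrative assumptions, not data from the question.

```matlab
% Toy inputs (hypothetical values chosen so the counts are easy to check by hand)
p  = [0.1; 0.4; 0.8];            % N = 3 observed p-values
pr = [0.05 0.3;                  % N x M matrix, M = 2 permutation iterations
      0.5  0.6;
      0.9  0.2];

N = numel(p);
M = size(pr, 2);
pfdr = zeros(N, 1);
for i = 1:N
    x = p(i);                            % current threshold
    passPerm = sum(pr(:) <= x) / M;      % average number of permutation p-values passing, per iteration
    passObs  = sum(p <= x);              % number of observed p-values passing
    pfdr(i)  = passPerm / passObs;
end
disp(pfdr)   % 0.5, 0.75, 0.8333...
```

For the threshold 0.4, for instance, one value in the first column and two in the second pass, so the numerator is 1.5, and two observed p-values pass, giving 0.75.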

I wrote the function with arrayfun, but it takes a lot of time for large N (2 min for N=20K), comparable to a for-loop.

function pfdr = fdr_from_random_permutations(p, pr)
%# ... skipping arguments checks
pfdr = arrayfun( @(x) mean(sum(pr<=x))./sum(p<=x), p);

Any ideas how to make it faster?

Comments about statistical issues here are also welcome.

The test data can be generated as p = rand(N,1); pr = rand(N,M);.

1 Answer

Editorial Team · Answer 1 · 2026-05-29T08:05:12+00:00

Well, the trick was indeed sorting the vectors. I give credit to @EgonGeerardyn for that. Also, there is no need to use mean: you can just divide everything by M afterwards. Once p is sorted, the number of values less than or equal to the current x is simply the loop index (as long as p has no ties). pr is the more interesting case: after pooling and sorting it, I used a running index called place to count how many of its elements are less than or equal to x. Because x only grows from one iteration to the next, place never has to move backwards.

Edit(2): Here is the fastest version I came up with:

 function Speedup2()
    N = 10000/4 ;
    M = 100/4 ;
    p = rand(N,1); pr = rand(N,M);

    tic
    pfdr = arrayfun( @(x) mean(sum(pr<=x))./sum(p<=x), p);
    toc

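    tic
    out = zeros(numel(p),1);
    [p,sortIndex] = sort(p);      % ascending; sortIndex maps sorted positions back to the original ones
    pr = sort(pr(:));             % pool all permutation p-values and sort them
    pr(end+1) = Inf;              % sentinel so the while loop always terminates
    place = 1;                    % running index into the sorted pr
    for i=1:numel(p)
        x = p(i);
        % advance past every permutation p-value that is <= x
        while pr(place)<=x
            place = place+1;
        end
        exp1a = place-1;          % number of pooled pr values <= x
        exp2 = i;                 % number of p values <= x (assumes no ties in p)
        out(i) = exp1a/exp2;
    end
    out(sortIndex) = out/ M;      % average over the M iterations and restore the original order
    toc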
    disp(max(abs(pfdr-out)));

end

And the benchmark results for N = 10000/4 ; M = 100/4 :

Elapsed time is 0.898689 seconds.
Elapsed time is 0.007697 seconds.
2.220446049250313e-016

and for N = 10000 ; M = 100 ;

Elapsed time is 39.730695 seconds.
Elapsed time is 0.088870 seconds.
2.220446049250313e-016
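As a possible follow-up, the same sorting idea can be sketched without an explicit loop. This is not part of the original answer: the variable names are hypothetical, and the tie handling via unique and accumarray is an added assumption about what is wanted when p contains repeated values. It relies on MATLAB's sort keeping equal elements in their original order.

```matlab
% Test data as suggested in the question (sizes reduced so the check runs quickly)
rng(0);
N = 2000; M = 25;
p = rand(N,1); pr = rand(N,M);

[ps, order] = sort(p(:));            % sorted observed p-values
prAll = pr(:);                       % pooled permutation p-values
K = numel(prAll);

% Rank every value in the merged vector. Because sort is stable and prAll
% comes first, a pr value equal to a p value is ranked before it.
[~, idx] = sort([prAll; ps]);
rnk = zeros(K + N, 1);
rnk(idx) = (1:K+N)';

% rank of ps(i) = (#pr <= ps(i)) + i, so subtracting i leaves the pr count
numer = rnk(K+1:end) - (1:N)';

% #p <= ps(i): cumulative group sizes, so tied values share the same count
[~, ~, ic] = unique(ps);
cumCount = cumsum(accumarray(ic, 1));
denom = cumCount(ic);

pfdrFast = zeros(N, 1);
pfdrFast(order) = (numer / M) ./ denom;   % restore the original order

% Compare against the arrayfun version from the question
pfdrRef = arrayfun(@(x) mean(sum(pr <= x)) ./ sum(p <= x), p);
disp(max(abs(pfdrFast - pfdrRef)));
```

Whether this is actually faster than the running-index loop above would need to be timed on the real data; its main benefit is that ties in p are counted the same way as sum(p<=x) does.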
