The Archive Base

Editorial Team
Asked: May 26, 2026 (2026-05-26T12:53:40+00:00)


Let’s say I have an array
k = [1 2 0 0 5 4 0]

I can compute a mask as follows
m = k > 0 = [1 1 0 0 1 1 0]

Using only the mask m and the following operations

  1. Shift left / right
  2. And/Or
  3. Add/Subtract/Multiply

I can compact k into the following
[1 2 5 4]

Here’s how I currently do it (MATLAB pseudocode):

function out = compact(in)
    d = in;
    for i = 1:size(in, 2)  % one pass per element of in
        m = d > 0;
        % shift left, pad with 0 on the right
        ml = [m(2:end) 0];  % shifted mask
        dl = [d(2:end) 0];  % shifted data

        % where the data has a gap and its right neighbour is valid,
        % fill the gap with the left-shifted value
        use = (m == 0) & (ml == 1);  % 2 comparisons

        d = use .* dl + ~use .* d;

        % zero out elements that have been moved to the left
        use_r = [0 use(1:end-1)];
        d = d .* ~use_r;
    end

    out = d(1:nnz(in > 0));  % truncate the trailing zeros
end

Intuition

In each iteration, we shift the mask left and compare it with the unshifted mask. An index takes the left-shifted data if it was originally void (mask[i] = 0) and is valid after the shift (mask[i] = 1).
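To make the intuition concrete, here is a minimal C++ sketch of a single pass. The helper name `compact_pass` is hypothetical, and the sketch assumes non-negative values so that `> 0` marks a valid item, as in the MATLAB pseudocode.

```cpp
#include <cstddef>
#include <vector>

// One pass of the shift-and-select step (hypothetical helper name).
// Every gap whose right neighbour is valid takes that neighbour's value,
// and the neighbour's old slot is cleared.
std::vector<int> compact_pass(const std::vector<int>& d) {
    const std::size_t n = d.size();
    std::vector<int> use(n, 0);  // 1 where a gap is filled from the right
    std::vector<int> out(d);
    for (std::size_t i = 0; i + 1 < n; ++i) {
        const int m = d[i] > 0;        // mask
        const int ml = d[i + 1] > 0;   // mask shifted left by one
        use[i] = (1 - m) & ml;         // void before, valid after the shift
        out[i] = use[i] * d[i + 1] + (1 - use[i]) * d[i];
    }
    // Zero out the elements that have just been moved to the left.
    for (std::size_t i = 1; i < n; ++i) {
        out[i] *= 1 - use[i - 1];
    }
    return out;
}
```

For k = [1 2 0 0 5 4 0], one pass gives [1 2 0 5 0 4 0]: only the 5 moves, because it is the only valid item with a gap directly to its left.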

Question

The above algorithm costs O(N * (3 shifts + 2 comparisons + AND + add + 3 multiplies)). Is there a way to improve its efficiency?

1 Answer

  1. Editorial Team
    Added an answer on May 26, 2026 at 12:53 pm

    There is not much to optimize in the original pseudocode. I see several small improvements:

    • the loop can perform one iteration fewer (i.e. size-1),
    • if ‘use’ is all zeros, you can break out of the loop early,
    • use = (m == 0) & (ml == 1) can probably be simplified to use = ~m & ml,
    • if ~ is counted as a separate operation, it is better to use the inverted form: use = m | ~ml, d = ~use .* dl + use .* d, use_r = [1 use(1:end-1)], d = d .* use_r
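    The first three tweaks can be sketched in C++ as follows. This is only an illustration under stated assumptions: the function name `compact_early_exit` is hypothetical, and values are assumed non-negative so that `> 0` marks a valid item.

```cpp
#include <cstddef>
#include <vector>

// Sketch of the question's loop with at most n-1 passes, an early exit,
// and use = ~m & ml (hypothetical function name).
std::vector<int> compact_early_exit(const std::vector<int>& in) {
    std::vector<int> d(in);
    const std::size_t n = d.size();
    std::size_t kept = 0;
    for (int v : in) kept += (v > 0) ? 1 : 0;

    for (std::size_t pass = 0; pass + 1 < n; ++pass) {  // at most n-1 passes
        std::vector<int> next(d);
        int moved = 0;
        for (std::size_t i = 0; i + 1 < n; ++i) {
            const int m = d[i] > 0;
            const int ml = d[i + 1] > 0;
            const int use = (1 - m) & ml;            // use = ~m & ml
            next[i] = use * d[i + 1] + (1 - use) * next[i];
            next[i + 1] = (1 - use) * next[i + 1];   // clear the moved item
            moved |= use;
        }
        if (!moved) break;  // 'use' was all zeros: already compacted
        d.swap(next);
    }
    d.resize(kept);  // truncate the trailing zeros
    return d;
}
```

    The early exit is safe because an all-zero 'use' means no gap has a valid item directly to its right, so every valid item already sits at the front.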

    But it is possible to invent better algorithms, and the choice of algorithm depends on which CPU resources are used:

    • Load-store unit, i.e. applying the algorithm directly to memory words. Nothing can be done here until chipmakers add a highly parallel SCATTER instruction to their instruction sets.
    • SSE registers, i.e. algorithms working on the entire 16 bytes of a register. Algorithms like the proposed pseudocode cannot help here, because the existing shuffle/permute instructions already do the job better. Using compare instructions with PMOVMSKB, grouping the result by 4 bits, and applying shuffle instructions under a switch/case (as described by LastCoder) is the best we can do.
    • SSE/AVX registers with the latest instruction sets allow a better approach: we can use the result of PMOVMSKB directly, transforming it into the control register for something like PSHUFB.
    • Integer registers, i.e. GPRs, or several DWORD/QWORD parts of SSE/AVX registers processed simultaneously (which allows several independent compactions). The proposed pseudocode applied to integer registers can compact binary subsets of any length (from 2 to 20 bits). Here is my algorithm, which is likely to perform better.

    C++, 64 bit, subset width = 8:

    #include <cstdio>

    typedef unsigned long long ull;

    // Compacts the nonzero bytes of x toward the most significant end.
    ull compact_bytes(ull x)
    {
      const ull h = 0x8080808080808080ULL;
      const ull l = 0x0101010101010101ULL;
      const ull end = 0xffffffffffffffffULL;

      // set hi bit for zero bytes (see D. Knuth, volume 4)
      ull m = h & ~(x | ((x | h) - l));

      // bitmask for nonzero bytes (0xff for each nonzero byte)
      m = ~(m | (m - (m >> 7)));

      // tail zero bytes need no special treatment
      m |= (m - 1);

      while (m != end)
      {
        ull tailm = m ^ (m + 1); // bytes to be processed
        ull tailx = x & tailm; // get the bytes
        tailm |= (tailm << 8); // shift 1 byte at a time
        m |= tailm; // all processed bytes are masked
        x = (x ^ tailx) | (tailx << 8); // actual byte shift
      }
      return x;
    }

    int main()
    {
      // uncompacted bytes
      ull x = 0x0100802300887700ULL;
      std::printf("%016llx\n", compact_bytes(x)); // prints 0180238877000000
      return 0;
    }
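    For comparison with the integer-register version, the mask-to-shuffle idea from the SSE bullets above can be emulated in portable scalar C++. This is only a sketch of the idea, not real SSE code: the names `make_shuffle_table` and `compact4` and the table layout are hypothetical, the mask loop stands in for a compare followed by PMOVMSKB, and the final loop stands in for a single shuffle.

```cpp
#include <array>
#include <cstdint>

// For every 4-bit mask, lists the source index feeding each output slot.
// Hypothetical layout: unused trailing slots are left at index 0.
std::array<std::array<std::uint8_t, 4>, 16> make_shuffle_table() {
    std::array<std::array<std::uint8_t, 4>, 16> table{};
    for (unsigned mask = 0; mask < 16; ++mask) {
        unsigned out = 0;
        for (unsigned src = 0; src < 4; ++src) {
            if (mask & (1u << src)) {
                table[mask][out++] = static_cast<std::uint8_t>(src);
            }
        }
    }
    return table;
}

// Moves the nonzero values of a 4-element group to the front of 'out'
// and returns how many values were kept.
unsigned compact4(const std::array<std::uint8_t, 4>& in,
                  std::array<std::uint8_t, 4>& out) {
    static const auto table = make_shuffle_table();
    unsigned mask = 0;
    unsigned count = 0;
    for (unsigned i = 0; i < 4; ++i) {
        const unsigned bit = in[i] != 0 ? 1u : 0u;  // one mask bit per element
        mask |= bit << i;
        count += bit;
    }
    for (unsigned j = 0; j < 4; ++j) {  // table lookup replaces the switch/case
        out[j] = j < count ? in[table[mask][j]] : 0;
    }
    return count;
}
```

    With a group of 4 elements the table has only 16 entries, which is why grouping the mask by 4 bits keeps the lookup small.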
    