Editorial Team
Asked: June 11, 2026

I am implementing several data structures, and one primitive I want to use is the following: I have a memory chunk A[N] (its length is variable, but I use 100 in my examples), and inside this chunk there is a smaller part C of length K (let’s say 30) which I want to move without using any additional memory.

The additional difficulty is that A “wraps”: C can start at A[80], in which case the first 20 elements of C are A[80..99] and the last 10 elements are A[0..9]. Furthermore, the target range may also “wrap” and may overlap with C in any possible way. I don’t want to use more than a constant amount of additional memory; everything should happen in place. Also, the part of A which is in neither the target range nor the source range may contain something important, so it cannot be used as scratch space either. One case would be the following:

A looks like this:

|456789ABCDEF0123456789AB|-----|0123|

And should be transformed to this:

|89AB|-----|0123456789ABCDEF01234567|
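To pin down what the move is supposed to do, here is a minimal reference sketch. It deliberately breaks the "no additional memory" rule (it allocates a scratch buffer of K elements), so it is only useful as a test oracle for in-place versions. The function name `move_wrapped_reference` and the parameter names are assumptions made for illustration; the diagram above corresponds to N = 33, a source starting at index 29, a target starting at index 9, and K = 28.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Reference semantics only: copies the K elements that start at A[src]
   (wrapping modulo N) so that they start at A[dst] (also wrapping).
   Uses O(K) scratch memory, which the question explicitly wants to avoid.
   Returns 0 on success and -1 if the scratch buffer cannot be allocated. */
int move_wrapped_reference(int *A, size_t N, size_t src, size_t dst, size_t K)
{
  int *tmp;
  size_t i;
  assert(N > 0 && src < N && dst < N && K <= N);
  tmp = malloc((K ? K : 1) * sizeof *tmp);
  if (tmp == NULL) {
    return -1;
  }
  for (i = 0; i < K; ++i) {
    tmp[i] = A[(src + i) % N];   /* read the whole source first */
  }
  for (i = 0; i < K; ++i) {
    A[(dst + i) % N] = tmp[i];   /* then write the whole target */
  }
  free(tmp);
  return 0;
}
```

Any in-place algorithm should leave the target range in exactly the state this function produces.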

Just delegating this to a library, or using another data structure from a library, is not an option here; I want to understand the problem myself. At first sight I thought it might not be trivial, then it seemed to become clear once you distinguish a few cases, but now I am having serious trouble. Of course there are the trivial cases where the ranges don’t overlap or don’t wrap, but when both happen at the same time, it gets messy. You could start with one free place and move the part that belongs there, but then you create another free part somewhere else, and it gets hard to keep track of which parts you can still use.

Maybe I am missing something completely, but even my special case, where the target range does not wrap, takes almost 100 lines (half of them assertions and comments, though). I could update it to handle the general case with some additional index calculations, but if someone has an elegant and short solution, I would appreciate some help. Intuitively I think this should somehow be trivial, but I just don’t see the best solution yet.

Note: The interesting case is of course, if C is almost as big as A. If |C| < N/2, it is trivial.

edit: Using more than a constant amount of additional flags/indices counts as additional memory and I want to avoid that if possible.

edit: Some people wanted to see my code. My question is rather abstract, so I didn’t want to post it, but maybe someone sees how to improve it. It is terrible: it only works when the target starts at the beginning (though that can easily be changed) and it is terribly long, but it does the job without additional memory in O(n).

#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <assert.h>

void move_part(int* A, size_t N, size_t target, size_t source, size_t size, int show_steps)
{
  assert(source + size <= N);
  assert(target + size <= N);
  if (show_steps) {
    printf("Moving size %zu from %zu to %zu.\n", size, source, target);
  }
  memmove(A + target, A + source, size * sizeof(int));
}

void swap_parts(int* A, size_t N, size_t first_begin, size_t second_begin, size_t size, int show_steps)
{
  if (show_steps) {
    printf("Swapping size %zu at %zu and %zu.\n", size, first_begin, second_begin);
  }
  assert(first_begin + size <= N);
  assert(second_begin + size <= N);
  size_t i;
  for (i = 0; i < size; ++i) {
    int x = A[first_begin + i];
    A[first_begin + i] = A[second_begin + i];
    A[second_begin + i] = x;
  }
}

void move_to_beginning(int* A, size_t N, size_t begin, size_t size, int show_steps)
{
  assert(begin <= N);
  assert(size <= N);
  // Denotes the start of our "working range". Increases during
  // the algorithm and becomes N
  size_t part_start = 0;
  // Note: Keeping the size is crucial since begin == end could
  // mean that the range is empty or full.
  size_t end = (begin + size) % N;
  while (part_start != N) {
    size_t i;
    if (show_steps) {
      for (i = 0; i < N; ++i) {
        printf("%d ", A[i]);
      }
      printf("\n");
      printf("part_start %zu  begin %zu  end %zu  size %zu\n", part_start, begin, end, size);
    }
    // loop invariants
    assert(part_start < N);
    // The two pointers are in our range
    assert(part_start <= begin && begin <= N);
    assert(part_start <= end && end <= N);
    // size is valid (wrapped case, non-empty, non-full case)
    assert(begin <= end || (N - begin) + (end - part_start) == size);
    // size is valid (non wrapped case, non-empty, non-full case)
    assert(begin >= end || end - begin == size);
    // size is valid (working range is full or empty case)
    assert(begin != end || size == 0 || part_start + size == N);
    if (size == 0 || begin == N || begin == part_start) {
      // ##|1234|# -> 1234### ||
      if (show_steps) {
        printf("Case 1:\nTerminating\n");
      }
      // #||# -> ## ||
      // 12|##| -> 12## ||
      // |12|## -> 12## ||
      break;
      /* Not necessary any more, but would be the correct transformation:
         part_start = N;
         begin = N;
         end = N;
         size = 0;*/
    } else if (end == part_start) {
      // |##|123 -> ##|123|
      if (show_steps) {
        printf("Case 2:\n");
        printf("Setting end to %zu.\n", N);
      }
      end = N;
    } else if (begin < end) {
      // ##|1234|# -> 1234### ||
      if (show_steps) {
        printf("Case 3:\n");
      }
      move_part(A, N, part_start, begin, size, show_steps);
      break;
      /* Not necessary any more, but would be the correct transformation:
         part_start = N;
         begin = N;
         end = N;
         size = 0;*/
    } else {
      size_t end_size = end - part_start;
      size_t begin_size = N - begin;
      assert(begin_size + end_size == size);
      if (end_size >= begin_size) {
        // 345|#|12 -> 12 5|#|34
        if (show_steps) {
          printf("Case 4:\n");
        }
        swap_parts(A, N, part_start, begin, begin_size, show_steps);
        assert(begin_size > 0); // Necessary for progress
        part_start += begin_size;
        size = end_size;
        // begin, end remain unchanged
      } else if (begin - part_start <= begin_size) {
        // 56|#|1234 -> 123 56|#|4
        size_t size_moved = begin - part_start;
        assert(size_moved >= end_size); // else the next step would be more efficient
        if (show_steps) {
          printf("Case 5\n");
        }
        swap_parts(A, N, part_start, begin, end_size, show_steps);
        move_part(A, N, end, begin + end_size, begin - end, show_steps);
        assert(end_size + (begin - end) == size_moved);
        size -= size_moved;
        part_start = begin;
        begin += size_moved;
        end += size_moved;
      } else if (end_size <= begin_size) {
        // 45|##|123 -> 123 #|45|#
        if (show_steps) {
          printf("Case 6\n");
        }
        swap_parts(A, N, part_start, begin, end_size, show_steps);
        move_part(A, N, end, begin + end_size, begin_size - end_size, show_steps);
        part_start += begin_size;
        size = end_size;
        end = begin + end_size;
        // begin remains unchanged
      } else {
        // No case applies, this should never happen
        assert(0);
      }
    }
  }
}


int main()
{
  int N = 20;
  int A[20] = {0}; // zero-fill: cells outside the range are read by swap_parts
  size_t size = 17;
  size_t begin = 15;
  size_t i;
  for (i = 0; i < size; ++i) {
    A[(begin + i) % N] = i;
  }
  move_to_beginning(A, N, begin, size, 0);
  for (i = 0; i < size; ++i) {
    printf("%d ", A[i]);
  }
  printf("\n");
  return 0;
}
1 Answer

  1. Editorial Team added an answer on June 11, 2026 at 1:02 pm

    Case 1: Source overlaps with destination in at most a single contiguous region, which is smaller than the whole array

    A detailed explanation of this case is given in the first answer by R.. I have nothing to add here.
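The referenced answer is not reproduced here, so the following is only one possible reading of Case 1, not that answer's method. It is a memmove-style sketch: with a single overlap region, a plain element-by-element copy works as long as the direction is chosen so that no source element is overwritten before it has been read. The name `move_single_overlap` and the return convention (1 = done, 0 = the ranges overlap in two regions, so this is really Case 2) are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Moves K elements from A[src..] to A[dst..], both wrapping modulo N.
   Returns 1 if the move was performed, 0 if neither copy direction is
   safe (the ranges overlap in two separate regions). */
int move_single_overlap(int *A, size_t N, size_t src, size_t dst, size_t K)
{
  size_t delta, i;
  assert(N > 0 && src < N && dst < N && K <= N);
  delta = (dst + N - src) % N;          /* forward shift, 0 <= delta < N */
  if (K == 0 || delta == 0) {
    return 1;                           /* nothing to move */
  }
  if (delta + K <= N) {
    /* Target lies "ahead" without wrapping back onto the source's start:
       copy back to front, like memmove for dst > src. */
    for (i = K; i-- > 0; ) {
      A[(src + i + delta) % N] = A[(src + i) % N];
    }
    return 1;
  }
  if (delta >= K) {
    /* Writes only reach source cells that were already read:
       copy front to back. */
    for (i = 0; i < K; ++i) {
      A[(src + i + delta) % N] = A[(src + i) % N];
    }
    return 1;
  }
  return 0;                             /* N - K < delta < K: two overlaps */
}
```

The remaining situation, N - K < delta < K, can only occur when K > N/2, which matches the question's remark that small ranges are the easy case.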

    Case 2: Either the source overlaps with the destination in two contiguous regions, or we rotate the whole array

    The easiest approach is to always rotate the whole array. This also moves some elements that do not need to move, but since K > N/2 in this case, it performs at most twice as many operations as necessary.

    To rotate the array, use the cycle leader algorithm: take the first element of the array (A[0]) and copy it to its destination position; the previous contents of that position are moved in turn to their proper position; continue until some element is moved to the starting position.

    Continue applying the cycle leader algorithm for the next starting positions: A[1], A[2], …, A[GCD(N,d) - 1], where d is the distance between source and destination.

    After GCD(N,d) steps, all elements are in their proper positions. This works because:

    1. Positions 0, 1, …, GCD(N,d) - 1 belong to different cycles, because these numbers are all different modulo GCD(N,d) and every step of a cycle adds d, which leaves the position unchanged modulo GCD(N,d).
    2. Each cycle has length N / GCD(N,d), because d / GCD(N,d) and N / GCD(N,d) are relatively prime.
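The rotation described above can be sketched as follows. This is a minimal illustration under the assumption that "distance d" means every element A[i] ends up at A[(i + d) % N]; the names `rotate_cycles` and `gcd_of` are made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Greatest common divisor by Euclid's algorithm. */
static size_t gcd_of(size_t a, size_t b)
{
  while (b != 0) {
    size_t t = a % b;
    a = b;
    b = t;
  }
  return a;
}

/* Rotates the whole array so that the old A[i] ends up at A[(i + d) % N].
   Uses one temporary element and moves every element exactly once. */
void rotate_cycles(int *A, size_t N, size_t d)
{
  size_t cycles, start;
  assert(A != NULL || N == 0);
  if (N == 0) {
    return;
  }
  d %= N;
  if (d == 0) {
    return;
  }
  cycles = gcd_of(N, d);               /* number of independent cycles */
  for (start = 0; start < cycles; ++start) {
    int carried = A[start];            /* element currently "in flight" */
    size_t pos = start;
    do {
      int displaced;
      pos = (pos + d) % N;             /* where the carried element belongs */
      displaced = A[pos];
      A[pos] = carried;
      carried = displaced;             /* the displaced element moves next */
    } while (pos != start);            /* cycle closes at its leader */
  }
}
```

For example, with N = 6 and d = 2 there are GCD(6,2) = 2 cycles, 0 → 2 → 4 → 0 and 1 → 3 → 5 → 1, each of length 3.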

    This algorithm is simple and moves each element exactly once. It can be made thread-safe (if we skip the write step unless the position is inside the destination range). Another multithreading-related advantage is that each element can only ever hold two values: the value before the “move” and the value after it (no temporary in-between values are possible).

    But it does not always have optimal performance. If element_size * GCD(N,d) is comparable to the cache line size, we can take all GCD(N,d) starting positions and process them together. If this value is too large, we can split the starting positions into several contiguous segments to bring the space requirement back down to O(1).
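A rough sketch of processing several starting positions together, assuming int elements and an arbitrarily chosen chunk of 16 elements (the constant `BLOCK` and the function names are assumptions, and the chunk size would need tuning for a real cache line). Because all positions of one cycle are congruent modulo GCD(N,d), a chunk of at most GCD(N,d) consecutive leaders never wraps around the end of the array and never collides with itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { BLOCK = 16 };   /* assumed chunk size; keeps extra memory constant */

static size_t gcd_block(size_t a, size_t b)
{
  while (b != 0) {
    size_t t = a % b;
    a = b;
    b = t;
  }
  return a;
}

/* Same result as a one-element cycle leader rotation (old A[i] ends up at
   A[(i + d) % N]), but carries up to BLOCK adjacent elements per step. */
void rotate_cycles_blocked(int *A, size_t N, size_t d)
{
  int carried[BLOCK], displaced[BLOCK];
  size_t g, start;
  assert(A != NULL || N == 0);
  if (N == 0) {
    return;
  }
  d %= N;
  if (d == 0) {
    return;
  }
  g = gcd_block(N, d);
  for (start = 0; start < g; start += BLOCK) {
    size_t b = (g - start < BLOCK) ? g - start : BLOCK;  /* chunk length */
    size_t pos = start;
    memcpy(carried, A + start, b * sizeof *A);
    do {
      pos = (pos + d) % N;
      memcpy(displaced, A + pos, b * sizeof *A);  /* save what is there */
      memcpy(A + pos, carried, b * sizeof *A);    /* drop the carried chunk */
      memcpy(carried, displaced, b * sizeof *A);  /* carry the saved chunk */
    } while (pos != start);
  }
}
```

The trade-off is two small fixed-size buffers instead of a single temporary element.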

    The problem is when element_size * GCD(N,d) is much smaller than the cache line size. In this case we get a lot of cache misses and performance degrades. gusbro’s idea of temporarily swapping array pieces with some “swap” region (of size d) suggests a more efficient algorithm for this case. It can be optimized further if we use a “swap” region that fits in the cache and copy the non-overlapping areas with memcpy.


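    One more algorithm. It never touches elements outside the source and destination ranges, and it is cache-friendly. The only disadvantage is that it moves each element up to twice.

    The idea is to move two pointers in opposite directions and swap the elements they point to. Overlapping regions are not a problem, because they are simply reversed in place; the one thing to watch is that a pair of positions lying entirely inside the overlap is reached twice and must be swapped only on the first visit. After the first pass, all source elements are in the destination range, but in reversed order. So the second pass reverses the destination range (K is the length of the range, and swap(s, d) exchanges A[s] and A[d]):

    /* Pass 1: mirror the source onto the destination. */
    for (i = 0; i < K; ++i) {
      d = (dst_start + i) % N;
      s = (src_start + K - 1 - i) % N;
      j = (s + N - dst_start) % N;  /* index of s inside the destination */
      if (j >= K || i < j)          /* skip the second visit of a pair */
        swap(s, d);
    }

    /* Pass 2: reverse the destination range with K / 2 swaps. */
    for (i = 0; i < K / 2; ++i)
      swap((dst_start + i) % N, (dst_start + K - 1 - i) % N);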
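For testing, the two-pass idea can be wrapped into a self-contained function. This is a sketch under the assumptions src < N, dst < N and K <= N; the names `move_by_reversal` and `swap_ints` are made up here. It makes two details explicit: a pair of positions is swapped only the first time the mirroring pass reaches it, and the reversal pass stops in the middle of the range, since swapping all the way through would undo the work.

```c
#include <assert.h>
#include <stddef.h>

static void swap_ints(int *a, int *b)
{
  int t = *a;
  *a = *b;
  *b = t;
}

/* Moves K elements from A[src..] to A[dst..] (both wrapping modulo N)
   using only swaps and a constant amount of extra memory. */
void move_by_reversal(int *A, size_t N, size_t src, size_t dst, size_t K)
{
  size_t i;
  assert(N > 0 && src < N && dst < N && K <= N);
  /* Pass 1: destination slot i receives source element K - 1 - i. */
  for (i = 0; i < K; ++i) {
    size_t d = (dst + i) % N;
    size_t s = (src + K - 1 - i) % N;
    size_t j = (s + N - dst) % N;    /* index of s within the destination */
    if (j >= K || i < j) {           /* s is visited later as d only if j < K */
      swap_ints(&A[d], &A[s]);
    }
  }
  /* Pass 2: undo the reversal inside the destination range. */
  for (i = 0; i < K / 2; ++i) {
    swap_ints(&A[(dst + i) % N], &A[(dst + K - 1 - i) % N]);
  }
}
```

On the example from the question (N = 33, source starting at index 29, destination starting at index 9, K = 28), the ranges overlap in two regions and the destination ends up holding the 28 source elements in their original order.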
    