Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 6544403
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 25, 20262026-05-25T11:27:41+00:00 2026-05-25T11:27:41+00:00

I am reading the source code for glibc2.9 . Reading the source code for

  • 0

I am reading the source code for glibc2.9. Reading the source code for the strcpy function, the performance is not as good as I expect.

The following is the source code of strcpy in glibc2.9:

   char * strcpy (char *dest, const char* src)
    {
        reg_char c;
        char *__unbounded s = (char *__unbounded) CHECK_BOUNDS_LOW (src);
        const ptrdiff_t off = CHECK_BOUNDS_LOW (dest) - s - 1;
        size_t n;

        do {
            c = *s++;
            s[off] = c;
        }
        while (c != '\0');

        n = s - src;
        (void) CHECK_BOUNDS_HIGH (src + n);
        (void) CHECK_BOUNDS_HIGH (dest + n);

        return dest;
    }

Because I don’t know the reason for using the offset, I did some performance tests by comparing the above code with the following code:

char* my_strcpy(char *dest, const char *src)
{
    char *d = dest;
    register char c;

    do {
        c = *src++;
        *d++ = c;
    } while ('\0' != c);

    return dest;
}

As a result, the performance of strcpy is worse during my tests. I have removed the codes about bound pointer.

Why does the glibc version use the offsets??

The following is the introduction about the tests.

  1. platform: x86(Intel(R) Pentium(R) 4), gcc version 4.4.2
  2. compile flag: No flags, because I don’t want any optimisation; The command is gcc test.c.

The test code I used is the following:

#include <stdio.h>
#include <stdlib.h>

char* my_strcpy1(char *dest, const char *src)
{
    char *d = dest;
    register char c;

    do {
        c = *src++;
        *d++ = c;
    } while ('\0' != c);

    return dest;
}

/* Copy SRC to DEST. */
char *
my_strcpy2 (dest, src)
     char *dest;
     const char *src;
{
  register char c;
  char * s = (char *)src;
  const int off = dest - s - 1;

  do
    {
      c = *s++;
      s[off] = c;
    }
  while (c != '\0');

  return dest;
}

int main()
{
    const char str1[] = "test1";
    const char str2[] = "test2";
    char buf[100];

    int i;
    for (i = 0; i < 10000000; ++i) {
        my_strcpy1(buf, str1);
        my_strcpy1(buf, str2);
    }

    return 0;
}

When using the my_strcpy1 function, the outputs are:

[root@Lnx99 test]#time ./a.out

real    0m0.519s
user    0m0.517s
sys     0m0.001s
[root@Lnx99 test]#time ./a.out

real    0m0.520s
user    0m0.520s
sys     0m0.001s
[root@Lnx99 test]#time ./a.out

real    0m0.519s
user    0m0.516s
sys     0m0.002s

When useing my_strcpy2, the output is:

[root@Lnx99 test]#time ./a.out

real    0m0.647s
user    0m0.647s
sys     0m0.000s
[root@Lnx99 test]#time ./a.out

real    0m0.642s
user    0m0.638s
sys     0m0.001s
[root@Lnx99 test]#time ./a.out

real    0m0.639s
user    0m0.638s
sys     0m0.002s

I know it is not very accurate with the command time. But I could get the answer from the user time.

Update:

To remove the cost used to calculate the offset, I removed some code and added a global variable.

#include <stdio.h>
#include <stdlib.h>

char* my_strcpy1(char *dest, const char *src)
{
    char *d = dest;
    register char c;

    do {
        c = *src++;
        *d++ = c;
    } while ('\0' != c);

    return dest;
}


int off;

/* Copy SRC to DEST. */
char *
my_strcpy2 (dest, src)
     char *dest;
     const char *src;
{
  register char c;
  char * s = (char *)src;

  do
    {
      c = *s++;
      s[off] = c;
    }
  while (c != '\0');

  return dest;
}

int main()
{
    const char str1[] = "test1test1test1test1test1test1test1test1";
    char buf[100];

    off = buf-str1-1;

    int i;
    for (i = 0; i < 10000000; ++i) {
        my_strcpy2(buf, str1);
    }

    return 0;
}

But the performance of my_strcpy2 is still worse than my_strcpy1. Then I checked the assembled code but failed to get the answer too.

I also enlarged the size of string and the performance of my_strcpy1 is still better than my_strcpy2

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-25T11:27:41+00:00Added an answer on May 25, 2026 at 11:27 am

    Based on what I’m seeing, I’m not at all surprised that your code is faster.

    Look at the loop, both your loop and glibc’s loop are virtually identical. But glibc’s has a extra code before and after…

    In general, simple offsets do not slow down performance because x86 allows a fairly complicated indirect-addressing scheme. So both loops here will probably run at identical speeds.

    EDIT: Here’s my update with the added info you gave.

    Your string size is only 5 characters. Even though the offset method “may” be slightly faster in the long run, the fact that it needs several operations to compute the offset before starting the loop is slowing it down for short strings. Perhaps if you tried larger strings the gap will narrow and possibly vanish altogether.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

I'm reading KVM source code and confronted with x86_64 inline assembly. In the following
I'm reading the source code of an open-source project and I've encountered the following
Whilst reading through the DirectWrite source code I came across the following struct: ///
I'm reading source code of nginx and find it's not initializing many of the
Reading source code of sample projects, such as Beast and Bort, are recommended as
When reading source code, I always want to know the full path of the
When I reading source code of Beast, I found a lot of code like
When reading some FreeBSD source code (See: radix.h lines 158-173), I found variable declarations
I am a beginner in C. While reading git's source code, I found this
I'm reading this C++ open source code and I came to a constructor but

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.