intrinsic memcmp

6

3

According to the gcc docs, memcmp is not an intrinsic function of GCC. If you wanted to speed up glibc's memcmp under gcc, you would need to use the lower-level intrinsics defined in the docs. However, searching around the internet, it seems that many people are under the impression that memcmp is a builtin function. Is it a builtin for some compilers and not for others?

Justin

Posted 2009-05-13T03:18:12.847

Reputation: 107

Check out memcmp from glibc too. The relationship between GCC and glibc is quite complicated: both provide their own versions of the same functions, and they sometimes fight it out in header files over which definition actually gets used in user programs. – Laurynas Biveinis – 2009-05-13T19:31:50.717

Answers

5

Your link appears to be for the x86 architecture-specific built-in functions. According to this, memcmp is implemented as an architecture-independent built-in by gcc.
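As a rough illustration of the distinction (a sketch, not taken from the linked docs): GCC and Clang accept both the ordinary memcmp call and the explicitly prefixed __builtin_memcmp. The function names cmp_plain and cmp_builtin below are made up for this example. Whether either call is expanded inline or emitted as a library call depends on the compiler version, target and optimization level.

```c
#include <string.h>

struct foo {
    int a;
    int b;
};

/* Ordinary call: the compiler is free to treat it as a built-in
   and expand it inline, or to emit a call into the C library. */
int cmp_plain(const struct foo *x, const struct foo *y)
{
    return memcmp(x, y, sizeof *x);
}

/* Explicitly prefixed form (a GCC/Clang extension): it has the same
   semantics as memcmp, and may still end up as a library call. */
int cmp_builtin(const struct foo *x, const struct foo *y)
{
    return __builtin_memcmp(x, y, sizeof *x);
}
```

Comparing the generated assembly with and without GCC's -fno-builtin-memcmp option is one way to see whether the built-in expansion is being used for a given call.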

Edit:

Compiling the following code with Cygwin gcc version 3.3.1 for i686, -O2:

#include <string.h>  /* memcmp is declared in <string.h>, not <stdlib.h> */

struct foo {
    int a;
    int b;
} ;

int func(struct foo *x, struct foo *y)
{
    return memcmp(x, y, sizeof (struct foo));
}

Produces the following output (note that the call to memcmp() is converted to an 8-byte "repz cmpsb"):

   0:   55                      push   %ebp
   1:   b9 08 00 00 00          mov    $0x8,%ecx
   6:   89 e5                   mov    %esp,%ebp
   8:   fc                      cld    
   9:   83 ec 08                sub    $0x8,%esp
   c:   89 34 24                mov    %esi,(%esp)
   f:   8b 75 08                mov    0x8(%ebp),%esi
  12:   89 7c 24 04             mov    %edi,0x4(%esp)
  16:   8b 7d 0c                mov    0xc(%ebp),%edi
  19:   f3 a6                   repz cmpsb %es:(%edi),%ds:(%esi)
  1b:   0f 92 c0                setb   %al
  1e:   8b 34 24                mov    (%esp),%esi
  21:   8b 7c 24 04             mov    0x4(%esp),%edi
  25:   0f 97 c2                seta   %dl
  28:   89 ec                   mov    %ebp,%esp
  2a:   5d                      pop    %ebp
  2b:   28 c2                   sub    %al,%dl
  2d:   0f be c2                movsbl %dl,%eax
  30:   c3                      ret    
  31:   90                      nop    

Lance Richardson

Posted 2009-05-13T03:18:12.847

Reputation: 4 099

yes, they are. There is a specific SIMD intrinsic set for doing that (in SSE4.2 if I recall correctly). – None – 2010-05-10T18:33:40.880

Why not a 2-word repz cmpsl? Or better yet, simply if (x->a == y->a && x->b == y->b)? gcc sucks... – R.. – 2010-08-13T07:22:50.697

repz cmpsl won't give you the right answer for memcmp. – msandiford – 2010-08-24T23:29:51.117
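A small sketch of why that is, assuming a little-endian target such as x86: memcmp orders buffers by their first differing byte, while a word-sized compare orders them by the numeric value of the whole word, whose most significant byte is the last one in memory. The helper names cmp_as_word and cmp_as_bytes are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Compares 4 bytes as one native 32-bit word, roughly what a single
   word-sized compare step does. Returns -1, 0 or 1. */
int cmp_as_word(const void *a, const void *b)
{
    uint32_t wa, wb;
    memcpy(&wa, a, sizeof wa);   /* memcpy avoids alignment and aliasing issues */
    memcpy(&wb, b, sizeof wb);
    return (wa > wb) - (wa < wb);
}

/* Byte-wise reference with memcmp semantics, normalized to -1, 0 or 1. */
int cmp_as_bytes(const void *a, const void *b)
{
    int r = memcmp(a, b, 4);
    return (r > 0) - (r < 0);
}
```

For the byte sequences 01 00 00 02 and 02 00 00 01, cmp_as_bytes reports the first as smaller, while on a little-endian machine cmp_as_word reports it as larger. The two agree when only equality is being tested.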

Interesting. Any idea if these are optimized in architecture specific ways? – Justin – 2009-05-13T03:51:49.940

8

Note that the repz cmpsb routine might not be faster than glibc's memcmp. In my tests, in fact, it's never faster, even when comparing just a few bytes.

See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052

Justin L.

Posted 2009-05-13T03:18:12.847

Reputation: 2 546

Indeed, with GCC 4.8 at -O3, memcmp compiles to a jmp memcmp rather than inline assembly. Can anyone summarize how glibc can be faster? Something to do with alignment? – Ciro Santilli 新疆改造中心 六四事件 法轮功 – 2015-04-24T20:35:04.590

1

@CiroSantilli六四事件法轮功包卓轩 A typical "fast memcmp" compares byte-for-byte until an alignment boundary is reached, then compares word-for-word (typically the native word size, 64-bit on x86-64); if fewer than a word's worth of bytes remain, they are compared byte-for-byte until finished. Special-case code for quickly checking small sizes is also common. See glibc source: https://sourceware.org/git/?p=glibc.git;a=blob;f=string/memcmp.c

– Jody Lee Bruchon – 2016-01-03T05:48:13.913
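The three phases described in that comment can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not glibc's actual code: the name sketch_memcmp is made up, only the first pointer is aligned, and a mismatching word is resolved by falling back to the byte loop instead of the bit tricks a tuned implementation would use.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

int sketch_memcmp(const void *s1, const void *s2, size_t n)
{
    const unsigned char *p = s1;
    const unsigned char *q = s2;

    /* Phase 1: byte-by-byte until p reaches a word boundary. */
    while (n > 0 && ((uintptr_t)p % sizeof(uintptr_t)) != 0) {
        if (*p != *q)
            return *p < *q ? -1 : 1;
        p++; q++; n--;
    }

    /* Phase 2: word-by-word for as long as whole words are equal.
       memcpy keeps the loads well-defined even if q is unaligned. */
    while (n >= sizeof(uintptr_t)) {
        uintptr_t a, b;
        memcpy(&a, p, sizeof a);
        memcpy(&b, q, sizeof b);
        if (a != b)
            break;  /* let the byte loop find the first differing byte */
        p += sizeof a; q += sizeof b; n -= sizeof a;
    }

    /* Phase 3: remaining tail bytes, or the bytes of a mismatching word. */
    while (n > 0) {
        if (*p != *q)
            return *p < *q ? -1 : 1;
        p++; q++; n--;
    }
    return 0;
}
```

Resolving a mismatch byte-by-byte keeps the result independent of endianness, at the cost of a few extra iterations on the final word.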

+1 for a great link! I was just searching for an answer myself as to why the libc memcmp() performed orders of magnitude faster than a simple repz cmpsb. I guessed it had something to do with alignment, now I know :) – Unsigned – 2011-09-06T22:34:42.357

0

Now in 2017, GCC and Clang seem to have some optimizations for buffers of sizes 1, 2, 4 and 8, and for some others, for example 3, 5 and multiples of 8.
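As an illustrative sketch (not part of the original answer), these are the kinds of calls such optimizations target: the size is a compile-time constant and only equality is tested. The function names equal4 and equal8 are hypothetical, and whether the call is actually replaced by one or two integer compares depends on the compiler version and flags, so inspect the generated assembly to confirm.

```c
#include <string.h>

/* Equality test on a fixed 4-byte buffer. With the size known at compile
   time, an optimizing compiler may reduce this to one 32-bit compare. */
int equal4(const void *a, const void *b)
{
    return memcmp(a, b, 4) == 0;
}

/* The same idea for 8 bytes, which may become one 64-bit compare
   on a 64-bit target. */
int equal8(const void *a, const void *b)
{
    return memcmp(a, b, 8) == 0;
}
```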

Nick

Posted 2009-05-13T03:18:12.847

Reputation: 4 332