Using memset on a memory location that is 8-byte aligned inefficiently writes to memory only 8 bits at a time


    • Type: Enhancement
    • Resolution: Won't Implement
    • Priority: Not Prioritized

      The attached files form a simple program that calls memset ...

      void fxn(long long *ptr, int length)
      {
         memset(ptr, 0, length);
      }
      

      Build it, then produce the disassembly with the C code interlisted ...

      % tiarmclang @options.txt -Oz  main.c file.c -Wl,-c -o main.out -Wl,-m=main.map
      
      % tiarmobjdump -dS main.out > main_dis.txt
      tiarmobjdump.exe: warning: 'main.out': failed to find source e:\cvs\jenkins\workspace\buildandvalidate_worker\llvm_cgt\arm-llvm\release\libc\src\boot_cortex_m.c
      tiarmobjdump.exe: warning: 'main.out': failed to find source E:/cvs/jenkins/workspace/BuildAndValidate_Worker/llvm_cgt/llvm-project/compiler-rt/lib/builtins/arm\aeabi_memset.S
      tiarmobjdump.exe: warning: 'main.out': failed to find source e:\cvs\jenkins\workspace\buildandvalidate_worker\llvm_cgt\arm-llvm\release\libc\src\pre_init.c
      tiarmobjdump.exe: warning: 'main.out': failed to find source e:\cvs\jenkins\workspace\buildandvalidate_worker\llvm_cgt\arm-llvm\release\libc\src\exit.c
      

      Inspect the disassembly to see that the loop that sets memory to all zeros is ...

      00000062 <_loop>:
            62: 9a 42        	cmp	r2, r3
            64: 08 bf        	it	eq
            66: 70 47        	bxeq	lr
            68: c1 54        	strb	r1, [r0, r3]
            6a: 01 33        	adds	r3, #1
            6c: f9 e7        	b	0x62 <_loop>            @ imm = #-14
      

      This loop stores only one byte (8 bits) per iteration, using strb, even though the destination is 8-byte aligned.
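
For readers less familiar with Thumb assembly, the `_loop` listing can be read as the C below. This is only an interpretation of the disassembly, not the library's source: the function name `memset_bytewise` is hypothetical, and the register roles (r0 = destination, r1 = fill value, r2 = length, r3 = index) are inferred from the instructions shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical name. A C reading of the _loop listing, assuming
 * r0 = dst, r1 = value, r2 = n, r3 = i. Not the library's source. */
void *memset_bytewise(void *dst, int value, size_t n)
{
    assert(dst != NULL || n == 0);

    unsigned char *p = dst;
    for (size_t i = 0; i != n; ++i) {    /* cmp r2, r3 ; bxeq lr  */
        p[i] = (unsigned char)value;     /* strb r1, [r0, r3]     */
    }                                    /* adds r3, #1 ; b _loop */
    return dst;
}
```

Each iteration performs one compare, one byte store, one increment, and one branch, so the loop overhead is paid once for every byte written.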

      See the related forum thread for an alternate implementation of memset that writes 64 bits at a time. The customer's code uses memset to initialize 64 MB of memory. That takes about 0.5 seconds with the faster memset, versus about 14 seconds with the slower one, a difference of roughly 28x.
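
As a rough illustration of what a memset that writes 64 bits at a time can look like, here is a minimal sketch. It is not the implementation from the forum thread, which is not reproduced in this issue; the name `memset_wide` is hypothetical. It assumes the target accepts aligned `uint64_t` stores to the destination memory, and the usual strict-aliasing caveats apply when the buffer was declared with an unrelated type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name. Illustrative sketch only, not the forum thread's code. */
void *memset_wide(void *dst, int value, size_t n)
{
    assert(dst != NULL || n == 0);

    unsigned char *p = dst;
    unsigned char byte = (unsigned char)value;

    /* Head: byte stores until p reaches an 8-byte boundary. */
    while (n > 0 && ((uintptr_t)p & 7u) != 0) {
        *p++ = byte;
        n--;
    }

    /* Body: replicate the byte into all 8 lanes, then store 64 bits
     * per iteration (assumes aligned uint64_t stores are allowed). */
    uint64_t pattern = (uint64_t)byte * UINT64_C(0x0101010101010101);
    uint64_t *w = (uint64_t *)p;
    while (n >= 8) {
        *w++ = pattern;
        n -= 8;
    }

    /* Tail: the remaining 0 to 7 bytes. */
    p = (unsigned char *)w;
    while (n > 0) {
        *p++ = byte;
        n--;
    }
    return dst;
}
```

For an 8-byte-aligned pointer such as the `long long *ptr` in `fxn`, the head loop does nothing and almost all of the work happens in the 64-bit body, which cuts the number of loop iterations by a factor of eight compared with a byte loop.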

            Assignee:
            TI User
            Reporter:
            TI User
