Using --opt_level=0 with intrinsics like __complex_multiply generates incorrect code

    • Type: Bug
    • Resolution: Fixed
    • Priority: Medium
    • Code Generation Tools
    • CODEGEN-15223
    • C7000_2.1.0.LTS
      C7000_5.0.0.LTS
      C7000_3.1.0.LTS
      C7000_4.1.0.LTS
    • C7000_6.x.0.LTS*
      C7000_4.1.3.LTS*
      C7000_5.0.2.LTS*
    • default
    • Workaround 1:
      This bug occurs only at --opt_level=0, so the simplest workaround is to raise the --opt_level.

      Workaround 2:
      Separate the compound assignment and the post-increments into two statements.

      Instead of:
      *pyc2++ += __complex_multiply(*pxc2++, *pwc2++);

      Use:
      *pyc2 += __complex_multiply(*pxc2, *pwc2);
      pyc2++; pxc2++; pwc2++;

      Workaround 3:
      Expand the compound assignment explicitly into a separate + and =, and move the post-increments into their own statement.

      Instead of:
      *pyc2++ += __complex_multiply(*pxc2++, *pwc2++);

      Use:
      *pyc2 = *pyc2 + __complex_multiply(*pxc2, *pwc2);
      pyc2++; pxc2++; pwc2++;
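The three source forms discussed in the workarounds are meant to be equivalent in C. The following is a minimal host-side sketch of that equivalence, not the attached test case. It assumes C99 `float complex` in place of the C7000 vector types, and `cmul` is a hypothetical stand-in for `__complex_multiply`; the function names are illustrative only.

```c
#include <complex.h>

/* Hypothetical stand-in for __complex_multiply on one complex float. */
static float complex cmul(float complex a, float complex b)
{
    return a * b;
}

/* Original form: compound assignment and post-increments in one statement. */
static void accumulate_compound(float complex *y, const float complex *x,
                                const float complex *w, int n)
{
    for (int k = 0; k < n; k++) {
        *y++ += cmul(*x++, *w++);
    }
}

/* Workaround 2 form: compound assignment first, increments afterwards. */
static void accumulate_split(float complex *y, const float complex *x,
                             const float complex *w, int n)
{
    for (int k = 0; k < n; k++) {
        *y += cmul(*x, *w);
        y++; x++; w++;
    }
}

/* Workaround 3 form: explicit + and =, increments afterwards. */
static void accumulate_expanded(float complex *y, const float complex *x,
                                const float complex *w, int n)
{
    for (int k = 0; k < n; k++) {
        *y = *y + cmul(*x, *w);
        y++; x++; w++;
    }
}
```

On a correct compiler all three functions leave identical values in `y`, which is why either rewritten form can replace the original statement without changing the intended result.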

      The attached test case contains these lines:

          __cfloat4  *restrict pxc4 = (__cfloat4*)inFloat;
          __cfloat4  *restrict pyc4 = (__cfloat4*)outFloat;
          __cfloat4  *restrict pwc4 = (__cfloat4*)inwFloat;
          for (k = 0; k < MSIZE; k+=4)
          {
              *pyc4++ += __complex_multiply(*pxc4++, *pwc4++);
          }
      

      When built with --opt_level=off, the first 4 results are: 1.0, 3.0, 1.0, 3.0. When built with --opt_level=0, the first 4 results are incorrect: 1.0, 1.0, 1.0, 1.0.
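One way to catch this kind of miscompile is to compare the loop's output against a plain scalar reference. The sketch below is an assumption-laden illustration rather than part of the attached test case: it assumes the buffers hold complex values as interleaved (real, imaginary) float pairs, and the helper names `ref_complex_mac` and `first_mismatch` are hypothetical.

```c
#include <stddef.h>

/* Scalar reference: out[k] += x[k] * w[k] for n complex values.
 * Assumes interleaved (real, imag) float pairs; this layout is an assumption. */
static void ref_complex_mac(float *out, const float *x, const float *w, size_t n)
{
    for (size_t k = 0; k < n; k++) {
        float xr = x[2 * k], xi = x[2 * k + 1];
        float wr = w[2 * k], wi = w[2 * k + 1];
        out[2 * k]     += xr * wr - xi * wi;  /* real part */
        out[2 * k + 1] += xr * wi + xi * wr;  /* imaginary part */
    }
}

/* Returns the index of the first float that differs from the reference by
 * more than tol, or -1 if all nfloats values agree within tol. */
static long first_mismatch(const float *got, const float *want,
                           size_t nfloats, float tol)
{
    for (size_t i = 0; i < nfloats; i++) {
        float d = got[i] - want[i];
        if (d < 0.0f) {
            d = -d;
        }
        if (d > tol) {
            return (long)i;
        }
    }
    return -1;
}
```

Running the reference on copies of the same inputs and calling `first_mismatch` on the two output buffers would flag a build whose results diverge, such as the --opt_level=0 output reported above.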

            Assignee:
            TI User
            Reporter:
            TI User