Type: Bug
Resolution: Fixed
Priority: Medium
Code Generation Tools
CODEGEN-15061
default
For the instruction sequence below, the C28 compiler fails to add the extra delay
slot required for the instruction 'MOV32 ACC,R0H' when that instruction is placed
at the start of a function. The failure occurs only when the calling function ends
with a 5p instruction (e.g. DIVF32) whose 4 required delay slot cycles are exactly
consumed by the subsequent LCR call.
DIVF32 R0H,R1H,R0H   ; 5p TMU instr requires 4 delay slot cycles
LCR __c28xabi_ftoll  ; 4 cycle branch
(the LCR above branches to the MOV32 below)
MOV32 @ACC, R0H      ; FPU to CPU requires inserting a prior NOP
Per TRM spruhs1c Section 1.4.2, a MOV32 from an FPU register to a C28x register,
such as the one above, requires an extra delay slot before the MOV32 instruction.
The issue is avoided by inserting a NOP (or another legal instruction)
before the LCR instruction or before the MOV32 instruction.
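The cycle accounting behind this workaround can be sketched in C. This is only a simplified additive model built from the counts quoted in this report (4 delay slots for a 5p TMU instruction, a 4 cycle LCR, 1 extra slot for the FPU to CPU MOV32). The constant and function names are hypothetical, and the model is not a description of the real pipeline.

```c
/* Cycle counts as quoted in this report; the names are illustrative only. */
enum {
    DELAY_SLOTS_5P_TMU          = 4, /* e.g. DIVF32 */
    LCR_BRANCH_CYCLES           = 4, /* cycles supplied by the LCR call */
    EXTRA_SLOT_FPU_TO_CPU_MOV32 = 1  /* extra slot before MOV32 ACC,R0H */
};

/*
 * Returns how many delay cycles are still missing between the 5p
 * instruction and the MOV32 at the start of the callee, given the
 * number of single-cycle filler instructions (such as NOP) placed
 * before the LCR or before the MOV32. A result of 0 means the
 * sequence is safe under this simplified model.
 */
int missing_delay_cycles(int filler_instructions)
{
    int required = DELAY_SLOTS_5P_TMU + EXTRA_SLOT_FPU_TO_CPU_MOV32;
    int provided = LCR_BRANCH_CYCLES + filler_instructions;
    return required > provided ? required - provided : 0;
}
```

Under these assumptions the sequence generated by the compiler is one cycle short, and a single NOP closes the gap.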
This issue can affect any of the following 5p TMU instructions (each requires 4
delay slots) when the instruction is the last one before an LCR (or LC) call to a
function that starts with a MOV32 from an FPU register to a C28x register:
DIVF32
SQRTF32
QUADF32
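To look for the caller side of this pattern in an existing assembly listing, a small scanner along the following lines may help. It is a minimal sketch: the function names are hypothetical, it treats the first token of each line as the mnemonic (so labels and parallel instructions are not handled), and it does not check whether the called function actually starts with a MOV32 from an FPU register, which still has to be confirmed in the callee's disassembly.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Copy the first whitespace-delimited token of line into out, upper-cased. */
static void first_token_upper(const char *line, char *out, size_t out_size)
{
    size_t n = 0;
    while (*line != '\0' && isspace((unsigned char)*line)) {
        line++;
    }
    while (*line != '\0' && !isspace((unsigned char)*line) && n + 1 < out_size) {
        out[n++] = (char)toupper((unsigned char)*line);
        line++;
    }
    out[n] = '\0';
}

/* The three 5p TMU instructions named in this report. */
static int is_5p_tmu(const char *mnemonic)
{
    return strcmp(mnemonic, "DIVF32") == 0 ||
           strcmp(mnemonic, "SQRTF32") == 0 ||
           strcmp(mnemonic, "QUADF32") == 0;
}

static int is_call(const char *mnemonic)
{
    return strcmp(mnemonic, "LCR") == 0 || strcmp(mnemonic, "LC") == 0;
}

/*
 * Returns the index of the first LCR/LC line that directly follows a
 * 5p TMU instruction, or -1 if no such pair exists. Blank lines and
 * lines starting with ';' are skipped.
 */
int find_5p_before_call(const char *const lines[], size_t count)
{
    char prev[16] = "";
    char cur[16];

    for (size_t i = 0; i < count; i++) {
        first_token_upper(lines[i], cur, sizeof cur);
        if (cur[0] == '\0' || cur[0] == ';') {
            continue;
        }
        if (is_call(cur) && is_5p_tmu(prev)) {
            return (int)i;
        }
        strcpy(prev, cur);
    }
    return -1;
}
```

A hit only marks a candidate call site. The sequence is hazardous only if the callee begins with a MOV32 from an FPU register to a C28x register.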
Details of original test case:
These commands show a source file and build it.
C:\examples>type file.c
#include <stdint.h>
int64_t fxn()
{
    float x = -10.0;
    return x / 0.1f;
}
C:\examples>cl2000 -v28 -ml -mt --cla_support=cla2 --float_support=fpu32 --tmu_support=tmu1 --vcu_support=vcrc -Ooff --fp_mode=relaxed --fp_reassoc=on --abi=eabi --src_interlist file.c
Here are the 2 key lines from the compiler-generated file.asm.
DIVF32 R0H,R1H,R0H ; [CPU_FPU] |6|
LCR #||__c28xabi_ftoll|| ; [CPU_ALU] |6|
__c28xabi_ftoll is a function from the compiler RTS library rts2800_fpu32_eabi.lib. Here is the beginning of the disassembly of this function.
00000000 __c28xabi_ftoll:
00000000 bfa9 MOV32 ACC, R0H
MOV32 reads R0H, the same register written by DIVF32. Not enough cycles have executed before the read of R0H.