- Type: Bug
- Resolution: Fixed
- Priority: Medium
- Linux Core SDK
- LCPD-45488
- 11.01
- 11.02
- am64xx-hsevm
PRP not working as per ABB.
Verify the PRP functionality
PRP is not working; a patch needs to be sent to make it work.
Full Analysis
==========
Hi Parvathi and CouthIT team,
I was doing some more testing on PRP and encountered an issue. The issue seems to have been introduced by your HSR/PRP series, specifically the commit https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/commit/net/hsr?h=ti-linux-6.12.y&id=aa070479b732b07ca2671cceb4cd243e5b80ebb4
| Line (before) | Before | Line (after) | After |
|---|---|---|---|
| 683 | if (!hsr_xmit(skb, port, frame)) | 691 | if (skip_tx_duplicate != 0) { |
| | | 692 | skip_tx_duplicate = hsr_xmit(skb, port, frame); |
| 684 | if (port->type == HSR_PT_SLAVE_A \|\| | 693 | if (port->type == HSR_PT_SLAVE_A \|\| |
| 685 | port->type == HSR_PT_SLAVE_B) | 694 | port->type == HSR_PT_SLAVE_B) |
| 686 | sent = true; | 695 | sent = true; |
The diff above seems to be the issue. Earlier, we checked the return value of hsr_xmit(), and if it failed, sent was not set to true. The variable sent was set to true only if hsr_xmit() returned 0, i.e. succeeded.
With your patch, the return value of hsr_xmit() is stored in the skip_tx_duplicate variable. However, this variable is not checked before setting sent = true, so sent gets set to true even if hsr_xmit() fails.
In hsr_forward_do():

    /* If hardware duplicate generation is enabled, only send out
     * one port.
     */
    if ((port->dev->features & NETIF_F_HW_HSR_DUP) && sent)
        continue;
This code checks whether duplication offload is enabled and sent is true; if both hold, the stack does not send the frame on the second port. It runs once for each slave port of the HSR interface. It first runs for SLAVE_A: duplication offload is enabled, but sent is not yet set, so execution continues and hsr_xmit() is called. Now assume hsr_xmit() fails for some reason; your code still sets sent = true. When the same code then runs for SLAVE_B, sent is true and duplication offload is enabled, so the port is skipped. The stack assumes the packet has already been sent on one port and does not try the other. In reality the first TX failed, so the packet was never transmitted, and the second TX is never attempted. As a result, ping breaks.
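To make the sequence above concrete, here is a minimal user-space sketch of the failure mode. It is not kernel code: struct model_port, model_xmit() and forward_unchecked() are hypothetical names invented for illustration, the hw_dup flag only stands in for NETIF_F_HW_HSR_DUP, and the loop is a simplification of what hsr_forward_do() does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified user-space model; all names here are hypothetical. */
enum port_type { PT_SLAVE_A, PT_SLAVE_B };

struct model_port {
    enum port_type type;
    bool link_up;   /* false models an unplugged cable */
    bool hw_dup;    /* stands in for NETIF_F_HW_HSR_DUP being set */
};

/* Stand-in for hsr_xmit(): returns 0 on success, non-zero on failure. */
static int model_xmit(const struct model_port *port)
{
    return port->link_up ? 0 : -1;
}

/*
 * Forwarding loop with the regression modelled: the xmit result is
 * stored but never consulted before the frame is marked as sent.
 * Returns the number of ports that actually transmitted the frame.
 */
static int forward_unchecked(const struct model_port *ports, size_t n)
{
    bool sent = false;
    int transmitted = 0;

    assert(ports != NULL);

    for (size_t i = 0; i < n; i++) {
        const struct model_port *port = &ports[i];

        /* With duplication offload, only send out one port. */
        if (port->hw_dup && sent)
            continue;

        int rc = model_xmit(port);
        if (rc == 0)
            transmitted++;

        /* Modelled bug: sent is set even when rc != 0. */
        if (port->type == PT_SLAVE_A || port->type == PT_SLAVE_B)
            sent = true;
    }
    return transmitted;
}
```

In this model, calling forward_unchecked() with SLAVE_A down and SLAVE_B up transmits on no port at all, which matches the broken ping described above.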
This is fairly easy to reproduce: determine which port is SLAVE_A and unplug the cable from that port on the sender DUT. The current code only tries to send the packet on SLAVE_A.
The fix is to check the return value of hsr_xmit(), as was done before.
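The idea behind the fix can be sketched in a small user-space model. This is only an illustration under assumptions, not the merged patch: struct model_port, model_xmit() and forward_checked() are hypothetical names, hw_dup stands in for NETIF_F_HW_HSR_DUP, and the loop is a simplification of hsr_forward_do().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified user-space model; all names here are hypothetical. */
enum port_type { PT_SLAVE_A, PT_SLAVE_B };

struct model_port {
    enum port_type type;
    bool link_up;   /* false models an unplugged cable */
    bool hw_dup;    /* stands in for NETIF_F_HW_HSR_DUP being set */
};

/* Stand-in for hsr_xmit(): returns 0 on success, non-zero on failure. */
static int model_xmit(const struct model_port *port)
{
    return port->link_up ? 0 : -1;
}

/*
 * Forwarding loop that consults the xmit result: sent is set only
 * after a successful transmit, so a failure on one slave port still
 * lets the other slave port be tried.
 * Returns the number of ports that actually transmitted the frame.
 */
static int forward_checked(const struct model_port *ports, size_t n)
{
    bool sent = false;
    int transmitted = 0;

    assert(ports != NULL);

    for (size_t i = 0; i < n; i++) {
        const struct model_port *port = &ports[i];

        /* With duplication offload, only send out one port. */
        if (port->hw_dup && sent)
            continue;

        /* On failure, leave sent untouched and move on. */
        if (model_xmit(port) != 0)
            continue;

        transmitted++;
        if (port->type == PT_SLAVE_A || port->type == PT_SLAVE_B)
            sent = true;
    }
    return transmitted;
}
```

With SLAVE_A down and SLAVE_B up, this model falls through to SLAVE_B and transmits once. With both links up and offload enabled, it still sends on only one port, so the offload behaviour is preserved.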
This issue is not seen upstream, as your series is not part of upstream. It is only seen on the TI kernel, where the HSR/PRP series is merged. I will send this fix to the TI kernel, as this is breaking PRP functionality on AM64x.
Please look into this issue and check how this code was working for you. With the current code, if SLAVE_A fails to TX for any reason, the stack will not try sending the packet on SLAVE_B, and as a result the packet will not be transmitted at all.
–
Fix merged to tiL6.12 https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/commit/?h=ti-linux-6.12.y-cicd&id=52903366603b484179636ab0ef0ca4605b307286