[#EXT_EP-12253] If MC 'start_task' request to a spawner fails, MC believes the session started and blocks the user from starting train sessions

[EXT_EP-12253] If MC 'start_task' request to a spawner fails, MC believes the session started and blocks the user from starting train sessions Created: 21/Mar/25 Updated: 06/Aug/25 Resolved: 06/Aug/25
Status:	Fixed
Project:	Embedded Software & Tools
Component/s:	None
Affects Version/s:	None
Fix Version/s:	None

Type:	Bug
Priority:	High
Reporter:	TI User
Assignee:	TI User
Resolution:	Fixed
Votes:	
Remaining Estimate:	Not Specified
Time Spent:	Not Specified
Original Estimate:	Not Specified
Product:	Edge AI Studio
Internal ID:	EDGEST-1332
Found In Release:	MC_1.3.1
Fix In Release:	MC_1.4.0
Affected Platform/Device:	None

Description

I believe this is quite rare, but when it happens, the way to fix it is to restart the spawner.

The admin API /api/admin/get_spawner_info shows that the user has a train session, but 'docker ps' confirmed that no session is running.

After restarting MC the state was fixed.

Looking at the dinfra logs, it appears that for some reason the HTTP request from MC to the spawner failed with:

2025-03-18T19:04:34.873Z INFO DAEMON dev.ti.com/cluster1/dev-mcw2-1 default/modelcomposer 812337108 [

"[permId: 249772, projectId: fc7c5080, taskType: detection] /api/start_train: {\"errno\":-111,\"code\":\"ECONNREFUSED\",\"syscall\":\"connect\",\"address\":\"10.123.41.74\",\"port\":41087}"

]

The error was shown to the user, but the MC state was left broken.

I don't know why the connection was refused for the API call, but in cases like this the MC state should remain correct.
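One way to keep the state correct is to roll back the session record whenever the request to the spawner fails. The following is only a minimal sketch of that idea, not the actual MC code: `SessionRegistry`, `TrainSession`, `StartTaskRequest`, and `startTrain` are hypothetical names, and the spawner call is modeled as an injected function rather than a real HTTP request.

```typescript
// Hypothetical sketch: none of these names come from the MC code base.
type SessionState = "starting" | "running";

interface TrainSession {
  projectId: string;
  state: SessionState;
}

// Stand-in for the HTTP request MC sends to the spawner ('start_task').
// It is assumed to reject on any failure, e.g. ECONNREFUSED.
type StartTaskRequest = (projectId: string) => Promise<void>;

class SessionRegistry {
  private sessions = new Map<string, TrainSession>();

  hasSession(userId: string): boolean {
    return this.sessions.has(userId);
  }

  async startTrain(
    userId: string,
    projectId: string,
    startTask: StartTaskRequest
  ): Promise<void> {
    if (this.sessions.has(userId)) {
      throw new Error("user already has a train session");
    }

    // Reserve the slot first so a concurrent start for the same user is rejected.
    const session: TrainSession = { projectId, state: "starting" };
    this.sessions.set(userId, session);

    try {
      await startTask(projectId);
      // Only mark the session as running once the spawner accepted the request.
      session.state = "running";
    } catch (err) {
      // Roll back the reservation so the user is not blocked by a phantom session.
      if (this.sessions.get(userId) === session) {
        this.sessions.delete(userId);
      }
      // Re-throw so the error can still be shown to the user.
      throw err;
    }
  }
}
```

With this shape, a refused connection still surfaces as an error, but a later `startTrain` call for the same user is not rejected because of a session that never started.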

An easy way to reproduce this, and to verify that it is fixed, is to change the MC code to use a random port. The error will be different, but the state will be broken in the same way.

Generated at Sat Dec 13 11:37:43 CST 2025 using Jira 10.3.7#10030007-sha1:a563685562f94d165eb4e158cfb2a142338d8c54.