Matter Cluster commands don't send first time after reboot (CON-1439) #1176

pavel808 · 2024-11-28T11:02:44Z

I have a Matter controller application running on an ESP32-S3 which is communicating with Thread end devices which are running on ESP32-C6 devices.

I send the commands from the controller as follows :

cluster_command* command = new cluster_command(nodeID, endpoint_id, cluster_id, command_id, data, chip::NullOptional, command_send_success_cb, command_send_error_cb);		
err = command->send_command();

Whenever I reboot an end device and try to send a cluster command from the Matter controller application, it always fails on the fist or second attempt for some reason.

The logs are below from each device. After multiple attempts, the cluster command reaches the end device.

Why would this be?

From the Matter controller application :

I (767559) chip[DIS]: Found an existing secure session to [1:0000000000000600]!
I (767569) chip[EM]: <<< [E:52147i S:27014 M:105963599] (S) Msg TX to 1:0000000000000600 [FF07] [UDP:[FDB8:9D54:ECE0:1:1266:EECE:849E:C45A]:5540] --- Type 0001:08 (IM:InvokeCommandRequest) (B:65)
I (769099) chip[EM]: Retransmitting MessageCounter:105963599 on exchange 52147i Send Cnt 1
I (770339) OPENTHREAD: Received PIO
I (770339) OPENTHREAD: Received PIO
I (770339) OPENTHREAD: Received PIO
I (770689) chip[EM]: Retransmitting MessageCounter:105963599 on exchange 52147i Send Cnt 2
I (772909) chip[EM]: Retransmitting MessageCounter:105963599 on exchange 52147i Send Cnt 3
I (774439) OPENTHREAD: Received PIO
I (774439) OPENTHREAD: Received PIO
I (774439) OPENTHREAD: Received PIO
I (775879) chip[EM]: Retransmitting MessageCounter:105963599 on exchange 52147i Send Cnt 4
I (779679) OPENTHREAD: Received PIO
I (779679) OPENTHREAD: Received PIO
I (779679) OPENTHREAD: Received PIO
E (780569) chip[EM]: Failed to Send CHIP MessageCounter:105963599 on exchange 52147i sendCount: 4 max retries: 4
I (797099) chip[SC]: SecureSession[0x3c276934, LSID:27014]: State change 'kActive' --> 'kDefunct'
I (797099) chip[DMG]: Time out! failed to receive invoke command response from Exchange: 52147i
I (797099) device_controller: Cluster command Send failure: err :32
I (797319) OPENTHREAD: Received PIO
I (797319) OPENTHREAD: Received PIO
I (797319) OPENTHREAD: Received PIO
I (801289) OPENTHREAD: Received PIO
I (801289) OPENTHREAD: Received PIO
I (801289) OPENTHREAD: Received PIO

From the Thread end device application :

E (102942) chip[IN]: Data received on an unknown session (LSID=25254). Dropping it!
E (104482) chip[IN]: Data received on an unknown session (LSID=25254). Dropping it!
E (106072) chip[IN]: Data received on an unknown session (LSID=25254). Dropping it!
E (108282) chip[IN]: Data received on an unknown session (LSID=25254). Dropping it!
E (111252) chip[IN]: Data received on an unknown session (LSID=25254). Dropping it!

The text was updated successfully, but these errors were encountered:

shubhamdp · 2024-12-03T12:02:42Z

Why would this be?

This is valid in any scenario and not just thread device.

Device and Controller caches the session while communicating with each other. When device resets, it does not have that session with it, but controller uses the same cached session which is not known to the device at the moment and you see below error.

E (111252) chip[IN]: Data received on an unknown session (LSID=25254). Dropping it!

Controller does 5 attempts before evicting the session and using new one.

pavel808 · 2024-12-03T12:20:52Z

@shubhamdp Thanks for this information. I wasn't aware of this. Is there a way to solve this do you think? So that the controller straight away uses a new session? Or does less than 5 attempts.

shubhamdp · 2024-12-10T10:18:50Z

@pavel808 I did not test this but see if disabling CHIP_DEVICE_CONFIG_ENABLE_AUTOMATIC_CASE_RETRIES works for you.
https://github.com/shubhamdp/connectedhomeip/blob/9d604895004095da9d09bb62a78e5c4e8914637a/src/include/platform/CHIPDeviceConfig.h#L1541-L1548

pavel808 · 2024-12-10T18:02:52Z

@pavel808 I did not test this but see if disabling CHIP_DEVICE_CONFIG_ENABLE_AUTOMATIC_CASE_RETRIES works for you. https://github.com/shubhamdp/connectedhomeip/blob/9d604895004095da9d09bb62a78e5c4e8914637a/src/include/platform/CHIPDeviceConfig.h#L1541-L1548

@shubhamdp I tried setting this to 0, but still the same.

I tried directly changing it inside CHIPDeviceConfig.h though. I'm not sure if connectedhomeip changes are even being complied by using just doing idf.py build to build my application.

shubhamdp · 2024-12-18T06:34:49Z

Thats what I had suggested, changing directly to CHIPDeviceConfig.h, changes do need to reflect from connectedhomeip when using idf.py build.

Let me check if there's some other work-around to this.

xbrunosousa · 2025-01-10T15:10:44Z

@shubhamdp any news about this?

shubhamdp · 2025-01-13T11:42:46Z

Can you please try cherry-picking this PR: project-chip/connectedhomeip#36936 and test by re-building the controller

cd esp-matter
curl https://patch-diff.githubusercontent.com/raw/project-chip/connectedhomeip/pull/36936.patch | git -C connectedhomeip/connectedhomeip am

pavel808 · 2025-02-28T14:18:18Z

@shubhamdp Sorry, i'm only getting back to solving this issue now. I have tried commenting out the following in CHIPDeviceConfig.h but it hasn't solved it.

/**
 * CHIP_DEVICE_CONFIG_ENABLE_AUTOMATIC_CASE_RETRIES
 *
 * If 1, enable support for automatic CASE establishment retries.
 */
//#ifndef CHIP_DEVICE_CONFIG_ENABLE_AUTOMATIC_CASE_RETRIES
//#define CHIP_DEVICE_CONFIG_ENABLE_AUTOMATIC_CASE_RETRIES 1
//#endif
//
//#if CHIP_DEVICE_CONFIG_ENABLE_AUTOMATIC_CASE_RETRIES
//
///**
// * CHIP_DEVICE_CONFIG_AUTOMATIC_CASE_RETRY_INITIAL_DELAY_SECONDS
// *
// * The initial retry delay, in seconds, for our automatic CASE retries.
// */
//#ifndef CHIP_DEVICE_CONFIG_AUTOMATIC_CASE_RETRY_INITIAL_DELAY_SECONDS
//#define CHIP_DEVICE_CONFIG_AUTOMATIC_CASE_RETRY_INITIAL_DELAY_SECONDS 1
//#endif
//
///**
// * CHIP_DEVICE_CONFIG_AUTOMATIC_CASE_RETRY_MAX_BACKOFF
// *
// * The maximum number of times we back off, by a factor of 2 each time, from our
// * initial CASE retry interval before we plateau.
// */
//#ifndef CHIP_DEVICE_CONFIG_AUTOMATIC_CASE_RETRY_MAX_BACKOFF
//#define CHIP_DEVICE_CONFIG_AUTOMATIC_CASE_RETRY_MAX_BACKOFF 5
//#endif
//
//#endif // CHIP_DEVICE_CONFIG_ENABLE_AUTOMATIC_CASE_RETRIES

pavel808 · 2025-03-03T13:31:03Z

Can you please try cherry-picking this PR: project-chip/connectedhomeip#36936 and test by re-building the controller
cd esp-matter
curl https://patch-diff.githubusercontent.com/raw/project-chip/connectedhomeip/pull/36936.patch | git -C connectedhomeip/connectedhomeip am

@shubhamdp Unfortunately this doesn't work for me at all. I apply the patch as you suggest. Then I clean my application project and build it again doing idf.py build . The result is the same. Is what I am doing correct?

github-actions bot changed the title ~~Matter Cluster commands don't send first time after reboot~~ Matter Cluster commands don't send first time after reboot (CON-1439) Nov 28, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Matter Cluster commands don't send first time after reboot (CON-1439) #1176

Matter Cluster commands don't send first time after reboot (CON-1439) #1176

pavel808 commented Nov 28, 2024

shubhamdp commented Dec 3, 2024

pavel808 commented Dec 3, 2024

shubhamdp commented Dec 10, 2024

pavel808 commented Dec 10, 2024

shubhamdp commented Dec 18, 2024

xbrunosousa commented Jan 10, 2025

shubhamdp commented Jan 13, 2025

pavel808 commented Feb 28, 2025

pavel808 commented Mar 3, 2025

Matter Cluster commands don't send first time after reboot (CON-1439) #1176

Matter Cluster commands don't send first time after reboot (CON-1439) #1176

Comments

pavel808 commented Nov 28, 2024

shubhamdp commented Dec 3, 2024

pavel808 commented Dec 3, 2024

shubhamdp commented Dec 10, 2024

pavel808 commented Dec 10, 2024

shubhamdp commented Dec 18, 2024

xbrunosousa commented Jan 10, 2025

shubhamdp commented Jan 13, 2025

pavel808 commented Feb 28, 2025

pavel808 commented Mar 3, 2025