Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Matter Cluster commands don't send first time after reboot (CON-1439) #1176

Open
pavel808 opened this issue Nov 28, 2024 · 9 comments
Open

Comments

@pavel808
Copy link

I have a Matter controller application running on an ESP32-S3 which is communicating with Thread end devices which are running on ESP32-C6 devices.

I send the commands from the controller as follows :

cluster_command* command = new cluster_command(nodeID, endpoint_id, cluster_id, command_id, data, chip::NullOptional, command_send_success_cb, command_send_error_cb);		
err = command->send_command();

Whenever I reboot an end device and try to send a cluster command from the Matter controller application, it always fails on the fist or second attempt for some reason.

The logs are below from each device. After multiple attempts, the cluster command reaches the end device.

Why would this be?

From the Matter controller application :

I (767559) chip[DIS]: Found an existing secure session to [1:0000000000000600]!
I (767569) chip[EM]: <<< [E:52147i S:27014 M:105963599] (S) Msg TX to 1:0000000000000600 [FF07] [UDP:[FDB8:9D54:ECE0:1:1266:EECE:849E:C45A]:5540] --- Type 0001:08 (IM:InvokeCommandRequest) (B:65)
I (769099) chip[EM]: Retransmitting MessageCounter:105963599 on exchange 52147i Send Cnt 1
I (770339) OPENTHREAD: Received PIO
I (770339) OPENTHREAD: Received PIO
I (770339) OPENTHREAD: Received PIO
I (770689) chip[EM]: Retransmitting MessageCounter:105963599 on exchange 52147i Send Cnt 2
I (772909) chip[EM]: Retransmitting MessageCounter:105963599 on exchange 52147i Send Cnt 3
I (774439) OPENTHREAD: Received PIO
I (774439) OPENTHREAD: Received PIO
I (774439) OPENTHREAD: Received PIO
I (775879) chip[EM]: Retransmitting MessageCounter:105963599 on exchange 52147i Send Cnt 4
I (779679) OPENTHREAD: Received PIO
I (779679) OPENTHREAD: Received PIO
I (779679) OPENTHREAD: Received PIO
E (780569) chip[EM]: Failed to Send CHIP MessageCounter:105963599 on exchange 52147i sendCount: 4 max retries: 4
I (797099) chip[SC]: SecureSession[0x3c276934, LSID:27014]: State change 'kActive' --> 'kDefunct'
I (797099) chip[DMG]: Time out! failed to receive invoke command response from Exchange: 52147i
I (797099) device_controller: Cluster command Send failure: err :32
I (797319) OPENTHREAD: Received PIO
I (797319) OPENTHREAD: Received PIO
I (797319) OPENTHREAD: Received PIO
I (801289) OPENTHREAD: Received PIO
I (801289) OPENTHREAD: Received PIO
I (801289) OPENTHREAD: Received PIO

From the Thread end device application :

E (102942) chip[IN]: Data received on an unknown session (LSID=25254). Dropping it!
E (104482) chip[IN]: Data received on an unknown session (LSID=25254). Dropping it!
E (106072) chip[IN]: Data received on an unknown session (LSID=25254). Dropping it!
E (108282) chip[IN]: Data received on an unknown session (LSID=25254). Dropping it!
E (111252) chip[IN]: Data received on an unknown session (LSID=25254). Dropping it!

@github-actions github-actions bot changed the title Matter Cluster commands don't send first time after reboot Matter Cluster commands don't send first time after reboot (CON-1439) Nov 28, 2024
@shubhamdp
Copy link
Contributor

Why would this be?

This is valid in any scenario and not just thread device.

Device and Controller caches the session while communicating with each other. When device resets, it does not have that session with it, but controller uses the same cached session which is not known to the device at the moment and you see below error.

E (111252) chip[IN]: Data received on an unknown session (LSID=25254). Dropping it!

Controller does 5 attempts before evicting the session and using new one.

@pavel808
Copy link
Author

pavel808 commented Dec 3, 2024

@shubhamdp Thanks for this information. I wasn't aware of this. Is there a way to solve this do you think? So that the controller straight away uses a new session? Or does less than 5 attempts.

@shubhamdp
Copy link
Contributor

@pavel808 I did not test this but see if disabling CHIP_DEVICE_CONFIG_ENABLE_AUTOMATIC_CASE_RETRIES works for you.
https://github.com/shubhamdp/connectedhomeip/blob/9d604895004095da9d09bb62a78e5c4e8914637a/src/include/platform/CHIPDeviceConfig.h#L1541-L1548

@pavel808
Copy link
Author

@pavel808 I did not test this but see if disabling CHIP_DEVICE_CONFIG_ENABLE_AUTOMATIC_CASE_RETRIES works for you. https://github.com/shubhamdp/connectedhomeip/blob/9d604895004095da9d09bb62a78e5c4e8914637a/src/include/platform/CHIPDeviceConfig.h#L1541-L1548

@shubhamdp I tried setting this to 0, but still the same.

I tried directly changing it inside CHIPDeviceConfig.h though. I'm not sure if connectedhomeip changes are even being complied by using just doing idf.py build to build my application.

@shubhamdp
Copy link
Contributor

Thats what I had suggested, changing directly to CHIPDeviceConfig.h, changes do need to reflect from connectedhomeip when using idf.py build.

Let me check if there's some other work-around to this.

@xbrunosousa
Copy link

@shubhamdp any news about this?

@shubhamdp
Copy link
Contributor

Can you please try cherry-picking this PR: project-chip/connectedhomeip#36936 and test by re-building the controller

cd esp-matter
curl https://patch-diff.githubusercontent.com/raw/project-chip/connectedhomeip/pull/36936.patch | git -C connectedhomeip/connectedhomeip am

@pavel808
Copy link
Author

@shubhamdp Sorry, i'm only getting back to solving this issue now. I have tried commenting out the following in CHIPDeviceConfig.h but it hasn't solved it.

/**
 * CHIP_DEVICE_CONFIG_ENABLE_AUTOMATIC_CASE_RETRIES
 *
 * If 1, enable support for automatic CASE establishment retries.
 */
//#ifndef CHIP_DEVICE_CONFIG_ENABLE_AUTOMATIC_CASE_RETRIES
//#define CHIP_DEVICE_CONFIG_ENABLE_AUTOMATIC_CASE_RETRIES 1
//#endif
//
//#if CHIP_DEVICE_CONFIG_ENABLE_AUTOMATIC_CASE_RETRIES
//
///**
// * CHIP_DEVICE_CONFIG_AUTOMATIC_CASE_RETRY_INITIAL_DELAY_SECONDS
// *
// * The initial retry delay, in seconds, for our automatic CASE retries.
// */
//#ifndef CHIP_DEVICE_CONFIG_AUTOMATIC_CASE_RETRY_INITIAL_DELAY_SECONDS
//#define CHIP_DEVICE_CONFIG_AUTOMATIC_CASE_RETRY_INITIAL_DELAY_SECONDS 1
//#endif
//
///**
// * CHIP_DEVICE_CONFIG_AUTOMATIC_CASE_RETRY_MAX_BACKOFF
// *
// * The maximum number of times we back off, by a factor of 2 each time, from our
// * initial CASE retry interval before we plateau.
// */
//#ifndef CHIP_DEVICE_CONFIG_AUTOMATIC_CASE_RETRY_MAX_BACKOFF
//#define CHIP_DEVICE_CONFIG_AUTOMATIC_CASE_RETRY_MAX_BACKOFF 5
//#endif
//
//#endif // CHIP_DEVICE_CONFIG_ENABLE_AUTOMATIC_CASE_RETRIES

@pavel808
Copy link
Author

pavel808 commented Mar 3, 2025

Can you please try cherry-picking this PR: project-chip/connectedhomeip#36936 and test by re-building the controller

cd esp-matter
curl https://patch-diff.githubusercontent.com/raw/project-chip/connectedhomeip/pull/36936.patch | git -C connectedhomeip/connectedhomeip am

@shubhamdp Unfortunately this doesn't work for me at all. I apply the patch as you suggest. Then I clean my application project and build it again doing idf.py build . The result is the same. Is what I am doing correct?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants