I2C hanging #30

Closed
jvoermans opened this issue Dec 4, 2020 · 19 comments

@jvoermans
Owner

@jerabaul29 We probably need a safety net in case a defective I2C sensor causes the Arduino to hang. The cause is somewhere in the Wire library, and there hasn't really been a solid solution so far:
arduino/Arduino#1476

It doesn't always occur. If the whole sensor is gone, or the data and/or clock lines are gone, the Arduino simply cannot find the sensor. The problem occurs when only the VCC and/or GND lines are disconnected. In general, this is unlikely to happen. If the sensor is broken, it most likely won't respond at all, so that is not an issue. If the sensor is ripped off completely, that is also not an issue, as all lines will be disconnected. Nevertheless, it feels a bit uncomfortable to rely on 'unlikely to happen'.

So I guess there are two approaches to prevent the Arduino from freezing:

  1. Check regularly whether the sensor is present. If it is not, skip this sensor and go to the next. If a sensor fails while measuring and freezes the Arduino, this is likely to happen only once, and the watchdog can just reset the whole thing.
  2. Some kind of 'sub-watchdog' which, rather than resetting the Arduino, cancels the sensor reading task and continues the loop.

Regarding the first, the following library works quite well to identify whether a sensor is present; however, the problem remains when only GND and/or VCC are disconnected. It is an alternative to the Wire library, but it can be cut down to just the function needed to check whether a sensor is present (like an I2C scanner).
https://github.com/DSSCircuits/I2C-Master-Library
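
A minimal sketch of the presence check in approach 1, assuming the Arduino Wire convention that endTransmission() returns 0 when the addressed device acknowledges. The helper name sensor_present and the FakeBus stand-in are made up for illustration; on a board one would pass Wire itself. Note that the probe is itself an I2C transaction, so without a bus timeout it could still hang in the VCC/GND-disconnected case described above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: address-only probe, the way an I2C scanner does it.
// Bus is any type with Wire-like beginTransmission()/endTransmission().
template <typename Bus>
bool sensor_present(Bus& bus, uint8_t address) {
    assert(address < 0x80);             // 7-bit I2C addresses only
    bus.beginTransmission(address);
    return bus.endTransmission() == 0;  // 0 = ACK in the Arduino Wire API
}

// Desktop stand-in for Wire (an assumption, for illustration only):
// only the device at address `alive` acknowledges.
struct FakeBus {
    uint8_t alive;
    uint8_t current;
    explicit FakeBus(uint8_t a) : alive(a), current(0) {}
    void beginTransmission(uint8_t address) { current = address; }
    uint8_t endTransmission() { return current == alive ? 0 : 2; }  // 2 = NACK on address
};
```

In the measurement loop, a sensor for which sensor_present() returns false would simply be skipped for that round.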

I initially thought the watchdog could be used to 'break' a loop rather than resetting the Arduino, but apparently this is not possible (break needs to be inside a loop, and the watchdog works in parallel, not in series, as far as I understand?).

Do you have any suggestions? Also, what I2C clock speed are we using? I think it is best to drop it to, say, 50-100 kHz, or even lower if possible. I2C doesn't like long cables, but the general advice is that it can still work over longer cables at a reduced clock speed. I tried the Multispeed I2C Scanner with the temperature sensor, and it works over a 5 m cat6 cable at 400 kHz without problems, but it is perhaps better to drop the speed as much as possible to prevent signal glitches and thus potential freezing.

@jerabaul29
Collaborator

Have you actually encountered this I2C problem in practice? (just being curious :) ).

It seems that there is an easy fix: the issue you link to has actually been closed in favor of issue arduino/ArduinoCore-avr#42 which implemented a fix. For backwards compatibility reasons or something like this it is not enabled by default I think (see discussion arduino/reference-en#895 ). I will add it to the program to "robustify" things :) .

I will change the I2C frequency then, to be on the safe side :) .
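
For context, my understanding is that the hang comes from busy-wait loops inside the AVR Wire code that spin until the bus state changes, and that the fix bounds those loops with a timeout which is disabled by default. Below is a desktop-runnable sketch of that idea only, not the library's code; wait_with_timeout and its parameters are hypothetical names, and the convention that 0 means "wait forever" is an assumption made here.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical helper: spin until ready() returns true, or give up once
// more than timeout_us microseconds have elapsed. A timeout of 0 disables
// the bound (assumption), which reproduces the "can hang forever" behaviour.
bool wait_with_timeout(const std::function<bool()>& ready,
                       const std::function<uint32_t()>& now_us,
                       uint32_t timeout_us) {
    const uint32_t start = now_us();
    while (!ready()) {
        // Unsigned subtraction stays correct when the counter wraps around.
        if (timeout_us > 0 && (now_us() - start) > timeout_us) {
            return false;  // timed out; the caller decides how to recover
        }
    }
    return true;
}
```

On an Arduino, now_us would typically wrap micros(), and ready would check whatever condition the code is waiting on.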

@jerabaul29
Collaborator

I have just added both I2C timeout and reduced the I2C clock frequency:

Wire.setClock(i2c_clock_frequency);
Wire.setTimeout(i2c_timeout_micro_seconds);

which are set:

constexpr unsigned long i2c_timeout_micro_seconds = 1000UL;
constexpr unsigned long i2c_clock_frequency = 50000UL; // I think the default is 100000UL; may need to test by hand which values work

  • I will be interested to know if the timeout solves some problems you may have had

  • I am not sure that all I2C clock frequencies can be used. You may need to test by hand a bit by changing the value of:

constexpr unsigned long i2c_clock_frequency = 50000UL; // I think the default is 100000UL; may need to test by hand which values work

I think you may try 50kHz as it is now, maybe try down to 25kHz if you want, and check if it works :) .

Let me know if you have any problems.

@jvoermans
Owner Author

jvoermans commented Dec 9, 2020

Amazing. I've spent 3 days looking for a solution. I was almost going to sketch some weird watchdog-like way to circumvent this. Anyway, thanks for looking it up.

I tested one I2C temperature probe over a 5 m cat6 cable; it still has issues. Instead, I copy-pasted this Wire library:
https://github.com/arduino/ArduinoCore-avr/tree/master/libraries/Wire

and used the function:
Wire.setWireTimeout(i2c_timeout_micro_seconds, false);

That seems to work.
You are right, it is unlikely to happen. But I'm happy you found a solution, as it could be disastrous when it happens...
Once I have a completely built Geo prototype, I'll test again, but for now I think this is good :)

@jvoermans
Owner Author

UPDATE: Ok interesting, I tested now with three temperature probes. When I decouple one temperature probe partially (disconnect GND and VCC together), it doesn't hang now (which is great) but it does block all other probes from transmitting values. To prevent that from happening I have to reset the multiplexer:

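if (Wire.getWireTimeoutFlag()) {
  digitalWrite(5, LOW);         // pull the multiplexer reset pin (digital 5) low
  Wire.clearWireTimeoutFlag();  // clear the flag so the next timeout is detected
  digitalWrite(5, HIGH);        // release the reset so the multiplexer restarts
}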

I attached the multiplexer's reset pin to digital pin 5; getWireTimeoutFlag() returns 1 when there has been a timeout. Not sure how this is going to function in the Geo sketch though?

@jerabaul29
Collaborator

That sounds good! :) Happy that at least some of the problems are solved.

I am a bit confused, just to make things clear:

  • Have you tried with the long cable and the reduced I2C clock speed? Have you tried several speed values (10000, 20000, 25000, 50000, 100000)? Are there values where it works and others where it does not? There may be a bit of black magic here, with some values being rejected by the logic so that the true clock speed is not updated (I think the hardware only accepts some clock values, but I am not sure which). So you may need to experiment.

  • regarding hanging etc, did you test with the Due sketch, or did you try with your own sketch? Did you find this snippet of code somewhere with extra explanations / do you have some sources with more explanations about it? :) Once I understand how things should be set up I can easily adapt the Due sketch - just not sure yet of what the mechanics there are :) I would have guessed that the Due sketch is a bit smart already and should be able to read from only some of the probes, but may be wrong. If you test with the Due, can you show some terminal output with the debug output enabled? :)
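
As a worked example of why some clock values may not take effect, here is the arithmetic for one specific case that is an assumption on my side: an 8-bit AVR board running at 16 MHz with the TWI prescaler at 1, where the datasheet gives SCL = F_CPU / (16 + 2 * TWBR) and TWBR is an 8-bit register. The Due uses a different chip with a different clock divider, so these numbers do not carry over to it. The function names are made up for illustration.

```cpp
#include <cstdint>

// Assumption: 16 MHz AVR (e.g. an Uno-class board), TWI prescaler = 1.
constexpr uint32_t kCpuHz = 16000000UL;

// TWBR value implied by a requested SCL frequency: (F_CPU / f - 16) / 2.
constexpr int32_t twbr_for(uint32_t scl_hz) {
    return (static_cast<int32_t>(kCpuHz / scl_hz) - 16) / 2;
}

// True if that value fits the 8-bit TWBR register.
constexpr bool fits_twbr(uint32_t scl_hz) {
    return twbr_for(scl_hz) >= 0 && twbr_for(scl_hz) <= 255;
}

// SCL frequency actually produced by a given TWBR value.
constexpr uint32_t actual_scl_hz(uint8_t twbr) {
    return kCpuHz / (16UL + 2UL * twbr);
}
```

Under these assumptions, 100 kHz maps to TWBR = 72 and 50 kHz to TWBR = 152, while 25 kHz would need 312, which does not fit in 8 bits. The slowest clock reachable with prescaler 1 is therefore about 30.4 kHz, so requests of 10 to 25 kHz cannot be honored as such on that kind of board.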

@jvoermans
Owner Author

jvoermans commented Dec 9, 2020

  1. I ran a multi speed scanner sketch, it checks response at 50 - 400 kHz at different intervals. It connects over a 5.5m ethernet cable fine on all speeds. Best to just use 50 kHz.

  2. I tried my own sketch, as I don't have enough material to attach everything right now. Got the few lines from here: Document Wire timeout API on website arduino/reference-en#895. I'll try to test early next week with the Due, then I'll be in the lab again!

@jerabaul29
Collaborator

Sounds good :) .

Ok, I will give it a bit of thinking also when I have time :) .

@jerabaul29
Collaborator

(I had a good look at the temperature sensor logic; I believe there is no reason why the remaining temperature sensors could not be read if one of them fails :) I look forward to hearing the results of your testing :) )

@jvoermans
Owner Author

I tested the sketch with three temperature probes (no geophone). Disconnecting the clock and data line is fine, it just gives an extreme value:
TMP,26.39,26.50,-891647819776.00,
Once reconnected, it gives a normal value again.
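
One way to keep such values out of the analysis is a simple range check on each reading. This is only a sketch: the helper name is hypothetical, and the bounds are placeholder assumptions that should be replaced by the rated range of the actual probe.

```cpp
#include <cmath>

// Assumed plausible range in degrees Celsius. These are placeholders,
// not taken from any datasheet.
constexpr float kMinPlausibleC = -60.0f;
constexpr float kMaxPlausibleC = 100.0f;

// Hypothetical helper: true if the reading is finite and within range.
inline bool plausible_temperature(float celsius) {
    return std::isfinite(celsius) &&
           celsius >= kMinPlausibleC && celsius <= kMaxPlausibleC;
}
```

With these bounds, 26.39 passes and -891647819776.00 is rejected, so a disconnected probe could be flagged in the parser instead of being plotted.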

However, disconnecting either GND or VCC gives errors. Based on the serial monitor, it doesn't seem to hang, but it might slow down the writing. The parser gives an error regarding 'wrapping'. I added the data here:
https://github.com/jvoermans/Vibration_Logger/tree/master/material_Jean/BinarySdDataParser/all_example_data/example_data_I2C_disconnect

Also, the sensor does not come back after reconnecting GND or VCC. Anyway, disconnecting GND or VCC alone is of course highly unlikely to happen...

@jerabaul29
Collaborator

jerabaul29 commented Dec 14, 2020

I think it was "just" a problem that it indeed takes a bit more time in this second case to ignore the sensors, and the ADC buffers need to be bigger to accommodate that. I just increased the Arduino buffer size - can you update the Arduino Due code you run, try again, and let me know if this fixes things? :)

@jvoermans
Owner Author

jvoermans commented Dec 14, 2020

Just added the new files. It is fast, but it produces a lot of zeros (in serial monitor at least) now so there seems to be something wrong?
https://github.com/jvoermans/Vibration_Logger/tree/master/material_Jean/BinarySdDataParser/all_example_data/example_data_I2C_disconnect2

@jerabaul29
Collaborator

I found a couple of bugs in the Due code; can you upgrade to the latest Due code version and try again? :) Sorry, it is a bit hard to find bugs when I do not have the sensors myself.

@jvoermans
Owner Author

Thanks! Yes, not sure how you do it at all without the hardware ;)

@jerabaul29
Collaborator

It is actually a good exercise in software engineering ^^ .

@jerabaul29
Collaborator

Will be curious to know if it fixed things :)

@jvoermans
Owner Author

@jerabaul29 Just uploaded and tested. See example data here:
https://github.com/jvoermans/Vibration_Logger/tree/master/material_Jean/BinarySdDataParser/all_example_data/example_data_I2C_disconnect2

The zeros are gone, but there seems to be a delay again when I disconnect the VCC line of one of the probes. The parser gives an error for this file as well (either file 6 or 7, it might be both). Files 1-3 are without disconnecting an I2C sensor.

@jerabaul29
Collaborator

@jvoermans a possible way to improve I2C range is to decrease frequency.

This was discussed a bit higher in this thread. Can you check if it helps to change:

constexpr unsigned long i2c_clock_frequency = 50000UL; // I think the default is 100000UL; may need to test by hand which values work

Can you try with the following values?

constexpr unsigned long i2c_clock_frequency = 10000UL;  // try one value at a time; I think the default is 100000UL
// constexpr unsigned long i2c_clock_frequency = 15000UL;
// constexpr unsigned long i2c_clock_frequency = 25000UL;

and see if it helps?

@jerabaul29
Collaborator

Regarding the hanging of the I2C: I searched further but did not find more information. From what I saw in your file, the logger just misses a few seconds of logging right when the sensor fails, and then the data are back. So I think it is still quite OK; in the worst case we only lose a few seconds of data.

@jvoermans
Owner Author

Agreed :)
