[BUG] Watchdog is not disabled in bootloader #794

Open
mrmorawski opened this issue Apr 3, 2023 · 3 comments
Labels
bug Something isn't working

Comments

@mrmorawski

Describe the bug
When operating OAK-D PoE devices over unreliable connections, the docs recommend setting environment variables to modify the XLink watchdog timeout. However, these variables only modify the watchdog timeouts in DeviceBase.cpp, not in DeviceBootloader.cpp.

When one connects to a device over XLink, it flashes a bootloader anyway. If the device connection has high latency, it'll fail during that part of the connection, regardless of whether the DEPTHAI_WATCHDOG environment variable was set to a more conservative value than the default.

Minimal Reproducible Example
Just connect to a device via a high-latency connection (e.g. over 4G), and try to flash a bootloader. It'll fail with a connection timeout, even if you set the DEPTHAI_WATCHDOG environment variable to 0.

    import depthai as dai

    # Placeholder address; replace with the IP of your PoE device.
    camera_ip = "192.168.1.162"

    try:
        # First try to reach the device in the flash-booted state.
        device_info = dai.DeviceInfo(
            name=camera_ip,
            mxid="",
            state=dai.XLinkDeviceState.X_LINK_FLASH_BOOTED,
            protocol=dai.XLinkProtocol.X_LINK_TCP_IP,
            platform=dai.XLinkPlatform.X_LINK_MYRIAD_X,
            status=dai.XLinkError_t.X_LINK_SUCCESS,
        )
        bootloader = dai.DeviceBootloader(device_info)
    except RuntimeError:
        # Otherwise retry, expecting the device to be in the bootloader state.
        device_info = dai.DeviceInfo(
            name=camera_ip,
            mxid="",
            state=dai.XLinkDeviceState.X_LINK_BOOTLOADER,
            protocol=dai.XLinkProtocol.X_LINK_TCP_IP,
            platform=dai.XLinkPlatform.X_LINK_MYRIAD_X,
            status=dai.XLinkError_t.X_LINK_SUCCESS,
        )
        bootloader = dai.DeviceBootloader(device_info)

    # Print flashing progress as a percentage, then flash an empty pipeline.
    progress = lambda p: print(f"Flashing progress: {p*100:.1f}%")
    bootloader.flash(progress, dai.Pipeline())

Expected behavior
The watchdog timeout set through the environment variable should also apply to the bootloader section of the code.

Here is an example solution that we're working with now (it only disables the watchdog, doesn't change the value):
main...mrmorawski:depthai-core:no_bootloader_watchdog

Attach system log

DEPTHAI_LEVEL=debug DEPTHAI_WATCHDOG=0 poetry run python
Python 3.10.7 (main, Nov 24 2022, 19:45:47) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
> import depthai as dai
> [2023-03-18 12:18:33.686] [debug] Python bindings - version: 2.20.2.0 from  build: 2023-01-31 23:58:49 +0000
> [2023-03-18 12:18:33.686] [debug] Library information - version: 2.20.2, commit: 4ff860838726a5e8ac0cbe59128c58a8f6143c6c from 2023-01-31 22:20:03 +0200, build: 2023-01-31 23:34:39 +0000
> [2023-03-18 12:18:33.687] [debug] Initialize - finished
> >>> [2023-03-18 12:18:33.757] [debug] Resources - Archive 'depthai-bootloader-fwp-0.0.24.tar.xz' open: 1ms, archive read: 68ms
> [2023-03-18 12:18:34.049] [debug] Resources - Archive 'depthai-device-fwp-8c3d6ac1c77b0bf7f9ea6fd4d962af37663d2fbd.tar.xz' open: 1ms, archive read: 360ms
> 
> >>> info = dai.DeviceInfo("192.168.1.162")
> >>> dev = dai.Device(info)
> [2023-03-18 12:19:41.424] [debug] Found an actual device by given DeviceInfo: DeviceInfo(name=192.168.1.162, mxid=194430108152761300, X_LINK_BOOTLOADER, X_LINK_TCP_IP, X_LINK_MYRIAD_X, X_LINK_SUCCESS)
> [2023-03-18 12:19:41.424] [debug] Device - OpenVINO version: universal
> [2023-03-18 12:19:41.424] [warning] Watchdog disabled! In case of unclean exit, the device needs reset or power-cycle for next run
> [2023-03-18 12:19:41.424] [debug] Device - BoardConfig: {"camera":[],"emmc":null,"gpio":[],"logDevicePrints":null,"logPath":null,"logSizeMax":null,"logVerbosity":null,"network":{"mtu":0,"xlinkTcpNoDelay":true},"nonExclusiveMode":false,"pcieInternalClock":null,"sysctl":[],"uart":[],"usb":{"flashBootedPid":63037,"flashBootedVid":999,"maxSpeed":4,"pid":63035,"vid":999},"usb3PhyInternalClock":null,"watchdogInitialDelayMs":null,"watchdogTimeoutMs":0} 
> libnop:
> 0000: b9 10 b9 05 81 e7 03 81 3b f6 81 e7 03 81 3d f6 04 b9 02 00 01 ba 00 00 be bb 00 bb 00 be be be
> 0020: be be be be 00 bb 00
> [2023-03-18 12:19:41.438] [debug] Searching for booted device: DeviceInfo(name=192.168.1.162, mxid=REMOVED, X_LINK_BOOTLOADER, X_LINK_TCP_IP, X_LINK_MYRIAD_X, X_LINK_SUCCESS), name used as hint only
> [2023-03-18 12:19:42.591] [debug] Connected bootloader version 0.0.24
> [2023-03-18 12:19:57.595] [warning] Monitor thread (device: REMOVED [192.168.1.162]) - ping was missed, closing the device connection
> [2023-03-18 12:19:59.346] [debug] DeviceBootloader about to be closed...
> [2023-03-18 12:19:59.346] [debug] XLinkResetRemote of linkId: (0)
> [2023-03-18 12:20:00.599] [debug] DeviceBootloader closed, 1252
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> depthai.XLinkWriteError: Couldn't write data to stream: '__bootloader' (X_LINK_ERROR)
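
The timestamps in the log above show how long the bootloader connection lasted even though DEPTHAI_WATCHDOG=0 was set. As a small illustration (the helper name `seconds_between` is made up for this sketch), the gap between the "Connected bootloader" line and the "ping was missed" line can be computed directly from the two logged timestamps. The log itself does not say whether this gap corresponds to a configured timeout value.

```python
from datetime import datetime

# Timestamp layout used by the depthai log lines quoted above.
FMT = "%Y-%m-%d %H:%M:%S.%f"


def seconds_between(start: str, end: str) -> float:
    """Return the elapsed time in seconds between two log timestamps."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()


# Both values are copied from the log in this issue.
connected = "2023-03-18 12:19:42.591"    # "Connected bootloader version 0.0.24"
ping_missed = "2023-03-18 12:19:57.595"  # "ping was missed, closing the device connection"

elapsed = seconds_between(connected, ping_missed)
print(f"Bootloader connection closed after {elapsed:.3f} s")
```

So the monitor thread closed the connection roughly 15 seconds after the bootloader connected, despite the "Watchdog disabled!" warning logged earlier for the device itself.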

Additional context
For our particular application, we're running the cameras in standalone mode, so we don't need an XLink connection to stream frames. We only need to be able to update pipelines on the cameras, which I assume will be the use case of most people running PoE devices in standalone mode.

mrmorawski added the bug label Apr 3, 2023
@mrmorawski
Author

I don't know the depthai-core library well enough to submit a full PR, but the hack listed in 'expected behaviour' works well for us.

Would just mirroring the adjustable watchdog code from DeviceBase.cpp work for DeviceBootloader.cpp as well?

mrmorawski changed the title from "[BUG] {Watchdog is not disabled in bootloader}" to "[BUG] Watchdog is not disabled in bootloader" Apr 3, 2023
@themarpe
Collaborator

themarpe commented Apr 5, 2023

Thanks for the proposed solution @mrmorawski - there is only one issue with this approach: it is not completely "WD disabled", as the BL still has its own WD counting down. The BL does bump its WD on any comms that it receives/sends successfully, though, so the chance of it timing out is smaller.

Perhaps it could be reworked / put under a DEPTHAI_BOOTLOADER_WATCHDOG variable - or, in general, the WD timeout could just be extended further. Have you tried the latter, and what timeout worked for you?
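
To make the suggestion above concrete, here is a minimal Python sketch of how a host-side lookup for a separate bootloader variable might behave. It is not depthai-core code: DEPTHAI_BOOTLOADER_WATCHDOG is only proposed in this thread, and the function name, the fallback order, and the `default_ms` value are all assumptions made for illustration. The sketch treats 0 as "disabled", matching how DEPTHAI_WATCHDOG=0 is used in this issue.

```python
import os
from typing import Mapping


def resolve_bootloader_watchdog_ms(
    env: Mapping[str, str] = os.environ,
    default_ms: int = 10_000,  # hypothetical default, not taken from depthai-core
) -> int:
    """Pick a bootloader watchdog timeout in milliseconds (0 means disabled).

    Assumed lookup order: the bootloader-specific variable proposed in this
    thread first, then the existing DEPTHAI_WATCHDOG, then the default.
    """
    for name in ("DEPTHAI_BOOTLOADER_WATCHDOG", "DEPTHAI_WATCHDOG"):
        raw = env.get(name)
        if raw is None or raw.strip() == "":
            continue  # variable not set; try the next one
        try:
            value = int(raw)
        except ValueError:
            continue  # ignore values that are not integers
        if value >= 0:
            return value
    return default_ms


# Usage with explicit mappings instead of the real process environment.
print(resolve_bootloader_watchdog_ms({"DEPTHAI_WATCHDOG": "0"}))
print(resolve_bootloader_watchdog_ms(
    {"DEPTHAI_WATCHDOG": "0", "DEPTHAI_BOOTLOADER_WATCHDOG": "30000"}
))
```

Falling back to DEPTHAI_WATCHDOG keeps existing setups working unchanged, while the separate variable would allow a longer bootloader timeout without also changing the device watchdog.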

@mrmorawski
Author

Thanks for the quick reply @themarpe. I'm aware it's not a full solution, just a quick hack that I put in to make sure it works at all. I'll try to experiment with different timeouts next week.
