You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
When we are pausing the nexus, all IO must get flushed before
the subsystem pausing completes.
If we can't flush the IO then pausing is stuck forever...
The issue we have seen is that when IO's are stuck there's
nothing which can fail them and allow pause to complete.
One way this can happen is when the controller is failed as
it seems in this case the io queues are not getting polled.
A first fix that can be done is to piggy back on the adminq
polling failure and use this to drive the removal of the
failed child devices from the nexus per-core channels.
A better approach might be needed in the future to be able
to timeout the IOs even when no completions are processed
in a given I/O qpair.
Signed-off-by: Tiago Castro <tiagolobocastro@gmail.com>
0 commit comments