ath10k-ct wave1 firmware crashes with usteer #206
Comments
Sorry, I have just noticed #180, which looks awfully similar.
Yes, it is the same as Stintel's report: a corrupted descriptor, and I have no idea how to fix it. I guess it is possible that it is a kernel-related issue, such as a kernel use-after-free. It is quite strange that usteer would have anything to do with this. Maybe run strace on usteer to see whether it does anything interesting in the same period as the crashes?
Here is the decode:
EPC1 0x00958080 ROM: memcpy /home/customer/tree/RD-2011.2/p4root/Xtensa/Target-libs/newlib/newlib/libc/machine/xtensa/memcpy.S:135
Process State Register (PS):
[ 1457.436154] ath10k_pci 0000:00:00.0: Copy Engine register dump:
Ok, more info, as requested: dmesg of a crash:
strace of a mark (writing to /dev/kmsg)
relevant (I think) part of the usteer strace
(I attach a longer portion as a file.)
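For anyone wanting to line up crash times against the strace output, here is a minimal sketch that pulls kernel timestamps out of saved dmesg text and computes the gaps between crashes. It assumes the usual `[ seconds.micros]` dmesg prefix and treats the "Copy Engine register dump" line quoted above as appearing once per crash. That is an assumption, not something confirmed in this thread. The function names are made up for illustration.

```python
import re

# Kernel log lines look like "[ 1457.436154] ath10k_pci 0000:00:00.0: ..."
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def crash_times(dmesg_text, marker="Copy Engine register dump"):
    """Return kernel timestamps (in seconds) of lines containing `marker`.

    `marker` is assumed to show up once per firmware crash.
    """
    times = []
    for line in dmesg_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m and marker in m.group(2):
            times.append(float(m.group(1)))
    return times


def intervals(times):
    """Return the gaps, in seconds, between consecutive timestamps."""
    return [round(b - a, 6) for a, b in zip(times, times[1:])]
```

Calling `intervals(crash_times(text))` on a saved dmesg capture gives the spacing between crashes; the same `marker` argument can be pointed at a custom mark written to /dev/kmsg.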
I finally had some time to do more experiments, and I believe the problem is related to management frames. Still, if usteer is not running then every mode is fine, so I guess it might be related to the "steering" done to the client via netlink? I have yet to try a different firmware version to see if the behavior changes.
I spent my evening testing different firmwares. To keep it brief, here is a list of the ones I tested that still crashed: (version as reported, source URL, sha256 sum of firmware file)
The oldest firmware from the lede series did "work" (that is, it did not crash), but I suspect that is because it is so old that it could not do what the driver wanted from it. In syslog I got messages similar to this:
However, what did work for me was using the NRCC firmware with the options suggested in the manual (vdevs = 4, peers = 80). I don't know whether I can declare it 100% stable, but so far it has not crashed with usteer running and management frame protection enabled.
To me this is obviously testing a black box, but maybe this data will help connect the dots and get to the bottom of this.
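If someone wants to repeat the comparison and record sha256 sums the same way, a small sketch using Python's standard `hashlib` could look like this. The function name and any path passed to it are hypothetical; nothing here is specific to ath10k.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex sha256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Reading in chunks keeps memory use flat, which matters little for firmware-sized files but makes the helper safe to reuse on larger images.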
This is good detective work. The nrcc option compiles out the firmware logic that swaps rate-ctrl memory to the host. It decreases the number of stations that can connect, but it may be a reasonable workaround. I am not sure I feel like spending the time to try to fix the root problem if it is down in the memory-swapping logic. Does usteer + pmf/mfp work on stock ath10k firmware on this platform?
OK, I tested some more firmware files, but I think that the nrcc thing might have been a red herring and that this is more of a random memory-related issue. If I were to guess, I'd say that the various firmware versions have slightly different memory layouts, which makes them more or less susceptible to crashing. My environment (usteer + mfp + custom build) is a good catalyst for the problem (a crash within a minute or two), but I fully expect to eventually see a "good" firmware crash as well.
Some observations: As I mentioned previously, firmware
worked fine for me, so I decided to try the version without
and this one was unstable. I am not sure what the difference is; it does not seem to be documented anywhere. The firmware:
was stable when run with from (I presume) "latest stable" series those firmwares were OK:
and this one wasn't:
And yes, the latest official firmware does work OK (tested with ath10k-ct driver):
There isn't much consistency to all of this, and I doubt there is any really useful information here. Sorry for wasting your time. I am still hoping for a fix in the future, but in the meantime I will probably pick a firmware that passed my initial test and see how it behaves "in production".
Can reproduce crashes on qca988x with DAWN (instead of usteer) as well, on Linux 5.15.98
Hello,
I've been experimenting with OpenWrt on QCA9558, specifically the OpenMesh MR1750 AP, to see whether I can use it as a reasonably fast, reliable, modern access point. I have run into a situation where I repeatably (every 30 s or so) get firmware crashes if I am running OpenWrt's usteer daemon. When I stop the usteer service, the wifi stack seems to be rock solid. I have not (yet) investigated how this service interacts with the wifi driver, but this might be a clue anyway.
At the moment I am running a custom OpenWrt build (minimal kernel, ASLR disabled and so on, to check whether I can gain any performance this way), so I can't be sure that this applies to more common setups, but if the firmware really crashes because of usteer then it might.
Regarding non-default options in the wireless config, I have:
crash frequency: (usteer running [stock config], single station connected)
single occurrence in dmesg:
binary dump:
crash-ath10k-ct.bin.gz