-
Notifications
You must be signed in to change notification settings - Fork 12.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
clang arm assembler does not support pseudo-instruction 'adrl' #24724
Comments
assigned to @rengolin |
FYI - we had a patch to disable adrl in chromeos openssl. The CL is here - |
Can you provide an example assembly file and a command line that you use to reproduce? Here's the documentation about ADRL: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0489i/Cacecdga.html |
I just attached two test case .s files. Use the command: $ llvm-mc -triple=armv7-linux-gnu $ ./bin/llvm-mc -triple=armv7-linux-gnu ~/adr-invalid.s start:
var:
foo:
/usr/local/google/home/cmtice/adrl.s:22:58: error: unexpected token at start of statement X: |
Thanks Carolina! The examples were great, but after a while investigating how that would be actually implemented, it turns out it's not that simple. The assembler will need a lot of convincing to expand two instructions from an asm input stream. There are ways to do it, of course, but all of them are too big for such a small and isolated feature. For that reason, I want to understand more why can't you just use ADR+ADD/SUB or even with macros. Can you give me an example where it would be far worse? cheers, PS: I see you're using "adrEQl". This is pre-unified syntax and is not supported by LLVM (nor will be). Can you make sure you have ".syntax unified" on all assembly files, and move the predicate to the end on all instructions (ex. "adrlEQ")? This will make it easier for the future, when we sort out the ADRL problems. |
I am trying to compile code from someone else (OpenSSL, in the crypto part) which contains inline assembly, to optimize performance. The inline assembly contains the adrl instruction. The code compiles just fine with GCC (and the GNU as assembler). We would like to be able to use Clang/LLVM, but we need to be able to compile/link this third-party code. I am working on trying to add this capability to LLVM, but since I am learning the compiler code as I go, it's taking me a bit of time. The test cases I gave you came straight from the GNU assembler regression test suite. |
I know. Most IAS bug reports we get are from third-party optimised code.
I understand, but "compiling like GAS" is not our goal. We aim to produce a working assembler with modern and documented assembly constructs. The ADRL case is borderline, because it's documented in ARMASM, but not in the ARM manual, and GAS doesn't seem to support ADRL in Thumb2, either.
No worries, it does take time. :) I'm still working on the implementation, but it is not going to be pretty, and that's why we're trying to avoid it. If you could contact the original author and point out this bug report, I'm hoping he/she could have a look at that particular sequence and try and use something else. At the very least, pre-UAL code will not be accepted by LLVM's ARM assembler, so that'll block you further, even if we do add ADRL. In the end, contacting the original author will be necessary to progress with IAL for OpenSSL.
Right, then we may have a problem. GAS and GCC are GPLv3 and LLVM is not, so we can't reuse GNU code as-is due to licensing problems. Can you send the snippet in the OpenSSL source that you're having problems with, so we can deal with that first? I'll keep in mind the examples you gave already when creating a test for that, but I can't re-use it. |
After some investigation, I found a few problematic issues with implementing this in the LLVM assembler:
This second point will require that we add a multi-instruction fixup, because the immediate can be bigger than any possible encoding (the whole point of ADRL), so the relative result of the final expression will have to be split in two instructions, and encoded differently (depending on which opcode we use for each). That calculation cannot be done in the assembler parser (it's definitely not the place), but it's not yet supported in the streamer because of the type of non-constant multi-instruction fixups. I'm not aware of any such fixup type, so unless there is precedent and the streamer can cope with it, implementing ADRL would require us to ultimately change how we stream instructions into asm and obj. One way to do this is to keep the ADRL instruction valid with virtually any immediate value plus a special fixup, and always add a NOP after it early on in the assembler. This can be done with step 1 above. Later, the MCStreamer would recognise the fixup and re-use the NOP as an extra instruction if needed. Adding a NOP simplifies range calculations and is what's expected in case the ADRL can turn into a single ADD/SUB. Arnaud, James, do you guys know of anything like this being done elsewhere? |
patch for processInstruction to return multiple instructions I'll try to find you the openssl code with the inline asm. |
openssl file with 'adrl' instruction |
Hi Caroline, Thanks for the patch, it's similar to what I've done locally. But that's not enough to translate ADRL into ads and subs for the other reasons I outlined. This patch (or my local version), as it is, doesn't help anything. Did you check with the original developer about re-writing some of their ARM assembly? |
That single use of adrl could be removed by merging the Thumb2 and ARM codes together. I have the impression that the developer thought ADRL was a single instruction and was trying to "optimise" the code on ARM... Can you try to remove the #ifdef / #else adrl / #endif and compile again? --renato |
openssl isn't the only user of adrl... The only other use of adrl in an AOSP 6.0.0_r26 checkout is in Tremolo (vorbis decoder), attaching the relevant file for reference. |
Tremolo file that makes use of adrl I wonder how many more users of adrl exist out there. |
Hi Caroline, Sorry I moved this to me, I was told to look at it and thought no one was working on it. Just to wrap up my comments... I did some investigation and the changes in the assembler would be much bigger than the benefits would bring us, especially since there are only two uses we know of and one of them is easy to replace. Usually, when the instruction is not in the ISA documents, we tend to need a stronger reason to implement it. For instance, in ADRL, GNU tools only implement it for ARM, while ARMCC does also for Thumb2, but not 1. On ARM, the complications are in the MCStreamer (as I described earlier), but in Thumb2 you'll additionally have to worry about IT blocks (I'm guessing this is why GAS didn't do for Thumb2). Our rule of thumb for implementing things not documented in the ARM ARM is:
In the ADRL case, there is an alternative (used for Thumb2 in your attachment), the mapping is not clear ("whatever range two adds can cope with"), it will make code harder to understand, and is not extensively used in the wild. So, there's a pretty strong argument against implementing it. :) cheers, |
Bero, In the same asm file you uploaded there is a sequence of ADR+ADD on the same symbol that could be used on all examples, and all of them reading the symbol ahead of the instruction, so there's no doubt it'd be an ADD, not a SUB. cheers, |
We have patched the version of openssl that we use, to remove the 'adrl' instruction, and I have reached out to the original openssl developer to ask him if he could remove it from openssl (no response so far). You have convinced me that this bug is not worth trying to fix. Should I close it as won't fix, or just leave it open and remove myself from the 'assigned' field? |
Thank you! Having a working patch that is as good as the original code is 99% of the work done to convince people to update their sources. :)
It's up to you, really. There is a similar issue in Android (which is why I was looking at it, too), so feel free to assign the bug to me, and I'll close once we patch the Tremolo sources, or keep it for yourself until both accept the work upstream. Whatever works. :) |
Ok, it's all yours now! ;-) |
Bero, Can you confirm the changes on Tremolo's side? Should be as simple as in openssl. cheers, |
For the sake of completeness, the Tremolo case exhibits problems when the ADR range goes beyond 4096, so ADRL was used. There are two ways to overcome that without ADRL:
Change: ADRL r7, .Label To: ADR r7, .Temp1 This would only break if .Label move up past the ADR instructions, in which case you'd have to use a SUB instead, or if .Label-.TempX becomes too large for the ADD. Since ADD's range is a lot bigger than ADR, you could cope with a lot of new cases.
If the above doesn't help, repeating the maths will: ADR r7, .Label1 as many times as necessary. This is uglier, but works every time as long as you keep them close enough and often enough. |
Alternatives proposed, worked in the wild, so closing this bug. Please, refer to these solutions for future cases and let me know if they don't work on your particular case. |
I've done some more investigation to see what would be needed to make ADRL work in the assembler. In summary I agree with Renato, making ADRL work to the same specification as the GNU assembler and armasm would not be easy, and would likely be rejected upstream as not worth the cost. In general it is quite difficult to get difficult to implement pseudo instructions accepted unless they are officially required by the architecture or ABI. I've put some thoughts on implementation at the end [*] just in case anyone wants to build on them. I'm new to LLVM so it is possible that I've missed something. It isn't possible to replicate the full generality of ADRL in a macro but it is possible to come close. In effect we fix the rotations that are used for each of the instructions, whereas the assembler can use different rotations that could potentially reach further addresses if those addresses were suitably aligned. We also need to choose ahead of time whether to use add (label is defined later) or sub (label has already been defined). I won't claim that these are the optimal or clearest expressions of the macro, but it should be possible to build on them. For example if you know your labels are 4-byte aligned it should be possible to extend the range by altering the masks.
[*] A more difficult problem is that while the first instruction is in effect an ADR which can be transformed by a fixup into either an ADD or a SUB depending on the final value of the label post layout, the second instruction is either an ADD or SUB which cannot currently be transformed by a fixup post layout. The reason is that for ADD and SUB the opcode bits are defined by tablegen and a fixup can only OR extra bits and not clear existing ones. I think it could be possible to either:
Neither of the options fit well within the existing framework. I did consider (ab)using relaxation to change an ADDri into a SUBri and vice-versa, depending on the value of the immediate at layout time. Unfortunately because ADDri and SUBri are inverses the layout algorithm won't terminate. For implementing ADRL it could be possible to fix this by just doing an ADDri to SUBri and always emitting an ADDri, however this would also affect vanilla ADDri instructions in an asymmetric way (When expr isn't known till layout time, you could change an "ADD r0, r0, expr" to "SUB r0, r0, -expr" if expr later turned out to be negative, but you couldn't change "SUB r0, r0, expr" back to "ADD r0, r0, -expr"). It could also be possible to heuristically guess early whether an ADD/SUB is needed based on the expression. However we can't guarantee we would guess correctly in all cases. |
mentioned in issue #4440 |
Just hit this in u-boot's EFI support code. |
@valpackett I agree with previous analysis that supporting adrl in the integrated assembler would be complex. Is it feasible to change u-boot EFI to use an alternative code sequence? |
Not sure. Just documenting another place where it is used, I don't have an urgent need to build u-boot for armv7 with clang, just tried it because clang is my default compiler. |
Extended Description
"adrl rd, a_label" calculates a_label's address using pc-relative addressing, which means the section containing the insn and a_label can be put to any address. Without the support of this pseudo insn, it's more cumbersome to do this. "MOV32" or "LDR rd,=a_label" doesn't work in this scenario because both of these use absolute addressing.
The text was updated successfully, but these errors were encountered: