Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

llvm-objcopy produces wrong p_offset when PT_INTERP/PT_LOAD offsets are equal #79887

Closed
chestnykh opened this issue Jan 29, 2024 · 4 comments · Fixed by #80562
Closed

llvm-objcopy produces wrong p_offset when PT_INTERP/PT_LOAD offsets are equal #79887

chestnykh opened this issue Jan 29, 2024 · 4 comments · Fixed by #80562

Comments

@chestnykh
Copy link
Contributor

chestnykh commented Jan 29, 2024

Considering the following layout

Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  PHDR           0x000040 0x0000000000000040 0x0000000000000040 0x000268 0x000268 R   0x8
  INTERP         0x195000 0x0000000000195000 0x0000000000195000 0x00000f 0x00000f R   0x1
      [Requesting program interpreter: ]
  LOAD           0x000000 0x0000000000000000 0x0000000000000000 0x1948c9 0x1948c9 R E 0x1000
  LOAD           0x195000 0x0000000000195000 0x0000000000195000 0x0df960 0x0df960 R   0x1000

llvm-objcopy produces ELF with the output below:

Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  PHDR           0x000040 0x0000000000000040 0x0000000000000040 0x000268 0x000268 R   0x8
  INTERP         0x1948c9 0x0000000000195000 0x0000000000195000 0x00000f 0x00000f R   0x1
      [Requesting program interpreter: ]
  LOAD           0x000000 0x0000000000000000 0x0000000000000000 0x1948c9 0x1948c9 R E 0x1000
  LOAD           >>>>0x1948c9<<<< 0x0000000000195000 0x0000000000195000 0x0df960 0x0df960 R   0x1000

The p_offset of the 2nd PT_LOAD seg has incorrect value obtained from p_memsz of the first PT_LOAD segment
GNU objcopy (on the real ELF i have got) aligns p_offset to p_align of PT_LOAD segment that produces 0x195000 as the value of p_offset for the 2nd PT_LOAD.

chestnykh added a commit to chestnykh/llvm-project that referenced this issue Jan 29, 2024
- When copying the ELF whose layout was described in the test
the llvm-objcopy breaks the last PT_LOAD segment alignment
of p_offset by assignment the `p_filesz` value of prev segment to it.
This patch fixes this case. llvm#79887
@llvmbot
Copy link
Member

llvmbot commented Jan 29, 2024

@llvm/issue-subscribers-tools-llvm-objcopy-strip

Author: Dmitriy Chestnykh (chestnykh)

Considering the following layout ``` Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align PHDR 0x000040 0x0000000000000040 0x0000000000000040 0x000268 0x000268 R 0x8 INTERP 0x195000 0x0000000000195000 0x0000000000195000 0x00000f 0x00000f R 0x1 [Requesting program interpreter: ] LOAD 0x000000 0x0000000000000000 0x0000000000000000 0x1948c9 0x1948c9 R E 0x1000 LOAD 0x195000 0x0000000000195000 0x0000000000195000 0x0df960 0x0df960 R 0x1000 ``` llvm-objcopy produces ELF with the output below: ``` Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align PHDR 0x000040 0x0000000000000040 0x0000000000000040 0x000268 0x000268 R 0x8 INTERP 0x1948c9 0x0000000000195000 0x0000000000195000 0x00000f 0x00000f R 0x1 [Requesting program interpreter: ] LOAD 0x000000 0x0000000000000000 0x0000000000000000 0x1948c9 0x1948c9 R E 0x1000 LOAD >>>>0x1948c9<<<< 0x0000000000195000 0x0000000000195000 0x0df960 0x0df960 R 0x1000
The p_offset of the 2nd PT_LOAD seg has incorrect value obtained from p_memsz of the first PT_LOAD segment
GNU objcopy (on the real ELF i have got) aligns p_offset to p_align or PT_LOAD segment that produces 0x195000 as the value of p_offset for the 2nd PT_LOAD.
</details>

@jh7370
Copy link
Collaborator

jh7370 commented Jan 30, 2024

It surprised me that something so fundamental as this has only just arisen, so I thought I'd look into it a bit more, to make sure the original report is actually valid.

The ELF gABI states:

As ``Program Loading'' describes in this chapter of the processor supplement, loadable process segments must have congruent values for p_vaddr and p_offset, modulo the page size. This member gives the value to which the segments are aligned in memory and in the file. Values 0 and 1 mean no alignment is required. Otherwise, p_align should be a positive, integral power of 2, and p_vaddr should equal p_offset, modulo p_align.

The segment in question is loadable (by virtue of it being a PT_LOAD segment). It also isn't nested inside another PT_LOAD segment (which would cause separate issues), so it should be able to move independently of other PT_LOAD segments. I think we can assume for now that p_align == the page size, as llvm-objcopy doesn't know the specific page sizes of systems. Clearly p_offset (0x195000) % 0x1000 and p_vaddr (0x195000) % 0x1000 are congruent (i.e. identical), whereas the values after are not, so this does seem to be a bug.

I'll comment more on the PR.

@chestnykh
Copy link
Contributor Author

chestnykh commented Jan 30, 2024

It surprised me that something so fundamental as this has only just arisen, so I thought I'd look into it a bit more, to make sure the original report is actually valid.

The ELF gABI states:

As ``Program Loading'' describes in this chapter of the processor supplement, loadable process segments must have congruent values for p_vaddr and p_offset, modulo the page size. This member gives the value to which the segments are aligned in memory and in the file. Values 0 and 1 mean no alignment is required. Otherwise, p_align should be a positive, integral power of 2, and p_vaddr should equal p_offset, modulo p_align.

The segment in question is loadable (by virtue of it being a PT_LOAD segment). It also isn't nested inside another PT_LOAD segment (which would cause separate issues), so it should be able to move independently of other PT_LOAD segments. I think we can assume for now that p_align == the page size, as llvm-objcopy doesn't know the specific page sizes of systems. Clearly p_offset (0x195000) % 0x1000 and p_vaddr (0x195000) % 0x1000 are congruent (i.e. identical), whereas the values after are not, so this does seem to be a bug.

I'll comment more on the PR.

To reproduce the bug you can save the output of yaml2obj from my test from the PR to the file and read it using readelf -Wl

@MaskRay
Copy link
Member

MaskRay commented Feb 3, 2024

#79889 (review)

I probably need to give some more thought, but I'm fairly confident the proposed fix is incorrect, as it would result in a segment that has a parent (i.e. one where there is some overlap with another segment) moving independently of its parent. Note that according to the comment immediately above the main change in this PR that states that a parent's offset should have already been assigned (and therefore will be correctly aligned), before this point.

I agree that the changed alignment in #79889 is not desired.

My suspicion (but I haven't checked) is that the algorithm for ordering and determining parent/child relationship is incorrect and/or the method to determine the alignment to use is incorrect. In particular, I suspect that the PT_INTERP segment in your original case is being treated as the parent of the second PT_LOAD and therefore its p_align value is the one being used to align the PT_INTERP, which in turn means the second PT_LOAD is also moved with PT_INTERP.

Agreed.

I think the correct fix would be to use the highest alignment in a set of overlapping segments when aligning the first segment in that list, adjusting its position from the "aligned" position according to how far off from the aligned value it was before copying. Alternatively, we treat the parent segment as the highest-alignment section, and adjust the other segments accordingly with it. There are some intricacies involved in both cases that we need to be careful of though.

I think we should just special case PT_LOAD, since it's possible for PT_TLS/PT_GNU_RELRO to have the same alignment as a PT_LOAD. With a PHDRS linker script command, PT_LOAD can be moved after PT_TLS/PT_GNU_RELRO.

Created #80562

@MaskRay MaskRay changed the title llvm-objcopy produces wrong p_offset in the copy of ELF llvm-objcopy produces wrong p_offset when PT_INTERP/PT_LOAD offsets are equal Feb 3, 2024
MaskRay added a commit that referenced this issue Feb 20, 2024
…ual (#80562)

(#79887) When the offset of a PT_INTERP segment equals the offset of a
PT_LOAD segment, we consider that the parent of the PT_LOAD segment is
the PT_INTERP segment. In `layoutSegments`, we place both segments to be
after the current `Offset`, ignoring the PT_LOAD alignment.

This scenario is possible with fixed section addresses, but doesn't
happen with default linker layouts (.interp precedes other sections and
is part of a PT_LOAD segment containing the ELF header and program
headers).

```
% cat a.s
.globl _start; _start: ret
.rodata; .byte 0
.tdata; .balign 4096; .byte 0
% clang -fuse-ld=lld a.s -o a -nostdlib -no-pie -z separate-loadable-segments -Wl,-Ttext=0x201000,--section-start=.interp=0x202000,--section-start=.rodata=0x202020,-z,nognustack
% llvm-objcopy a a2
% llvm-readelf -l a2   # incorrect offset(PT_LOAD)
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  PHDR           0x000040 0x0000000000200040 0x0000000000200040 0x0001c0 0x0001c0 R   0x8
  INTERP         0x001001 0x0000000000202000 0x0000000000202000 0x00001c 0x00001c R   0x1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x000000 0x0000000000200000 0x0000000000200000 0x000200 0x000200 R   0x1000
  LOAD           0x001000 0x0000000000201000 0x0000000000201000 0x000001 0x000001 R E 0x1000
//// incorrect offset
  LOAD           0x001001 0x0000000000202000 0x0000000000202000 0x000021 0x000021 R   0x1000
  LOAD           0x002000 0x0000000000203000 0x0000000000203000 0x000001 0x001000 RW  0x1000
  TLS            0x002000 0x0000000000203000 0x0000000000203000 0x000001 0x000001 R   0x1000
  GNU_RELRO      0x002000 0x0000000000203000 0x0000000000203000 0x000001 0x001000 R   0x1000
```

The same issue occurs for PT_TLS/PT_GNU_RELRO if we PT_TLS's alignment
is smaller and we place the PT_LOAD after PT_TLS/PT_GNU_RELRO segments
(not linker default, but possible with a `PHDRS` linker script command).

Fix #79887: when two segments have the same offset, order the one with a
larger alignment first. In the previous case, the PT_LOAD segment will
go before the PT_INTERP segment. In case of equal alignments, it doesn't
matter which segment is treated as the parent segment.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants