Add support in L1 kernel to handle interrupts posted from the IOMMU #406

ricardon · 2024-11-27T18:39:43Z

The current TDX architecture supports posted interrupts (from Host VMM and IOMMU devices) for TDX L1 VMM (VTL2). For passthrough devices owned by the L2 guest (VTL0), Hyper-V does not use posted interrupts. Today, each HW interrupt from an L2-owned device results in a TDEXIT to Hyper-V, which comes with a considerable performance cost.

The L1 kernel uses a single interrupt vector to handle the interrupts of passthrough L2 devices [1]. The L1 kernel relays the interrupt to the L2 guest using a bitfield in which each bit represents an L2 interrupt vector.
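A minimal sketch of the bitfield relay described above, written as plain user-space C for illustration. The names `proxy_irr`, `proxy_irr_set` and `proxy_irr_pop` are hypothetical, the 256-entry vector space is an assumption, and the atomic operations the kernel would need are omitted.

```c
#include <assert.h>
#include <stdint.h>

#define L2_VECTORS 256u   /* assumed size of the L2 vector space */
#define WORD_BITS  64u

/* Hypothetical pending bitmap: one bit per L2 interrupt vector. */
struct proxy_irr {
    uint64_t bits[L2_VECTORS / WORD_BITS];
};

/* Called from the single L1 handler: mark an L2 vector as pending. */
static void proxy_irr_set(struct proxy_irr *irr, unsigned int vector)
{
    assert(vector < L2_VECTORS);
    irr->bits[vector / WORD_BITS] |= (uint64_t)1 << (vector % WORD_BITS);
}

/* Clear and return the lowest pending L2 vector, or -1 if none is pending. */
static int proxy_irr_pop(struct proxy_irr *irr)
{
    for (unsigned int w = 0; w < L2_VECTORS / WORD_BITS; w++) {
        if (irr->bits[w] == 0)
            continue;
        unsigned int b = 0;
        while (((irr->bits[w] >> b) & 1u) == 0)
            b++;
        irr->bits[w] &= ~((uint64_t)1 << b);
        return (int)(w * WORD_BITS + b);
    }
    return -1;
}
```

With this layout, setting vectors 0x41 and 0x30 and then draining the bitmap yields 0x30 first, then 0x41.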

To mitigate the described performance cost, the IOMMU can deliver interrupts directly to the L1 kernel using posted interrupts. Under this scheme, a CPU interrupt vector needs to be assigned to each of the interrupts delivered to the L1 kernel.

As stated in [1], the synic is not modeled in Linux as an irq chip or irq domain, and the demultiplexed logical interrupts are not Linux IRQs. This also implies that interrupt handlers cannot be installed using request_irq().

Since each of the posted interrupts needs its own CPU vector, reserve a set of vectors that can be assigned dynamically to L2 guests upon request.*

openvmm userspace can request the assignment using an ioctl.

[1]. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/virt/hyperv/vmbus.rst?h=v6.12#n168

*A more elegant, albeit time-consuming, solution would be to implement the synic as an irq_chip or irq_domain to leverage the Linux IRQ infrastructure.

cperezvargas · 2024-12-20T17:23:32Z

@ricardon is this issue still open?

ricardon · 2024-12-20T18:01:07Z

Yes @cperezvargas, we are still working on it. We currently have a proof-of-concept with changes in Hyper-V and the OHCL-Linux-Kernel. The proof-of-concept is functional and we are verifying the performance improvements. After that, we need to clean up the code to have it ready for integration.

ricardon · 2025-02-06T23:12:52Z

Hi @chris-oo, balajimc55 and I were discussing what to do in case the L2 guest requests too many vectors from L1.

One option is for the L1 kernel to return an error if it runs out of vectors to remap. The error would cause the L2 guest to fail to launch, as errors from IOCTLs are fatal. This would be the easiest implementation.

A more complicated solution would be for user space to handle the error and fall back to the existing proxy interrupt mechanism. As per input from balajimc55, Hyper-V would need changes to propagate and handle fallback to regular proxy interrupts.
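A rough sketch of what the user-space fallback could look like, assuming the decision is made per interrupt. Everything here is hypothetical: `choose_irq_path`, the `remap_fn` callback standing in for the remap IOCTL, and the stub that simulates the kernel running out of vectors. It is not the openvmm code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum irq_path { IRQ_PATH_POSTED, IRQ_PATH_PROXY };

/*
 * Hypothetical stand-in for the remap IOCTL: returns 0 on success or a
 * negative errno value when the kernel cannot remap the interrupt.
 */
typedef int (*remap_fn)(unsigned int l2_vector);

/* Try the posted path first; on any error use the existing proxy path. */
static enum irq_path choose_irq_path(remap_fn try_remap, unsigned int l2_vector)
{
    assert(try_remap != NULL);
    if (try_remap(l2_vector) >= 0)
        return IRQ_PATH_POSTED;
    return IRQ_PATH_PROXY;
}

/* Stub for illustration: succeeds until stub_budget vectors are used up. */
static int stub_budget = 1;

static int stub_remap(unsigned int l2_vector)
{
    (void)l2_vector;
    if (stub_budget == 0)
        return -ENODEV;
    stub_budget--;
    return 0;
}
```

With a budget of one vector, the first interrupt is remapped and the second falls back to the proxy path instead of failing the launch.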

chris-oo · 2025-02-07T16:17:25Z

How many vectors are too many, in this case? I don't quite understand why we would need host changes to do the fallback to the userspace path. Isn't this the path we're using today for all devices (proxy interrupts)?

I'd prefer that we have the fallback path. Don't we need this in case the host doesn't support the remapping path, or is the remapping path entirely within the guest?

ricardon · 2025-02-07T18:53:36Z

The experiments we conducted used 6 vectors. I was thinking of reserving 20 to err on the safe side. Also, now that I think about it, I fail to see the need to propagate errors to the host. Perhaps L1 user space can see the posted interrupt IOCTL fail and fall back to the existing proxy interrupts. balajimc55, am I missing something?

I agree that we need the fallback path in case posted interrupts are not supported at all. My question was whether there is a use case for a mixture of proxy interrupts and posted interrupts. Or should we only use posted interrupts if supported?

ricardon · 2025-02-21T21:04:23Z

In my latest code, I reserve 32 CPU interrupt vectors. All CPUs have the same interrupt descriptor table. User space uses an IOCTL to request an interrupt to be remapped. The kernel will remember the mapping and raise the corresponding proxy interrupt when a CPU gets an interrupt on any of the reserved vectors. It will issue a warning if a CPU interrupt that has not been mapped is raised.

If the kernel runs out of vectors to map, it will return an -ENODEV error. At that point, user space needs to decide how to proceed.
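The scheme above can be sketched in plain user-space C as follows. This is an illustration under assumptions, not the actual kernel patch: `remap_alloc`, `remap_dispatch` and the base vector `FIRST_VECTOR` are hypothetical, and locking is omitted. A single global table is used because all CPUs share the same interrupt descriptor table.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

#define NR_RESERVED   32     /* reserved CPU vectors, as stated above */
#define FIRST_VECTOR  0x60   /* hypothetical base of the reserved range */
#define UNMAPPED      (-1)

/* remap[i] holds the L2 proxy vector for CPU vector FIRST_VECTOR + i. */
static int remap[NR_RESERVED];

static void remap_init(void)
{
    for (int i = 0; i < NR_RESERVED; i++)
        remap[i] = UNMAPPED;
}

/* What the IOCTL handler might do: returns a CPU vector or -ENODEV. */
static int remap_alloc(uint8_t l2_vector)
{
    for (int i = 0; i < NR_RESERVED; i++) {
        if (remap[i] == UNMAPPED) {
            remap[i] = l2_vector;
            return FIRST_VECTOR + i;
        }
    }
    return -ENODEV;   /* all reserved vectors are in use */
}

/* Interrupt path: translate a reserved CPU vector to its L2 proxy vector. */
static int remap_dispatch(int cpu_vector)
{
    int idx = cpu_vector - FIRST_VECTOR;

    assert(idx >= 0 && idx < NR_RESERVED);
    if (remap[idx] == UNMAPPED) {
        fprintf(stderr, "warning: unmapped posted vector 0x%x\n",
                (unsigned int)cpu_vector);
        return UNMAPPED;
    }
    return remap[idx];   /* the caller would raise this proxy interrupt */
}
```

After `remap_init()`, the 33rd call to `remap_alloc()` returns -ENODEV, which is the point where user space has to choose between failing and using proxy interrupts.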

chris-oo added the tdx TDX specific bugs or features label Dec 13, 2024

cperezvargas added this to the OpenHCL support for the GA of TDX CVMs in Azure milestone Dec 20, 2024

chris-oo added the ohcl-linux-kernel Changes that apply to the Linux kernel at OHCL-Linux-Kernel repo label Jan 14, 2025

cperezvargas assigned ricardon Jan 29, 2025
