
Supporting fractional cpu count on the ray runner #3808

Open
NellyWhads opened this issue Feb 14, 2025 · 2 comments
Labels
enhancement New feature or request p1 Important to tackle soon, but preemptable by p0

Comments

@NellyWhads

Is your feature request related to a problem?

I have been attempting to use a rather useful feature of ray distributed - fractional CPU resource requests.

When this didn't work, I dug through the code base and found this:

```python
def _get_ray_task_options(resource_request: ResourceRequest) -> dict[str, Any]:
```

Is there anything Daft can do to overcome this? Could I get some more information on why this is a limitation?

To work around this, I have to set up a thread pool within each UDF and manually tune the partitioning logic to achieve the functionality.
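The workaround described above can be sketched roughly as follows. This is an illustrative, minimal version (not actual Daft code): `process_row` and the pool size are placeholder assumptions, and in practice this body would live inside a `@daft.udf`-decorated function.

```python
# Hypothetical sketch of the workaround: run a thread pool inside each
# UDF invocation so that a single 1-CPU Ray task fans work out across
# threads, approximating fractional per-row CPU usage.
from concurrent.futures import ThreadPoolExecutor


def process_row(x):
    # Placeholder for the real per-row work done inside the UDF.
    return x * 2


def my_udf_body(batch, threads_per_task=4):
    # The user, not the scheduler, must tune threads_per_task against
    # the partition size and the CPUs Ray actually grants the task.
    with ThreadPoolExecutor(max_workers=threads_per_task) as pool:
        return list(pool.map(process_row, batch))


print(my_udf_body([1, 2, 3]))  # [2, 4, 6]
```

The tuning burden is exactly the complaint: the thread count has to be kept consistent with the partitioning logic by hand.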

Describe the solution you'd like

True support for fractional CPU utility as indicated by the type hints here: https://www.getdaft.io/projects/docs/en/stable/api_docs/udf.html

The `float` type hint suggests fractional resource requests are supported equally for CPUs and GPUs. I believe this is a rather fundamental feature of the Ray runner!

Describe alternatives you've considered

Joblib parallelization within each UDF - this makes the user responsible for tuning resource allocation between the Ray scheduler and each task's thread pool.

Additional Context

No response

Would you like to implement a fix?

No

@NellyWhads NellyWhads added enhancement New feature or request needs triage labels Feb 14, 2025
@jessie-young jessie-young added p1 Important to tackle soon, but preemptable by p0 and removed needs triage labels Feb 14, 2025
@jessie-young
Contributor

Thanks for raising this - we're going to explore a few options and will get back to you!

@jessie-young jessie-young self-assigned this Feb 14, 2025
@jessie-young
Contributor

In Daft, we enforce a minimum of 1 CPU for UDFs even though Ray technically allows for fractional CPU allocation (< 1). There are two main reasons for this:

Function Fusion and Resource Requirements

When a UDF is part of a pipeline where it's fused with other operations, we take the maximum resource requirements across all steps. Since all other operations in Daft require at least 1 CPU, setting a UDF to use < 1 CPU would effectively be ignored as it would be raised to 1 CPU anyway due to the max resource requirement calculation.
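As a sketch of that max-resource calculation (not Daft's actual implementation; the class and function names here are illustrative):

```python
# Illustrative model of fusing resource requests: fused operations take
# the element-wise maximum of all their steps' requirements, so a
# fractional CPU request is raised to the largest request in the pipeline.
from dataclasses import dataclass


@dataclass
class ResourceRequest:
    num_cpus: float = 1.0
    num_gpus: float = 0.0


def fuse_requests(requests):
    """Combine resource requests for fused steps by taking the max of each resource."""
    return ResourceRequest(
        num_cpus=max(r.num_cpus for r in requests),
        num_gpus=max(r.num_gpus for r in requests),
    )


# A UDF asking for 0.5 CPUs, fused with a standard op that needs 1 CPU:
fused = fuse_requests([ResourceRequest(num_cpus=0.5), ResourceRequest(num_cpus=1.0)])
print(fused.num_cpus)  # 1.0 -- the fractional request is effectively ignored
```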

Resource Thrashing Prevention

When Ray workers are allocated fractional CPUs, they can experience "thrashing" - where multiple workers compete for the same CPU resources, leading to frequent context switches and degraded performance. By enforcing a minimum of 1 CPU, we ensure more stable and predictable performance for UDF execution.

So while it might seem limiting to not allow < 1 CPU allocation, this decision helps ensure consistent performance and avoids potential issues with resource allocation and scheduling in the Ray backend.

However, you raise an interesting question - what's the difference between this approach and setting Ray's CPU resources to < 1.0?

The key difference is control and locality. When you manage threading within a UDF:

  • The threads share the same process space and memory
  • You have explicit control over the threading behavior
  • Context switching happens within your process, managed by the OS thread scheduler

In contrast, when Ray allocates fractional CPUs:

  • Multiple separate Ray worker processes compete for CPU resources
  • The scheduling is handled by Ray's distributed scheduler
  • Context switching happens at the process level, which is more expensive
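The locality point above can be illustrated with a minimal example: threads inside one process mutate shared memory directly, with no serialization or inter-process communication, whereas separate Ray worker processes would each hold their own copy of the data.

```python
# Threads in a single process share the same memory: all eight workers
# below increment the same dict in-place, coordinated only by an
# in-process lock. Separate processes would need IPC and serialization.
import threading

counter = {"n": 0}
lock = threading.Lock()


def work():
    with lock:
        counter["n"] += 1  # mutates shared state directly


threads = [threading.Thread(target=work) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter["n"])  # 8 -- every thread saw and updated the same dict
```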

Regarding your use case - could you share more details about what you're trying to achieve? Understanding your specific requirements (e.g., throughput goals, memory constraints, data processing patterns) would help us suggest the most appropriate approach. There might be other optimization strategies we could explore. For example, if it’s a common pattern, potentially implementing your logic as a native Daft operation would provide better performance than a UDF.
