Phi3.5V Server API Error: Forward step expected a PagedAttention input metadata. #756

ytnvj2 · 2024-09-06T19:38:49Z

Hi,
I am running the Phi-3.5 vision model on an Apple M2 MacBook with the command below:
`cargo run --release --features metal -- --port 1234 vision-plain -m microsoft/Phi-3.5-vision-instruct -a phi3v`

Everything loads fine, but when I query I get this error:

mistralrs_core::engine: prompt step - Model failed with error: Msg("Forward step expected a PagedAttention input metadata. This was not provided, please ensure that the scheduler config is correctly configured for PagedAttention.")

On the other hand, if I load the model using the Python API, everything works fine, but I am not sure how to enable ISQ in Python.
```python
from mistralrs import Runner, Which, ChatCompletionRequest, VisionArchitecture

runner = Runner(
    which=Which.VisionPlain(
        model_id="microsoft/Phi-3.5-vision-instruct",
        arch=VisionArchitecture.Phi3V,
    ),
)
```
Any idea what might be causing this and how to fix this?

JCRPaquin · 2024-09-06T23:26:05Z

Same thing happens with Phi-3.5-MoE.

For Phi-3.5-MoE: It looks like the default scheduler path can reach the NormalPipeline, which requires(?) paged attention metadata. Likely something similar is happening for the vision model.

JCRPaquin · 2024-09-06T23:29:25Z

Ah, so the quantized pipelines can handle the lack of paged attention metadata. I'm not clear on why the Python API is routing through a different part of the code, though.
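To make the distinction concrete, here is a minimal Python sketch of the two behaviors described above. It is not mistral.rs code: `PagedAttentionMetadata`, `strict_forward`, and `tolerant_forward` are hypothetical names chosen for illustration, and the real pipelines are written in Rust.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PagedAttentionMetadata:
    """Hypothetical stand-in for the block tables a paged KV cache needs."""
    block_tables: list


def strict_forward(metadata: Optional[PagedAttentionMetadata]) -> str:
    """Models a pipeline that insists on PagedAttention metadata."""
    if metadata is None:
        # Mirrors the wording of the error reported in this issue.
        raise RuntimeError(
            "Forward step expected a PagedAttention input metadata."
        )
    return "paged"


def tolerant_forward(metadata: Optional[PagedAttentionMetadata]) -> str:
    """Models a pipeline that falls back to a regular KV cache."""
    return "paged" if metadata is not None else "non-paged"
```

Under these assumptions, calling `strict_forward(None)` raises, while `tolerant_forward(None)` quietly takes the non-paged path, which would match the difference observed between the two kinds of pipeline.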

~~@ytnvj2 can you retry with a GGUF version of the model via the CLI?~~ ~~Using a GGUF might wholly disable vision.~~ The GGUF loading panics in an unrelated area.

JCRPaquin · 2024-09-06T23:57:45Z

Potential confounder: I might be misreading things, but it looks like Paged Attention is disabled for non-CUDA targets (including Metal).

mistral.rs/mistralrs-core/src/utils/mod.rs

Lines 225 to 233 in 366f9f0

    
```rust
#[cfg(all(feature = "cuda", target_family = "unix"))]
pub const fn paged_attn_supported() -> bool {
    true
}

#[cfg(not(all(feature = "cuda", target_family = "unix")))]
pub const fn paged_attn_supported() -> bool {
    false
}
```
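For readers following along, the gate above can be modelled roughly as follows. This is a sketch only: the real check is a compile-time `cfg` in Rust, represented here by runtime booleans, and `choose_scheduler` is a hypothetical helper, not a function in mistral.rs.

```python
def paged_attn_supported(cuda_enabled: bool, unix_target: bool) -> bool:
    """Runtime model of the compile-time gate: both conditions must hold."""
    return cuda_enabled and unix_target


def choose_scheduler(
    requested_paged: bool, cuda_enabled: bool, unix_target: bool
) -> str:
    """Hypothetical helper: pick the paged scheduler only when supported."""
    if requested_paged and paged_attn_supported(cuda_enabled, unix_target):
        return "paged"
    return "default"
```

A Metal build on macOS is a Unix target without the `cuda` feature, so in this model `choose_scheduler(True, False, True)` returns `"default"`. A forward step that still expects paged metadata in that situation would see the kind of mismatch reported here.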

EricLBuehler · 2024-09-07T02:36:47Z

Hi @JCRPaquin @ytnvj2, thank you for the details and all the investigation! Indeed, you are correct: the issue arises when there is a discrepancy there. This bug was caused by a regression from #753, and I just merged #759, which should fix it. Can you please confirm this works?

JCRPaquin · 2024-09-07T02:41:13Z

@EricLBuehler thanks for the quick response! I'll try the fix in a few hours.

ytnvj2 · 2024-09-07T05:34:27Z

@EricLBuehler Tried out the fix and it works for me now. Thank you for the quick fix. Appreciate it.

ytnvj2 added the bug (Something isn't working) label Sep 6, 2024

EricLBuehler mentioned this issue Sep 7, 2024

Patch bug when not using PagedAttention #759

Merged

EricLBuehler added the regression (Incorrect behavior or performance reduction introduced) label Sep 7, 2024

ytnvj2 closed this as completed Sep 7, 2024
