[d3d9] Optimize SWVP devices #4274
I can throw this at a bunch of games of course, but I think it would be very helpful (since enhancing the HUD is a trend now) to also add the type of VP (based on the device type, and on m_isSWVP in the case of Mixed) as an element of the D3D9 HUD. The use of D3DCREATE_SOFTWARE_VERTEXPROCESSING devices is AFAIK very limited even in d3d8, and they are generally only used as a fallback in case HW or Mixed modes fail. Only a very limited set of games let you pick which one to use.
This PR also properly fixes AlpyneDreams#179, on which we had more or less given up in d8vk. The Supreme Ruler d3d8 games can now be played with correct text rendering even without the "Nvidia driver workaround" configuration option (which affected performance very negatively).
It has nothing to do with that.
Needs lots of testing.
This makes D3D9 devices that are configured to always use software vertex processing (so not MIXED) always use the late per-draw buffer upload path. We copy the vertex data that each specific draw accesses to a temporary buffer and render from that, similar to UP draws.
This makes sense because games that use pure SWVP expect vertex processing to be synchronous, which has led to both bugs and performance problems. For example, we used to run into issues when respecting NOOVERWRITE, or ended up with dozens or even hundreds of queue syncs per frame. Considering that SWVP is supposed to run on the CPU, the number of vertices is hopefully small.
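To illustrate the idea only (this is not the code in this PR), here is a minimal sketch of copying just the vertices that a single non-indexed draw reads into a temporary buffer. The names `DrawRange`, `computeDrawRange` and `copyDrawVertices` are hypothetical, and the sketch assumes a CPU-side copy of the vertex buffer is available.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper type, not a DXVK structure: the byte range of a
// vertex buffer that one non-indexed draw reads.
struct DrawRange {
  size_t offset; // first byte read by the draw
  size_t size;   // number of bytes read by the draw
};

// Computes the accessed range from the draw parameters and the vertex stride.
inline DrawRange computeDrawRange(uint32_t firstVertex, uint32_t vertexCount, uint32_t stride) {
  return { size_t(firstVertex) * stride, size_t(vertexCount) * stride };
}

// Copies only the accessed range into a temporary buffer. A real
// implementation would write into staging memory instead of a std::vector.
inline std::vector<uint8_t> copyDrawVertices(const std::vector<uint8_t>& source, DrawRange range) {
  // Clamp to the source size so an out-of-range draw cannot read past the end.
  if (range.offset >= source.size())
    return {};

  size_t size = std::min(range.size, source.size() - range.offset);
  const uint8_t* begin = source.data() + range.offset;
  return std::vector<uint8_t>(begin, begin + size);
}
```

With an assumed stride of 24 bytes, a draw of 4 vertices starting at vertex 8 would copy 96 bytes starting at byte offset 192, regardless of how large the source buffer is.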
I hope this won't impact more modern or demanding games.
The game that inspired this was Phantasmat, from this comment:
#4263 (comment)
It uses a single 96,000 byte vertex buffer (POOL = DEFAULT, USAGE = WRITEONLY, FVF != 0) and writes data to it before every single draw. Ofc it also doesn't specify a lock range, so we end up uploading the entire 96 KB buffer over and over again, run out of staging memory and then stall. It is a 2D game, so with this PR we upload 4 vertices for every draw.
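As a rough back-of-the-envelope comparison of the two upload strategies, the arithmetic can be sketched as below. The 96,000-byte buffer size and the 4 vertices per draw come from the description above; the function names, the 24-byte vertex stride and the 500 draws per frame are assumptions chosen purely for illustration.

```cpp
#include <cstdint>

// Bytes uploaded per frame when every draw re-uploads the whole buffer
// (what happens when the lock specifies no range).
constexpr uint64_t fullBufferUploadBytes(uint64_t bufferSize, uint64_t drawsPerFrame) {
  return bufferSize * drawsPerFrame;
}

// Bytes uploaded per frame when every draw uploads only the vertices it reads.
constexpr uint64_t perDrawUploadBytes(uint64_t verticesPerDraw, uint64_t stride, uint64_t drawsPerFrame) {
  return verticesPerDraw * stride * drawsPerFrame;
}

// Assumed values: 500 draws per frame and a 24-byte vertex stride.
static_assert(fullBufferUploadBytes(96000, 500) == 48000000, "full-buffer uploads");
static_assert(perDrawUploadBytes(4, 24, 500) == 48000, "per-draw uploads");
```

Under these assumed numbers, the per-draw path uploads about 48 KB per frame instead of about 48 MB, a factor of 1000 less staging memory traffic.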