ExecuteUpdate on Oracle and sub-optimal query #30742

anoordover · 2023-04-21T20:12:44Z

In oracle you have two ways to update multiple rows from one table with data from another table:

use an ordinary update statement (simpler and more frequently used SQL statement)
use a so called merge statement (developers need to get used to these kind of statements)

Both these statement yield to a different execution plan where the second one is better when updating and querying a lot of rows.
Before EF Core version 7 you could only do these kind of update using raw SQL.
From EF Core version 7 you can also use ExecuteUpdate to do these kinds of updates.

My opinion is that when we choose raw SQL because of this issue we need to measure if this issue is still viable and change our code to use ExecuteUpdate when this situation isn't viable anymore (Oracle DB optimizer, EF Core or the Oracle EF Core implementation might change).
Or we might accept the performance impact and use ExecuteUpdate so me en my colleagues only need to have knowledge about EF Core.

What would you do:

Go for the optimal raw SQL statement and check the viability of the reason behind this choice
Go for the optimal raw SQL and don't do any measurements after the implementation
Go for the sub-optimal solution using ExecuteUpdate and trust that implementors will eventually fix these performance problems and go for a single technology

roji · 2023-04-21T21:02:57Z

@anoordover I'm familiar with MERGE from SQL Server and PostgreSQL, and as far as I know it isn't typically used for the same things that UPDATE is used; while UPDATE can be used to bulk-update many rows in a single table, MERGE is typically used to merge data from one table (or from the client) into another table.

Can you please provide more context on exactly what you're doing, showing the (apparently inferior) UPDATE and (apparently superior) MERGE, as well as the plans for these?

Go for the sub-optimal solution using ExecuteUpdate and trust that implementors will eventually fix these performance problems and go for a single technology

I'd say there's very little chance that ExecuteUpdate will be changed to use MERGE under the covers (although we're not the ones maintaining the Oracle driver). ExecuteUpdate really models the SQL UPDATE statement, MERGE is a different statement with its own syntax etc.

anoordover · 2023-04-22T07:16:17Z

I will continue my investigation on Monday. But for now:
For now I think this might be true (I'm open to corrections...)
It might be that the execution plan of the MERGE statement is different from an ordinary UPDATE statement because the UPDATE statement updates all the (filtered) rows of the target table even if no matching row can be found in the source table (the field(s) will be updated to null?).
The MERGE statement only updates rows in the target table when there is a matching row in the source table.
I think that in the MERGE statement Oracle does a "hash join" on the two table to update the target table.

anoordover · 2023-04-22T07:25:21Z

When my previous comment is correct I think:

the executeupdate statement will never be executed as a MERGE statement
the Oracle execution plan of the UPDATE statement will not/never be the same as the execution plan of the MERGE statement
for optimized queries rawsql might still be needed
to prevent rawsql in my application something like a mergeupdate statement should be developed in EF Core (when feasible) ;)

When writing merge statements is difficult for a team they could use Open AI to rewrite an UPDATE statement to a MERGE statement ;)

roji · 2023-04-22T07:53:41Z

@anoordover I highly recommend bringing a concrete sample here with an actual UPDATE and MERGE which are equivalent, but where the MERGE is apparently more efficient - this is a bit too abstract to reason about without seeing actual SQL statements.

If there's indeed a category of cases where a SQL UPDATE can be transformed to a MERGE and that MERGE works more efficiently, then it may indeed make sense for ExecuteUpdate to to do that. If this is true for other databases which support MERGE (SQL Server, PostgreSQL), then this is something we could consider doing in EF Core itself. Otherwise, the Oracle team could add this optimization to their own provider.

But again, we need to see actual statements and plans.

anoordover · 2023-04-22T13:10:23Z

Important indexes and table:
INDIVIDUALCLAIM (100.000 rows):
  - unique CLAIMID (reference to a claim), REFERENCE (referencenumber for the individual claim)
INDIVIDUALPAYMENT (150.000 rows):
  - unique PAYMENT_ID (reference to a payment), INDIVIDUALCLAIM_REFERENCE (functional reference to the individual claim that is paid with this individual payment)
Both these tables have a master table (CLAIM for INDIVIDUALCLAIM and PAYMENT for INDIVIDUALPAYMENT)

UPDATE "XYZ"."INDIVIDUALPAYMENT" "d"
SET "d"."INDIVIDUALCLAIMID" = (
    SELECT "v"."ID"
    FROM "XYZ"."INDIVIDUALCLAIM" "v"
    WHERE (((("v"."CLAIMID" = 4) AND ("d"."INDIVIDUALCLAIM_REFERENCE" = "v"."REFERENCE"))))
        FETCH FIRST 1 ROWS ONLY)
WHERE "d"."PAYMENT_ID" = 3


Plan hash value: 2362689481

---------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name                                     | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT               |                                          |   154K|  4379K|   619K (25)| 00:00:25 |
|   1 |  UPDATE                        | INDIVIDUALPAYMENT                        |       |       |            |          |
|*  2 |   TABLE ACCESS FULL            | INDIVIDUALPAYMENT                        |   154K|  4379K|   583   (1)| 00:00:01 |
|*  3 |   VIEW                         |                                          |     1 |    26 |     3   (0)| 00:00:01 |
|*  4 |    WINDOW BUFFER PUSHED RANK   |                                          |     1 |    21 |     3   (0)| 00:00:01 |
|   5 |     TABLE ACCESS BY INDEX ROWID| INDIVIDUALCLAIM                          |     1 |    21 |     3   (0)| 00:00:01 |
|*  6 |      INDEX UNIQUE SCAN         | IX_INDIVIDUALCLAIM_CLAIMID_REFERENCE     |     1 |       |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

"   2 - filter(""d"".""PAYMENT_ID""=3)"
"   3 - filter(""from$_subquery$_003"".""rowlimit_$$_rownumber""<=1)"
   4 - filter(ROW_NUMBER() OVER ( ORDER BY  NULL )<=1)
"   6 - access(""v"".""CLAIMID""=4 AND ""v"".""REFERENCE""=:B1)"


MERGE INTO "XYZ"."INDIVIDUALPAYMENT" "d"
USING (
    SELECT "v"."ID", "v"."REFERENCE"
    FROM "XYZ"."INDIVIDUALCLAIM" "v"
    WHERE "v"."CLAIMID" = 4
) "v"
ON ("d"."INDIVIDUALCLAIM_REFERENCE" = "v"."REFERENCE")
WHEN MATCHED THEN UPDATE SET "d"."INDIVIDUALCLAIMID" = "v"."ID"
                  WHERE "d"."PAYMENT_ID" = 3;


Plan hash value: 2818844525

---------------------------------------------------------------------------------------------------
| Id  | Operation            | Name               | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|   0 | MERGE STATEMENT      |                    |     1 |   127 |       |  2194   (1)| 00:00:01 |
|   1 |  MERGE               | INDIVIDUALPAYMENT  |       |       |       |            |          |
|   2 |   VIEW               |                    |       |       |       |            |          |
|*  3 |    HASH JOIN         |                    |     1 |   122 |  3408K|  2194   (1)| 00:00:01 |
|*  4 |     TABLE ACCESS FULL| INDIVIDUALCLAIM    |   105K|  2166K|       |   617   (1)| 00:00:01 |
|   5 |     TABLE ACCESS FULL| INDIVIDUALPAYMENT  |   154K|    14M|       |   583   (1)| 00:00:01 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

"   3 - access(""d"".""INDIVIDUALCLAIM_REFERENCE""=""v"".""REFERENCE"")"
"   4 - filter(""v"".""CLAIMID""=4)"

Please let me know if you need more information

In the plan for the MERGE-statement you see the HASH JOIN that I mentioned before.

Thinking about these differences triggered me to think about what I would expect when executing an executeupdate statement that uses a source table for updating a field in the target table. My question is: Is it the expected behavior that also the rows in the target table get updated when there is no row being found in the source table or should you be forced to define a where clause on the target table so that only rows will be selected having a matching row in the source table?

anoordover · 2023-04-22T13:27:26Z

Adding a where clause on the target table in the UPDATE statement gives me the plan added below.
So maybe it isn't an issue...

Plan hash value: 2995876849
 
-----------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name                                     | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT               |                                          |     1 |    45 |       |  1179   (1)| 00:00:01 |
|   1 |  UPDATE                        | INDIVIDUALPAYMENT                        |       |       |       |            |          |
|*  2 |   HASH JOIN RIGHT SEMI         |                                          |     1 |    45 |  2896K|  1172   (1)| 00:00:01 |
|*  3 |    INDEX FAST FULL SCAN        | IX_INDIVIDUALCLAIM_CLAIMID_REFERENCE     |   105K|  1650K|       |   147   (1)| 00:00:01 |
|*  4 |    TABLE ACCESS FULL           | INDIVIDUALPAYMENT                        |   154K|  4379K|       |   583   (1)| 00:00:01 |
|*  5 |   VIEW                         |                                          |     1 |    26 |       |     3   (0)| 00:00:01 |
|*  6 |    WINDOW BUFFER PUSHED RANK   |                                          |     1 |    21 |       |     3   (0)| 00:00:01 |
|   7 |     TABLE ACCESS BY INDEX ROWID| INDIVIDUALCLAIM                          |     1 |    21 |       |     3   (0)| 00:00:01 |
|*  8 |      INDEX UNIQUE SCAN         | IX_INDIVIDUALCLAIM_CLAIMID_REFERENCE     |     1 |       |       |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
"   2 - access(""S"".""REFERENCE""=""d"".""INDIVIDUALCLAIM_REFERENCE"")"
"   3 - filter(""S"".""CLAIMID""=4)"
"   4 - filter(""d"".""PAYMENT_ID""=3)"
"   5 - filter(""from$_subquery$_003"".""rowlimit_$$_rownumber""<=1)"
   6 - filter(ROW_NUMBER() OVER ( ORDER BY  NULL )<=1)
"   8 - access(""v"".""CLAIMID""=4 AND ""v"".""REFERENCE""=:B1)"

roji · 2023-04-22T13:32:39Z

@anoordover OK. I'd be surprised if Oracle provided no way to do such a simple UPDATE efficiently, forcing users to resort to MERGE syntax instead.

Specifically for your first UPDATE above, that contains a correlated subquery, i.e. it references d.INDIVIDUALCLAIM_REFERENCE from the outer table, meaning that the subquery needs to run for each row in the outer table (INDIVIDUALPAYMENT). This is indeed always bad for perf, also for non-UPDATE queries and on other databases. Most databases allow expressing JOINs instead - just like you can do in regular queries; that does the same thing but much more efficiently. I'm no Oracle expert, so I'm not sure what's possible here.

Note that in any case, #4526 may introduce a way to express MERGE - although that issue still requires lots of careful thinking and design. If it really makes sense in the Oracle context to implement some types of ExecuteUpdate via MERGE rather than UPDATE, then the maintainers of the Oracle provider may be able to add that ability to their provider specifically.

anoordover · 2023-04-22T13:54:07Z

As a side note...
I still need to update my issue on the Oracle implementation.
The syntax mentioned in that issue assigns the same alias to the source and target table.
But I will investigate this later this weekend or on Monday. This seems to be caused by the fact that we prefix all our tables with the schema-name, e.g. XYZ_CLAIM.

Did you notice the temp storage usage in the last plan and the MERGE plan?

Second side note: Oracle says that they don't support executeupdate yet (but they don't throw an exception)

roji · 2023-04-23T13:33:25Z

@anoordover it seems we're discussing several different things (specifically in oracle/dotnet-db-samples#270).

AFAIK basic ExecuteUpdate does work with Oracle. You seem to have encountered a specific bug with the invalid SQL - that's likely to be a problem on the Oracle side (though not sure), I'd advise tracking that separately from the other ExecuteUpdate-related problems.

michiproep · 2023-07-17T06:41:35Z

@roji It seems you are right. Simple executeUpdates work but if you have a complex where-query, Oracle actually generates a
UPDATE...
SET
FROM (...)
WHERE...
statement where the FROM-syntax is not even supported by Oracle.

roji · 2023-07-17T07:28:04Z

@michiproep thanks for confirming - it's probably a good idea to make Oracle aware of this.

anoordover added the customer-reported label Apr 21, 2023

ajcvickers added the closed-external label May 3, 2023

ajcvickers closed this as not planned Won't fix, can't repro, duplicate, stale May 3, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ExecuteUpdate on Oracle and sub-optimal query #30742

ExecuteUpdate on Oracle and sub-optimal query #30742

anoordover commented Apr 21, 2023

roji commented Apr 21, 2023

anoordover commented Apr 22, 2023 •

edited

Loading

anoordover commented Apr 22, 2023 •

edited

Loading

roji commented Apr 22, 2023

anoordover commented Apr 22, 2023 •

edited

Loading

anoordover commented Apr 22, 2023

roji commented Apr 22, 2023

anoordover commented Apr 22, 2023 •

edited

Loading

roji commented Apr 23, 2023

michiproep commented Jul 17, 2023

roji commented Jul 17, 2023

ExecuteUpdate on Oracle and sub-optimal query #30742

ExecuteUpdate on Oracle and sub-optimal query #30742

Comments

anoordover commented Apr 21, 2023

roji commented Apr 21, 2023

anoordover commented Apr 22, 2023 • edited Loading

anoordover commented Apr 22, 2023 • edited Loading

roji commented Apr 22, 2023

anoordover commented Apr 22, 2023 • edited Loading

anoordover commented Apr 22, 2023

roji commented Apr 22, 2023

anoordover commented Apr 22, 2023 • edited Loading

roji commented Apr 23, 2023

michiproep commented Jul 17, 2023

roji commented Jul 17, 2023

anoordover commented Apr 22, 2023 •

edited

Loading

anoordover commented Apr 22, 2023 •

edited

Loading

anoordover commented Apr 22, 2023 •

edited

Loading

anoordover commented Apr 22, 2023 •

edited

Loading