Use arrays to insert multiple records into a table #1194

YohDeadfall · 2020-01-12T18:15:34Z

As demonstrated in npgsql/npgsql#2779 (comment), INSERT INTO ... FROM unnest(...) gives nearly the same performance as COPY, but it allows the use of an auto-prepared statement. So it's worth moving away from batching and taking advantage of arrays.
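
For readers unfamiliar with the pattern, here is a minimal sketch of the statement shape being proposed. It is written in Python purely for illustration (the provider itself is C#), and the table `blogs`, its columns, and both helper names are hypothetical. The point it tries to show is that each column gets exactly one array parameter, so the SQL text stays the same for any number of rows, which is what would let a single prepared statement be reused.

```python
from typing import Any, List, Sequence, Tuple


def build_unnest_insert(table: str, columns: Sequence[Tuple[str, str]]) -> str:
    """Build one INSERT ... SELECT ... FROM unnest(...) statement.

    `columns` holds (column_name, postgres_type) pairs. Every column is bound
    to a single array parameter ($1, $2, ...), so the SQL text does not depend
    on how many rows are inserted. Identifiers are not quoted in this sketch.
    """
    names = ", ".join(name for name, _ in columns)
    arrays = ", ".join(
        f"${i}::{pg_type}[]" for i, (_, pg_type) in enumerate(columns, start=1)
    )
    return f"INSERT INTO {table} ({names}) SELECT * FROM unnest({arrays})"


def rows_to_column_arrays(rows: Sequence[Sequence[Any]]) -> List[List[Any]]:
    """Transpose row tuples into one list per column (the array parameters)."""
    if not rows:
        return []
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all rows must have the same number of values")
    return [list(column) for column in zip(*rows)]


# Hypothetical table: blogs(name text, rating integer)
sql = build_unnest_insert("blogs", [("name", "text"), ("rating", "integer")])
params = rows_to_column_arrays([("a", 1), ("b", 2), ("c", 3)])
print(sql)
print(params)
```

Inserting 3 rows or 3,000 rows produces the same statement text; only the lengths of the two array parameters change.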

roji · 2020-01-12T23:47:33Z

I think you mean this as an alternative to #113, right?

> but it allows the use of an auto-prepared statement

Do you see this as some sort of advantage of array inserting over COPY? If so, can you explain? COPY doesn't require statements in general and is still faster than any sort of INSERT unless I'm mistaken.

However, if I'm understanding npgsql/npgsql#2779 (comment) correctly, you are able to get back all the auto-generated IDs of inserted rows, right? If so, this is definitely a big advantage over COPY, which AFAIK doesn't support this, and it could indeed allow us to use array inserts as a replacement for the current batch insert mechanism.

YohDeadfall · 2020-01-13T05:05:16Z

Yes, an alternative. The only limitation of this method is that it doesn't allow inserting into columns whose type is an array. To work around this, a record should be inserted as a composite, which requires a DTO and mapping.

Another way is to wrap an array in a composite to get a jagged array. These composites should be generated during the migration process.

roji · 2020-01-13T13:42:32Z

This is starting to sound a bit complicated :) But maybe a temporary composite type would be OK... In any case, given the rareness of array types, even if we implement this only for non-array inserts that would still be a useful thing. Would you like to give this a try?

YohDeadfall · 2020-01-13T13:46:21Z

Sure, but probably I'll need some guidance here.

roji · 2020-01-13T19:00:07Z

Sure thing. The service responsible for sending commands in the update pipeline is IModificationCommandBatchFactory (so in our case NpgsqlModificationCommandBatch) - start by looking at that.

Note that the update pipeline takes care of all update types (insert, update, delete), and operations are already ordered in a certain way that's important; for example, a certain insert might only be legal after a delete (because they have the same unique value), so you can't just reorder things in order to batch them more efficiently. The approach should probably be to selectively identify batches of contiguous inserts of the same type, and apply the optimization there. I'd also definitely exclude the case of array property insertion (which requires additional complexity), at least from a first attempt.
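
The grouping rule described above can be sketched as follows. This is only an illustration in Python under stated assumptions, not the provider's actual code: `Command`, its fields, and `group_contiguous_inserts` are hypothetical names, and "same type" is taken to mean the same table and the same set of written columns. Commands are never reordered; only adjacent, compatible inserts are merged into one run, and inserts that touch an array column are left as single-command runs.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Command:
    kind: str                       # "insert", "update" or "delete"
    table: str
    columns: Tuple[str, ...]        # columns written by the command
    has_array_column: bool = False  # array-typed columns are excluded for now


def _batch_key(command: Command) -> Tuple[bool, str, Tuple[str, ...]]:
    # Only plain inserts without array columns are candidates for merging.
    batchable = command.kind == "insert" and not command.has_array_column
    return (batchable, command.table, command.columns)


def group_contiguous_inserts(commands: Iterable[Command]) -> List[List[Command]]:
    """Split an already ordered command list into runs without reordering it.

    Adjacent batchable inserts with the same table and columns form one run;
    every other command becomes a run of its own.
    """
    runs: List[List[Command]] = []
    for (batchable, _table, _columns), group in groupby(commands, key=_batch_key):
        members = list(group)
        if batchable:
            runs.append(members)
        else:
            runs.extend([member] for member in members)
    return runs


commands = [
    Command("insert", "blogs", ("name", "rating")),
    Command("insert", "blogs", ("name", "rating")),
    Command("delete", "blogs", ("id",)),
    Command("insert", "blogs", ("name", "rating")),
    Command("insert", "posts", ("title", "tags"), has_array_column=True),
    Command("insert", "posts", ("title", "tags"), has_array_column=True),
]
print([len(run) for run in group_contiguous_inserts(commands)])
```

The delete in the middle splits the `blogs` inserts into two separate runs, which preserves the ordering constraint: the third insert may only be legal once the delete has executed.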

roji · 2022-03-23T21:18:18Z

Duplicate of #113

YohDeadfall added the enhancement and performance labels Jan 12, 2020

YohDeadfall self-assigned this Jan 13, 2020

roji added this to the Backlog milestone May 24, 2020

YohDeadfall removed their assignment Feb 17, 2021

roji marked this as a duplicate of #113 Mar 23, 2022

roji closed this as completed Mar 23, 2022

roji removed this from the Backlog milestone Mar 23, 2022

roji removed the enhancement and performance labels Mar 23, 2022
