Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use arrays to insert mutliple records into a table #1194

Closed
YohDeadfall opened this issue Jan 12, 2020 · 6 comments
Closed

Use arrays to insert mutliple records into a table #1194

YohDeadfall opened this issue Jan 12, 2020 · 6 comments

Comments

@YohDeadfall
Copy link
Contributor

As demonstrated in npgsql/npgsql#2779 (comment) INSERT INTO ... FROM unnest(...) gives near the same performance as COPY, but it allows to use an auto prepared statement. So it's worth to go away from batching and take advantages of arrays.

@YohDeadfall YohDeadfall added enhancement New feature or request performance labels Jan 12, 2020
@roji
Copy link
Member

roji commented Jan 12, 2020

I think you mean this as an alternative to #113, right?

but it allows to use an auto prepared statement

Do you see this as some sort of advantage of array inserting over COPY? If so, can you explain? COPY doesn't require statements in general and is still faster than any sort of INSERT unless I'm mistaken.

However, if I'm understanding npgsql/npgsql#2779 (comment), correctly, you are able to get back all the auto-generated IDs of inserted rows, right? If so, this is definitely a big advantage over COPY, which AFAIK doesn't support this - and could indeed allow us to use array inserts as a replacement for the current batch insert mechanism.

@YohDeadfall
Copy link
Contributor Author

Yes, an alternative. The only one limitation of this method is that it doesn't not allow to insert columns which type is an array. To workaround this a record should be inserted as a composite which requires a DTO and mapping.

Another way is to wrap an array into a composite to have a jagged array. These composites should be generated during migration process.

@roji
Copy link
Member

roji commented Jan 13, 2020

This is starting to sound a bit complicated :) But maybe a temporary composite type would be OK... In any case, given the rareness of array types, even if we implement this only for non-array inserts that would still be a useful thing. Would you like to give this a try?

@YohDeadfall
Copy link
Contributor Author

Sure, but probably I'll need some guidance here.

@YohDeadfall YohDeadfall self-assigned this Jan 13, 2020
@roji
Copy link
Member

roji commented Jan 13, 2020

Sure thing. The service responsible for sending commands in the update pipeline is IModificationCommandBatchFactory (so in our case NpgsqlModificationCommandBatch) - start by looking at that.

Note that the update pipeline takes care of all update types (insert, update, delete), and operations are already ordered in a certain way that's important; for example, a certain insert might only be legal after a delete (because they have the same unique value), so you can't just reorder things in order to batch them more efficiently. The approach should probably be to selectively identify batches of contiguous inserts of the same type, and apply the optimization there. I'd also definitely exclude the case of array property insertion (which requires additional complexity), at least from a first attempt.

@roji roji added this to the Backlog milestone May 24, 2020
@YohDeadfall YohDeadfall removed their assignment Feb 17, 2021
@roji
Copy link
Member

roji commented Mar 23, 2022

Duplicate of #113

@roji roji marked this as a duplicate of #113 Mar 23, 2022
@roji roji closed this as completed Mar 23, 2022
@roji roji removed this from the Backlog milestone Mar 23, 2022
@roji roji removed enhancement New feature or request performance labels Mar 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants