Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Chunk.Indexed extends IndexedSeq #967

Closed
wants to merge 7 commits into from

Conversation

sideeffffect
Copy link
Contributor

@sideeffffect sideeffffect commented Jan 3, 2025

UPDATE:
Chunk.Indexed now extends IndexedSeq


It probably makes sense for Chunk to extend IndexedSeq. That in turn requires to extend IndexedSeqOps.

The potential con is that more mixins may make it more difficult for the JIT compiler to optimize the code.

Context

What do you think, is it worth it?

@hearnadam
Copy link
Collaborator

I'm fairly confident this is not worth doing. First of all, Chunk is not a standard IndexedSeq - see Append. Secondly, I think this will only give benefit to library authors who are generalizing over Seq as a collection... Users can safely treat Chunk as Indexed unless they have a heavily 'append' optimized workflow. In which case they can convert toIndexed if necessary.

I attempted to write a version of this with Chunk.Indexed extending IndexedSeq, and Chunk extending Seq. However, the methods end up needing to be written in a way that's hard to optimize (ie Indexed#append would need to return Indexed Chunk.

I am planning a PR to hand optimize some Chunk methods that should be pushed this week.

@sideeffffect
Copy link
Contributor Author

Ah, I see, fair enough...

@sideeffffect sideeffffect changed the title Chunk extends IndexedSeq Chunk.Indexed extends IndexedSeq Jan 3, 2025
@sideeffffect
Copy link
Contributor Author

I've played with it a little bit, maybe you'll find this helpful @hearnadam

@hearnadam
Copy link
Collaborator

My fear is that this leaves common patterns very susceptible to being pushed off the hot path.

Chunk.empty is an Indexed Chunk. If someone uses empty + appended to build a collection, they will end up reallocating a new underlying Array for each element being appended.

Is there a particular optimization we can get from IndexedSeq that I am missing?

@sideeffffect
Copy link
Contributor Author

Is there a particular optimization we can get from IndexedSeq that I am missing?

Not that I know of. What prompted me to do this is that I believe interoperability, especially via standard library, is very important. Like e.g. implementing the corresponding interfaces (IndexedSeq).

But that leaves us in a pickle. For something to be an IndexedSeq, the appended must also return an IndexedSeq.

If someone uses empty + appended to build a collection

One another, slightly more involved solution to this that I see is to

  • Make Chunk an IndexedSeq
  • Inside Chunk.Append, utilize good old Vector from the standard library, so that even Chunk.Append can implement efficient indexation

Now, that is a bigger rewrite.

Or we can just leave things as they are, not bothering with IndexedSeq... 🤷

@fwbrasil
Copy link
Collaborator

fwbrasil commented Jan 3, 2025

I'd prefer to avoid adding new interfaces to the implementation because of the potential performance implications. @sideeffffect in case you see concrete use cases for them, please open an issue elaborating the necessity so we can discuss it. I'm closing this for now, I hope it's ok.

@fwbrasil fwbrasil closed this Jan 3, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants