[WIP] Proposal: DHT in JS for effective content/service discovery #30
<!--Briefly describe the milestones/steps/work needed for this project-->

The focus of this effort is to build the DHT to function solely in Node.js. Running a DHT in browser is not currently viable.
Here's a stupid question from someone with little exposure to js-ipfs or the JS ecosystem at large. What is the benefit of building out functionality in Node.js as opposed to relying on delegate go-ipfs nodes? Does the value come from having multiple fully functional implementations? Is there a positive impact on in-browser nodes from delegating DHT functionality to a Node.js IPFS node instead of a go-ipfs node?
Not a stupid question. Both are viable options with pros and cons, and I think there is merit in both. I will expand more here when I have time to flesh this out, but I think there is significant work that needs to be done improving remote API access to Go nodes, and that should be a separate proposal. There is a lot to be gained there for new developers. The DHT in JS is relatively short term in comparison and immediately improves usability of the JS ecosystem of projects for existing developers (we currently have users leveraging JS and Go in hacky ways to get around some limitations in these systems: flexible IPLD in JS, performant DHT in Go).
proposals/dht-js.md
Outdated
**Current State**
- JS projects rely on delegate and preload nodes to be able to interact with the live network
- PL hosted delegate/preload nodes are often overloaded negatively impact performance of projects
@gmasgras this sounds like something that is soluble from an infra perspective. Can you comment on our current preload node uptime/utilization?
proposals/dht-js.md
Outdated
-->

**Current State**
- JS projects rely on delegate and preload nodes to be able to interact with the live network
"JS projects" here is not just the browser; it also covers Node.js, Electron, React Native, etc., i.e. web, desktop and mobile.
proposals/dht-js.md
Outdated
**Current State**
- JS projects rely on delegate and preload nodes to be able to interact with the live network
- PL hosted delegate/preload nodes are often overloaded negatively impact performance of projects
The requirement to preload content also makes JS projects very bandwidth heavy: every piece of data I add to a JS node is transferred to a preload node, which is very expensive in terms of transfer.
Using DHT delegates suffers from similar problems, except it's sort of worse - we re-publish every hour so content survives the timed garbage collection, which means re-uploading every block, which scales poorly.
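To put a rough number on the scaling concern above, here is a back-of-envelope sketch of the worst case, assuming the delegate has garbage collected everything before each re-publish so the full content must be uploaded again every interval. The function name and the 1 GiB figure are illustrative assumptions, not measurements from any real deployment.

```typescript
// Worst-case estimate only: assumes every re-publish re-uploads all content.
// `worstCaseReuploadCost` is a hypothetical helper, not part of any IPFS API.
interface RepublishCost {
  uploadsPerDay: number;
  bytesPerDay: number;
}

function worstCaseReuploadCost(contentBytes: number, intervalHours: number): RepublishCost {
  if (intervalHours <= 0) throw new Error("intervalHours must be positive");
  const uploadsPerDay = 24 / intervalHours;
  return { uploadsPerDay, bytesPerDay: contentBytes * uploadsPerDay };
}

// Example: 1 GiB of content, re-published hourly.
const oneGiB = 1024 ** 3;
const cost = worstCaseReuploadCost(oneGiB, 1);
console.log(cost.uploadsPerDay, cost.bytesPerDay / oneGiB); // logs: 24 24
```

Under these assumptions the upload volume grows linearly with the amount of content held, which is the sense in which the approach scales poorly.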
I'm not sure I understand why delegated routing should be slow/expensive or why preload nodes are necessary in non-browser use cases.
DHT publishing frequency with delegated routing should match what you'd expect with non-delegated routing, no?
Yes, the frequency should be the same, but as currently implemented with delegated routing, if a block has been garbage collected, you end up sending the data to the server each time for it to create the provider record: the whole DAG, not just the root block, since the delegate has to be able to supply everything on behalf of your node.

> why preload nodes are necessary in non-browser use cases

They're not, or at least they shouldn't be, if the DHT implementation were complete and performant, which it isn't. This proposal is to finish up the work we did during the hack week and make it so. Then we could turn off preload/delegate for everything that isn't a browser node.
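The "whole DAG, not just the root block" point can be illustrated with a small sketch that totals the bytes reachable from a root. The `Block` shape and `dagBytes` helper are hypothetical, using an in-memory map rather than any real IPFS or IPLD API.

```typescript
// Hypothetical in-memory model: each block has a size and links to child block ids.
interface Block {
  size: number;
  links: string[];
}

// Sum the sizes of all unique blocks reachable from rootId (iterative depth-first walk).
function dagBytes(rootId: string, blocks: Map<string, Block>): number {
  const seen = new Set<string>();
  const stack: string[] = [rootId];
  let total = 0;
  while (stack.length > 0) {
    const id = stack.pop() as string;
    if (seen.has(id)) continue; // shared sub-DAGs are counted once
    seen.add(id);
    const block = blocks.get(id);
    if (block === undefined) throw new Error(`missing block ${id}`);
    total += block.size;
    stack.push(...block.links);
  }
  return total;
}
```

In this model, a delegate that must serve content on a node's behalf needs `dagBytes(root, blocks)` bytes, whereas a provider record created by the node itself only needs to reference the root.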
With proper delegated routing, shouldn't I be making my own provider records that point at me instead of having ones that point at the preload nodes?
E.g. I ask the delegated router for the 20 closest peers then directly send them provider record puts.
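For readers unfamiliar with what "the 20 closest peers" means, here is a minimal sketch of Kademlia-style selection by XOR distance. It assumes the key and peer IDs are already hashed into equal-length byte arrays; the function names are hypothetical and this is not the js-libp2p API.

```typescript
// Inputs are assumed to be equal-length, already-hashed Kademlia keys.
type KadId = Uint8Array;

// Byte-wise XOR of two ids; the result is read as a big-endian unsigned integer.
function xorDistance(a: KadId, b: KadId): Uint8Array {
  if (a.length !== b.length) throw new Error("ids must have equal length");
  const out = new Uint8Array(a.length);
  for (let i = 0; i < a.length; i++) out[i] = a[i] ^ b[i];
  return out;
}

// Compare two distances byte by byte, most significant byte first.
function compareDistance(x: Uint8Array, y: Uint8Array): number {
  for (let i = 0; i < x.length; i++) {
    if (x[i] !== y[i]) return x[i] < y[i] ? -1 : 1;
  }
  return 0;
}

// Return the k peers whose ids are closest to the key (k = 20 as in the comment above).
function closestPeers(key: KadId, peers: KadId[], k = 20): KadId[] {
  return peers
    .map((id) => ({ id, dist: xorDistance(key, id) }))
    .sort((p, q) => compareDistance(p.dist, q.dist))
    .slice(0, k)
    .map((p) => p.id);
}
```

A node that can dial those peers directly could then send each of them a provider record pointing at itself, which is the flow described above.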
> E.g. I ask the delegated router for the 20 closest peers then directly send them provider record puts.

Yes, but this is a limitation in the browser, as we will not be able to dial most of the network.
Sure, but my comment was about non-browser nodes not needing to store their data on preload nodes.
Browser nodes have a very rough time talking to the rest of the network at the moment, so they need to do a lot more delegating of work. Does anything in this proposal really help the browser node situation?
proposals/dht-js.md
Outdated
**Current State**
- JS projects rely on delegate and preload nodes to be able to interact with the live network
- PL hosted delegate/preload nodes are often overloaded negatively impact performance of projects
The preload nodes also garbage collect the content periodically, so over time the content becomes inaccessible.
Sorry if I'm just totally missing the idea here, but why are preload nodes and garbage collection on them relevant in this discussion? Aren't they only really needed if a node is unreachable (e.g. in a browser)?
The point above is that the PL hosted infra is often overloaded, which is a risk to project functionality.
I was making the additional point that even if the nodes were not overloaded, they garbage collect anything uploaded to them, so the IPFS magic of 'add a file to your node, cat it from another node' is time-limited if the only way that content makes it to another node is via a preload node.
I'm confused. This doc states:

> The focus of this effort is to build the DHT to function solely in Node.js. Running a DHT in browser is not currently viable.

Putting data in preload nodes is not required to get Node.js to play well with the rest of the network.
Sharing early, need more time to finish this up.