test: re-enable hamt-test in interop #450

Closed
wants to merge 8 commits into from
1 change: 1 addition & 0 deletions packages/interop/package.json
@@ -96,6 +96,7 @@
"magic-bytes.js": "^1.8.0",
"multiformats": "^13.0.1",
"p-defer": "^4.0.0",
+"sinon": "^17.0.1",
"uint8arrays": "^5.0.1",
"wherearewe": "^2.0.1"
},
11 changes: 9 additions & 2 deletions packages/interop/src/verified-fetch-unixfs-dir.spec.ts
@@ -2,6 +2,7 @@
import { createVerifiedFetch } from '@helia/verified-fetch'
import { expect } from 'aegir/chai'
import { filetypemime } from 'magic-bytes.js'
+import sinon from 'sinon'
import { createKuboNode } from './fixtures/create-kubo.js'
import { loadFixtureDataCar } from './fixtures/load-fixture-data.js'
import type { VerifiedFetch } from '@helia/verified-fetch'
@@ -76,16 +77,22 @@ describe('@helia/verified-fetch - unixfs directory', () => {
})

// TODO: find a smaller car file so the test doesn't timeout locally or flake on CI
-describe.skip('HAMT-sharded directory', () => {
+describe('HAMT-sharded directory', () => {
before(async () => {
// from https://github.com/ipfs/gateway-conformance/blob/193833b91f2e9b17daf45c84afaeeae61d9d7c7e/fixtures/trustless_gateway_car/single-layer-hamt-with-multi-block-files.car
await loadFixtureDataCar(controller, 'bafybeidbclfqleg2uojchspzd4bob56dqetqjsj27gy2cq3klkkgxtpn4i-single-layer-hamt-with-multi-block-files.car')
})

it('loads path /ipfs/bafybeidbclfqleg2uojchspzd4bob56dqetqjsj27gy2cq3klkkgxtpn4i/685.txt', async () => {
-const resp = await verifiedFetch('ipfs://bafybeidbclfqleg2uojchspzd4bob56dqetqjsj27gy2cq3klkkgxtpn4i/685.txt')
+const onProgress = sinon.stub()
+const resp = await verifiedFetch('ipfs://bafybeidbclfqleg2uojchspzd4bob56dqetqjsj27gy2cq3klkkgxtpn4i/685.txt', { onProgress })
expect(resp).to.be.ok()
const text = await resp.text()
+const onProgressEvents = onProgress.getCalls().map(call => call.args[0])
+const walkEvents = onProgressEvents.filter((e) => e.type.includes('unixfs:exporter:walk'))
+const blockGetEvents = onProgressEvents.filter((e) => e.type === 'blocks:get:providers:get')
+expect(blockGetEvents).to.have.length(7)
+expect(walkEvents).to.have.length.lessThanOrEqual(8)
Comment on lines +87 to +95
Member Author

Note that I re-enabled this test on a branch, https://github.com/ipfs/helia/tree/test/re-enable-hamt-on-main, and it doesn't seem to fail there (previously it was only failing locally). However, the events don't show the issues seen in ipfs/service-worker-gateway#19 and ipfs/service-worker-gateway#18.

They both have onProgressEvents.length === 41 (which includes blockstore and verified-fetch events), so some event is not being fully surfaced from the walk without the changes here.
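
The per-type counting used in the test can be sketched without sinon. This is only an illustration, assuming progress events are plain objects with a string `type` field (which is how the test reads them). `recordEvents` and `countByType` are hypothetical helpers, and the sample events are invented rather than taken from a real run.

```typescript
// Minimal stand-in for the progress events the test inspects.
// Assumption: each event exposes a string `type`, as the test relies on.
interface ProgressEventLike {
  type: string
}

// Hypothetical helper: returns a callback plus the list it appends to,
// playing the role of sinon.stub() + getCalls() in the test.
function recordEvents (): { onProgress: (evt: ProgressEventLike) => void, events: ProgressEventLike[] } {
  const events: ProgressEventLike[] = []
  return {
    onProgress: (evt) => { events.push(evt) },
    events
  }
}

// Hypothetical helper: tally how many events of each type were recorded.
function countByType (events: ProgressEventLike[]): Record<string, number> {
  const counts: Record<string, number> = {}
  for (const evt of events) {
    counts[evt.type] = (counts[evt.type] ?? 0) + 1
  }
  return counts
}

// Illustrative data only; the type strings are examples, not a real trace.
const recorder = recordEvents()
recorder.onProgress({ type: 'blocks:get:providers:get' })
recorder.onProgress({ type: 'unixfs:exporter:walk:hamt-sharded-directory' })
recorder.onProgress({ type: 'blocks:get:providers:get' })

const counts = countByType(recorder.events)
const walkEvents = recorder.events.filter((e) => e.type.includes('unixfs:exporter:walk'))
```

Dumping a tally like `counts` from both branches would make it easier to see which event types differ when the totals match.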

// npx kubo@0.25.0 cat '/ipfs/bafybeidbclfqleg2uojchspzd4bob56dqetqjsj27gy2cq3klkkgxtpn4i/685.txt'
expect(text).to.equal(`Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nunc non imperdiet nunc. Proin ac quam ut nibh eleifend aliquet. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia curae; Sed ligula dolor, imperdiet sagittis arcu et, semper tincidunt urna. Donec et tempor augue, quis sollicitudin metus. Curabitur semper ullamcorper aliquet. Mauris hendrerit sodales lectus eget fermentum. Proin sollicitudin vestibulum commodo. Vivamus nec lectus eu augue aliquet dignissim nec condimentum justo. In hac habitasse platea dictumst. Mauris vel sem neque.

1 change: 1 addition & 0 deletions packages/unixfs/package.json
@@ -164,6 +164,7 @@
"@libp2p/interface": "^1.1.2",
"@libp2p/logger": "^4.0.5",
"@multiformats/murmur3": "^2.1.8",
+"err-code": "^3.0.1",

Member @achingbrain, Feb 26, 2024

Please use CodeError from @libp2p/interface instead of adding err-code
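
The suggested change could look roughly like the sketch below. To keep it self-contained, `CodeError` here is a local stand-in that assumes the `(message, code)` constructor shape of the class exported by @libp2p/interface; in the PR it would be imported from that package instead. `assertHasData` is a hypothetical helper used only to mirror the first check in findShardCid.

```typescript
// Local stand-in so the sketch runs on its own.
// Assumption: mirrors the (message, code) shape of CodeError in @libp2p/interface.
class CodeError extends Error {
  public readonly code: string

  constructor (message: string, code: string) {
    super(message)
    this.name = 'CodeError'
    this.code = code
  }
}

// Hypothetical helper mirroring the "no data in PBNode" check.
function assertHasData (data: Uint8Array | undefined): Uint8Array {
  if (data == null) {
    // with err-code this was: throw errCode(new Error('no data in PBNode'), 'ERR_NOT_UNIXFS')
    throw new CodeError('no data in PBNode', 'ERR_NOT_UNIXFS')
  }
  return data
}

// Callers can still branch on the error code, as they would with err-code.
let caughtCode: string | undefined
try {
  assertHasData(undefined)
} catch (err) {
  caughtCode = (err as { code?: string }).code
}
```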

"hamt-sharding": "^3.0.2",
"interface-blockstore": "^5.2.9",
"ipfs-unixfs": "^11.1.3",
147 changes: 147 additions & 0 deletions packages/unixfs/src/commands/utils/find-shard-cid.ts
@@ -0,0 +1,147 @@
import { decode, type PBLink, type PBNode } from '@ipld/dag-pb'
Member Author

All of this was copied over from ipfs-unixfs-exporter.

import { murmur3128 } from '@multiformats/murmur3'
import errCode from 'err-code'
import { Bucket, type BucketPosition, createHAMT } from 'hamt-sharding'
import { UnixFS } from 'ipfs-unixfs'
import type { ExporterOptions, ReadableStorage, ShardTraversalContext } from 'ipfs-unixfs-exporter'
import type { CID } from 'multiformats/cid'

// FIXME: this is copy/pasted from ipfs-unixfs-importer/src/options.js
const hashFn = async function (buf: Uint8Array): Promise<Uint8Array> {
return (await murmur3128.encode(buf))
// Murmur3 outputs 128 bit but, accidentally, IPFS Go's
// implementation only uses the first 64, so we must do the same
// for parity..
.slice(0, 8)
// Invert buffer because that's how Go impl does it
.reverse()
}

const addLinksToHamtBucket = async (links: PBLink[], bucket: Bucket<boolean>, rootBucket: Bucket<boolean>): Promise<void> => {
const padLength = (bucket.tableSize() - 1).toString(16).length
await Promise.all(
links.map(async link => {
if (link.Name == null) {
// TODO(@rvagg): what do? this is technically possible
throw new Error('Unexpected Link without a Name')
}
if (link.Name.length === padLength) {
const pos = parseInt(link.Name, 16)

bucket._putObjectAt(pos, new Bucket({
hash: rootBucket._options.hash,
bits: rootBucket._options.bits
}, bucket, pos))
return
}

await rootBucket.put(link.Name.substring(2), true)
})
)
}

const toPrefix = (position: number, padLength: number): string => {
return position
.toString(16)
.toUpperCase()
.padStart(padLength, '0')
.substring(0, padLength)
}

const toBucketPath = (position: BucketPosition<boolean>): Array<Bucket<boolean>> => {
let bucket = position.bucket
const path = []

while (bucket._parent != null) {
path.push(bucket)

bucket = bucket._parent
}

path.push(bucket)

return path.reverse()
}

export async function findShardCid (node: PBNode, name: string, blockstore: ReadableStorage, context?: ShardTraversalContext, options?: ExporterOptions): Promise<CID | undefined> {
if (context == null) {
if (node.Data == null) {
throw errCode(new Error('no data in PBNode'), 'ERR_NOT_UNIXFS')
}

let dir: UnixFS
try {
dir = UnixFS.unmarshal(node.Data)
} catch (err: any) {
throw errCode(err, 'ERR_NOT_UNIXFS')
}

if (dir.type !== 'hamt-sharded-directory') {
throw errCode(new Error('not a HAMT'), 'ERR_NOT_UNIXFS')
}
if (dir.fanout == null) {
throw errCode(new Error('missing fanout'), 'ERR_NOT_UNIXFS')
}

const rootBucket = createHAMT<boolean>({
hashFn,
bits: Math.log2(Number(dir.fanout))
})

context = {
rootBucket,
hamtDepth: 1,
lastBucket: rootBucket
}
}

const padLength = (context.lastBucket.tableSize() - 1).toString(16).length

await addLinksToHamtBucket(node.Links, context.lastBucket, context.rootBucket)

const position = await context.rootBucket._findNewBucketAndPos(name)
let prefix = toPrefix(position.pos, padLength)
const bucketPath = toBucketPath(position)

if (bucketPath.length > context.hamtDepth) {
context.lastBucket = bucketPath[context.hamtDepth]

prefix = toPrefix(context.lastBucket._posAtParent, padLength)
}

const link = node.Links.find(link => {
if (link.Name == null) {
return false
}

const entryPrefix = link.Name.substring(0, padLength)
const entryName = link.Name.substring(padLength)

if (entryPrefix !== prefix) {
// not the entry or subshard we're looking for
return false
}

if (entryName !== '' && entryName !== name) {
// not the entry we're looking for
return false
}

return true
})

if (link == null) {
return
}

if (link.Name != null && link.Name.substring(padLength) === name) {
return link.Hash
}

context.hamtDepth++

const block = await blockstore.get(link.Hash, options)
node = decode(block)

return findShardCid(node, name, blockstore, context, options)
}
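
The prefix handling in findShardCid can be illustrated in isolation. The sketch below assumes a fanout of 256 and that the bucket table size equals the fanout, which gives a pad length of 2. `splitLinkName` is a hypothetical helper and the link names are made up; `toPrefix` repeats the function from the file above.

```typescript
// Assumption for illustration: a shard with fanout 256.
const fanout = 256

// Bits consumed per HAMT level, as passed to createHAMT in the file above.
const bits = Math.log2(fanout)

// Width of the hex prefix on each link name: (256 - 1).toString(16) is 'ff'.
const padLength = (fanout - 1).toString(16).length

// Same logic as toPrefix in the file above.
const toPrefix = (position: number, padLength: number): string => {
  return position
    .toString(16)
    .toUpperCase()
    .padStart(padLength, '0')
    .substring(0, padLength)
}

// Hypothetical helper: split a link name into bucket prefix and entry name.
// An empty entry name means the link points at a sub-shard, not an entry.
const splitLinkName = (linkName: string, padLength: number): { prefix: string, entryName: string } => ({
  prefix: linkName.substring(0, padLength),
  entryName: linkName.substring(padLength)
})

// Made-up link names, not taken from the fixture CAR file.
const entryLink = splitLinkName('0A685.txt', padLength)
const subShardLink = splitLinkName('F3', padLength)
const prefixForPosition10 = toPrefix(10, padLength)
```

A link matches when its prefix equals the computed prefix and its entry name is either empty (descend into the sub-shard) or equal to the requested name (return its hash).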
18 changes: 13 additions & 5 deletions packages/unixfs/src/commands/utils/resolve.ts
@@ -4,6 +4,8 @@ import { DoesNotExistError, InvalidParametersError } from '../../errors.js'
import { addLink } from './add-link.js'
Member Author

All of the changes in this file originally come from @aschmahmann in #448, except that I copied the contents of ./find-shard-cid.js from ipfs-unixfs-exporter.

import { cidToDirectory } from './cid-to-directory.js'
import { cidToPBLink } from './cid-to-pblink.js'
+import { findShardCid } from './find-shard-cid.js'
+import type { PBNode } from '@ipld/dag-pb/interface'
import type { AbortOptions } from '@libp2p/interface'
import type { Blockstore } from 'interface-blockstore'
import type { CID } from 'multiformats/cid'
@@ -32,6 +34,12 @@ export interface ResolveResult {
segments?: Segment[]
}

+const findLinkCid = (node: PBNode, name: string): CID | undefined => {
+const link = node.Links.find(link => link.Name === name)
+
+return link?.Hash
+}
+
export async function resolve (cid: CID, path: string | undefined, blockstore: Blockstore, options: AbortOptions): Promise<ResolveResult> {
if (path == null || path === '') {
return { cid }
@@ -61,11 +69,11 @@ export async function resolve (cid: CID, path: string | undefined, blockstore: B
} else if (result.type === 'directory') {
let dirCid: CID | undefined

-for await (const entry of result.content()) {
-if (entry.name === part) {
-dirCid = entry.cid
-break
-}
+if (result.unixfs?.type === 'hamt-sharded-directory') {
+// special case - unixfs v1 hamt shards
+dirCid = await findShardCid(result.node, part, blockstore)
+} else {
+dirCid = findLinkCid(result.node, part)
}

if (dirCid == null) {
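
The non-sharded branch in this hunk reduces to an exact name match over the node's links, which can be sketched as follows. `LinkLike` and `NodeLike` are simplified stand-ins for PBLink and PBNode, with the link hash reduced to a string so the example needs no dependencies; the sample node is invented.

```typescript
// Simplified stand-ins for PBLink / PBNode (assumption: Hash is a string here,
// whereas the real types use a CID).
interface LinkLike {
  Name?: string
  Hash: string
}

interface NodeLike {
  Links: LinkLike[]
}

// Same shape as findLinkCid in the diff: first link whose Name matches exactly.
const findLinkHash = (node: NodeLike, name: string): string | undefined => {
  const link = node.Links.find(link => link.Name === name)

  return link?.Hash
}

// Invented directory node for illustration.
const node: NodeLike = {
  Links: [
    { Name: 'a.txt', Hash: 'cid-a' },
    { Name: 'b.txt', Hash: 'cid-b' }
  ]
}

const found = findLinkHash(node, 'b.txt')
const missing = findLinkHash(node, 'c.txt')
```

A direct match like this does not work for a HAMT shard, because link names there carry a hex bucket prefix and entries may live in sub-shards, which is why that case is delegated to findShardCid.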