Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ipfs object diff is buggy for links with the same name. #7044

Closed
kevincox opened this issue Mar 27, 2020 · 6 comments
Closed

ipfs object diff is buggy for links with the same name. #7044

kevincox opened this issue Mar 27, 2020 · 6 comments
Labels
help wanted Seeking public contribution on this issue kind/bug A bug in existing code (including security flaws)

Comments

@kevincox
Copy link

kevincox commented Mar 27, 2020

Version information:

% ipfs version --all
go-ipfs version: 0.4.23-6ce9a355f
Repo version: 7
System version: amd64/linux
Golang version: go1.14

Description:

TL;DR when diffing all links on the LHS are compared to the first link with the same name on the RHS.

All blocks in the LHS are apparently replaced with the first block of the RHS.
% </dev/urandom head -c $((4 * 256 * 1024)) >example
% ipfs add example
added QmaRtHG2gjqNNc8MZsQRUzWWXRijjcFXrkLB7iAfydZk5m example
% truncate -s $((2*256 * 1024)) example
% ipfs add example
added QmPZLgNZFMRwBBUsmiaafcdBJcioFjLaRhbHK2iQKJfjVv example
% ipfs object diff QmaRtHG2gjqNNc8MZsQRUzWWXRijjcFXrkLB7iAfydZk5m QmPZLgNZFMRwBBUsmiaafcdBJcioFjLaRhbHK2iQKJfjVv
~ QmQSPr6hYdmyB3Rc8fX4EVKEbDK22GfJFwjATHA2omW5Yu QmY6HJFqYmmDsxoSgPiLbGdzP2ao6YPtZgUm8Kh6FXFgdK ""
~ QmSR5DvV5jMCo3BYFHf1oA7gpJs6H8sbjEUFUAivpDUbsH QmY6HJFqYmmDsxoSgPiLbGdzP2ao6YPtZgUm8Kh6FXFgdK ""
~ QmR4Ke67KCEZLusG4HobvmhjZvvFesm6qWV1Y5dPx4VXK2 QmY6HJFqYmmDsxoSgPiLbGdzP2ao6YPtZgUm8Kh6FXFgdK ""
File reporting no diff.

The setup is to create a file with all blocks identical. In this example I created a file with two all-zero blocks.

Then you create a second file with the first block matching and anything else after that. Note that not even the number of blocks or the size of the blocks or files matter. As long as the first block matches ipfs object diff will report no diff.

% </dev/zero head -c $((2 * 256 * 1024)) >example
% ipfs add example                                                                                              
added QmeBAFpC3fbNhVMsExM8uS23gKmiaPQJbNu5rFEKDGdhcW example
% </dev/zero head -c $((1 * 256 * 1024)) >example
% </dev/urandom head -c $((1 * 256 * 1024)) >>example
% ipfs add example
added QmS1giEBzzWfHUujmVpraMu42d1t2yFAZoM9WAQ6H4Fyvs example
% ipfs object diff QmeBAFpC3fbNhVMsExM8uS23gKmiaPQJbNu5rFEKDGdhcW QmS1giEBzzWfHUujmVpraMu42d1t2yFAZoM9WAQ6H4Fyvs
% sha256sum <(ipfs cat QmeBAFpC3fbNhVMsExM8uS23gKmiaPQJbNu5rFEKDGdhcW) <(ipfs cat QmS1giEBzzWfHUujmVpraMu42d1t2yFAZoM9WAQ6H4Fyvs)
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855  /proc/self/fd/11
e40c8ebecc9097039429b24e7a3510706aff99555941f3d33bf4ae98fe35c3fd  /proc/self/fd/12

Suggestions.

  • Diff the Data field as well as the Links.
  • Use a proper diff algorithm that won't compare against the same RHS block twice.
    • Alternatively for anything that has blocks with the same name don't recurse, just report a difference.

Other concerns.

The output is ambiguous when the links have no names. Something should be done to make this explicit.

@kevincox kevincox added the kind/bug A bug in existing code (including security flaws) label Mar 27, 2020
@hsanjuan hsanjuan added the help wanted Seeking public contribution on this issue label Mar 27, 2020
@schomatis
Copy link
Contributor

I think we did some work toward fixing this. I can take another look at this but note we are aiming to deprecate the object API.

@schomatis schomatis self-assigned this Dec 23, 2021
@schomatis
Copy link
Contributor

Looking into this.

@schomatis
Copy link
Contributor

We indeed do not process correctly unnamed links, see
https://github.com/ipfs/go-merkledag/blob/437cdd83c8e85406ba782bb456bb654d09ec3d27/dagutils/diff.go#L119-L121

I'll need confirmation from someone in the team about:

  1. The object diff command is still in use (not going to be deprecated in the very short term) and a fix for this would be valuable.
  2. What would be the expected behavior of comparing two lists of unnamed links.

@kevincox
Copy link
Author

IDK if it is in use, but I was going to use it but since it doesn't work I found another solution. On the other hand if it doesn't work probably no one uses it. But it would be nice if there was a solution that did work.

@schomatis
Copy link
Contributor

Yes, the main issue is that we're planning on deprecating the object subcommand entirely, but I'm not sure the timeline on that or whether someone's currently depending on this to make it worthwhile investing more time here.

@lidel
Copy link
Member

lidel commented Jan 21, 2022

We did deprecate ipfs object * but some commands like diff have no replacement (yet).
My vote would be to invest time in adding ipfs dag diff (#4801) which works with all traversable codecs, rather than fixing this deprecated API that only works with dag-pb.

@lidel lidel closed this as completed Jan 21, 2022
@schomatis schomatis removed their assignment Jan 24, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Seeking public contribution on this issue kind/bug A bug in existing code (including security flaws)
Projects
None yet
Development

No branches or pull requests

4 participants