v0.4.0 - Unable to pin large directories #19
Comments
Sorry you're having this problem @sdockray What are the specs of the machines? (The go runtime and our use of it is way more expensive than it should be) |
I'm not sure what to put exactly, but the basics are below. Origin: Destination: |
The 65GB and 264GB of memory seems like virtual mem to me. i'm wondering how much RAM the machines actually have, |
On the origin dmidecode shows 4 slots like this one:
also
on the destination I unfortunately don't have root access and so the best i can do right now is:
|
There's a PR under way which should improve pinning quite a bit: ipfs/kubo#2384 |
That PR has been merged. @sdockray can you try on the new version, if this is still a problem? |
I have been using distributed ipfs versions, so I did the go install, with gx, etc. just now and tried again and within a couple minutes (before actually pinning anything) the pin crashes ipfs
I will keep trying with smaller large directories and see if there is some limit to the size or something |
@sdockray does go-ipfs give any interesting error log? What is the size of the file we are talking about? And before the |
the error log from the crash is 56M and i can't see anything particularly noteworthy in it, but i'm also not sure what i'm looking for.. at the beginning and end i didn't see anything eye-catching. here is a clip of the beginning and end.
and as for the size...
i tried to pin some of the large subdirectories (about 1G) inside the big target directory and they are recursively pinned just fine. |
Hi @sdockray, we've recently released go-ipfs 0.4.0 and with that a lot of perf and mem improvements came along. Could you check if you still have this issue with the latest version? |
updated: i was able to finish adding the directory - first pass got 99% of the contents added. then i kept getting "blockservice is closed" until i stopped the ipfs daemon and tried again. then it completed the whole directory. moving on to testing the pinning now.
Hi @diasdavid - all of my comments on and after March 26 were with 0.4.0 (before that they were with 0.4.0-dev). I tried (a week ago and again just now) to start over again (re-adding the large directory before trying to re-pin it) and unfortunately under 0.4.0 I can no longer add the directory as I consistently get the following error (but at completely different stages of the process, 1%, 3%, 99%):
So unfortunately, I am stuck trying to recursively add this large directory (which never gave me problems in 0.4.0-dev) before I can test pinning it. |
OK the pin failed again on a freshly added directory (407G) , running 0.4.0 everywhere.
outputs (after a few minutes):
the pin process and the daemon both exit. the rest of the error log is like the one above. |
@sdockray could I get the full output log from when it crashes? That would be very helpful. Also, a few things I'd like to have you try:
|
the full log is here: https://hostb.org/AW I haven't tried changing the Datastore.NoSync config item yet. I have set the IPFS_REUSEPORT env var |
I'm not sure what the cause of your specific failure is, but it looks like there's an issue in the bitswap provider connector. It's basically not working fast enough, and that's causing a buildup of extra goroutines all the way back up the line. I have an open PR ( ipfs/kubo#2437 ) that improves that codepath, so i will put effort into landing that sooner and report back |
It looks like @whyrusleeping managed to land ipfs/kubo#2437. Does it help with your issue? I'm also interested in pinning large directories. |
@spinda - I tried with version 0.4.2 and got the same behavior, but maybe it is in the 0.4.3-rc? I can't see the change specifically cited in the change log but it does mention that it "improves the performance of Storage, Bitswap, as well as Content and Peer Routing" so maybe? In the next few days, I'll try that version and report back |
The major improvements are in 0.4.3-rc1. |
thanks for the info.. will try and get it updated soon
% ipfs update --verbose install v0.4.3-rc1
fetching go-ipfs version v0.4.3-rc1
- using GOOS=linux and GOARCH=amd64
- fetching "/ipns/dist.ipfs.io/go-ipfs/v0.4.3-rc1/go-ipfs_v0.4.3-rc1_linux-amd64.tar.gz"
- using local ipfs daemon for transfer
- writing to /tmp/ipfs-update776635258/go-ipfs_v0.4.3-rc1_linux-amd64.tar.gz
- extracting binary to tempdir: /tmp/ipfs-update032086215/ipfs-new
binary downloaded, verifying...
- running init in '/dir/ipfs/update-staging/test495205521' with new binary
- checking new binary outputs correct version
install failed, reverting changes...
ERROR: version didnt match
|
That's a huge mistake on our part. We didn't increase the version of go-ipfs in the binary to 0.4.3-rc1 from 0.4.3-dev. For now you can try updating manually from https://dist.ipfs.io |
fyi, i downloaded the RC manually and
% tar xvfz go-ipfs_v0.4.3-rc1_linux-amd64.tar.gz
% mv go-ipfs/ipfs /dir/work/bin/
% ipfs version
ipfs version 0.4.3-dev
will just work with dev version for now |
It really is rc1, we just made a mistake and didn't update the name of the version in the code. |
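To illustrate why the update was rolled back: the log above shows a step that checks whether the new binary "outputs correct version". The following is only a minimal sketch of that kind of check, not the actual `ipfs update` code. The function names `parse_ipfs_version` and `version_matches` are hypothetical; the only inputs taken from this thread are the requested tag `v0.4.3-rc1` and the reported string `ipfs version 0.4.3-dev`.

```python
import re


def parse_ipfs_version(output: str) -> str:
    """Pull the version string out of text shaped like `ipfs version X`."""
    match = re.search(r"ipfs version (\S+)", output)
    if match is None:
        raise ValueError(f"unrecognised version output: {output!r}")
    return match.group(1)


def version_matches(requested_tag: str, reported: str) -> bool:
    """Compare a release tag (assumed to carry a leading 'v') with the
    version the binary reports about itself."""
    if requested_tag.startswith("v"):
        requested_tag = requested_tag[1:]
    return requested_tag == reported


# Values taken from the thread: the rc1 tarball reports itself as 0.4.3-dev.
reported = parse_ipfs_version("ipfs version 0.4.3-dev")
print(reported, version_matches("v0.4.3-rc1", reported))
```

Under these assumptions the comparison fails for the rc1 tarball even though the binary itself is the intended build, which is consistent with the "version didnt match" error.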
I've re-added everything and then tried to pin the root directory again and unfortunately am still having it fail
ipfs pin add QmevLmheGTxdZSQ9Vwwnb2SkY4n9SNFiznYWjRKY5z3KEK > ipfs_pin.log &
... after 30-60 minutes ...
Error: Post http://127.0.0.1:5001/api/v0/pin/add?arg=QmevLmheGTxdZSQ9Vwwnb2SkY4n9SNFiznYWjRKY5z3KEK&encoding=json&stream-channels=true: EOF |
@sdockray when that failure happens, does the daemon print anything? |
@whyrusleeping: No, the daemon exits without outputting anything |
As a general question, what is the largest directory anyone has pinned? Or would it have more to do with the number of items to be pinned (during recursion)? If you have about 1/2TB free you can test with the hash above (although it seems unlikely that you'll actually pull more than a few GB before it fails) |
@lgierth recently pinned the CCC archives, which i think was over 2TB. |
3.7 TB to be precise, but that was a plain add with the default of |
I think the problem here is |
I'm still having this issue with 0.4.5-dev . When trying to pin a large folder, ipfs uses up over 3 gigs of ram and is then killed by oom_killer. |
I guess this is pretty much ipfs/kubo#3318 |
I gave my system a ton of virtual ram and let my pin run overnight. It's still working on the pin ! And using 8.7 gigs of ram. I took the profiling stats which are at https://gateway.ipfs.io/ipfs//QmZmseT7n9MptemPcxkuWh9FeAPK7rPeTNMDiPQWTft4Qo . Writing them out was very slow. I had to forcibly kill and reboot the daemon to complete the add. |
I don't know go, but it seems notable that there are over 4500 goroutines in ipfs.stack that are |
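For anyone wanting to repeat this kind of count, here is a rough sketch that tallies goroutines by state in a Go stack dump. It assumes the standard Go traceback header format, `goroutine N [state, optional detail]:`. The function name `count_goroutine_states` and the sample dump are made up for illustration; they are not taken from the profile linked above.

```python
import re
from collections import Counter

# Matches headers such as "goroutine 18 [chan send, 3 minutes]:" and
# captures only the state ("chan send"), dropping any detail after a comma.
HEADER = re.compile(r"^goroutine \d+ \[([^\],]+)(?:, [^\]]*)?\]:")


def count_goroutine_states(dump_text: str) -> Counter:
    """Count goroutines per state in the text of a Go stack dump."""
    counts = Counter()
    for line in dump_text.splitlines():
        match = HEADER.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Invented sample, only to show the expected input shape.
SAMPLE = """\
goroutine 1 [running]:
main.main()

goroutine 18 [chan send, 3 minutes]:
example.worker()

goroutine 19 [chan send, 3 minutes]:
example.worker()

goroutine 20 [select]:
example.loop()
"""

for state, n in count_goroutine_states(SAMPLE).most_common():
    print(f"{n:6d}  {state}")
```

To run it against a real dump, pass the contents of a file such as `ipfs.stack` to `count_goroutine_states` instead of the sample string.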
This issue was moved to https://discuss.ipfs.io/t/v0-4-0-unable-to-pin-large-directories/512 |
I have been experimenting with adding a large directory on one node (origin) and then attempting to pin that directory on another node (destination). When I simply do:
ipfs pin add HASH
on the destination, few, if any, hashes get pinned, CPU and memory load are extremely high and ipfs eventually crashes.
As an alternative, I captured all of the hashes that came from the original add operation (one per line) and wrote this simple script to pin hashes one by one: https://github.com/sdockray/ipfs_pin_remote_hashes/blob/master/ipfs_pin_remote_hashes.sh
It is successful for a few hundred hashes (about 4-7gb). Along the way it will take several minutes, sometimes close to an hour for a single hash, and then eventually I start seeing
Error: pin: merkledag: not found
and the daemon has silently gone away.
The origin is Debian Jessie. I am still collecting data about the destinations but they are typically Debian as well, and all are running 0.4.0-dev. I don't see a whole lot in the logs that is very helpful.
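The one-by-one approach described above can be sketched roughly as follows. This is not the linked shell script, only an illustration of the same idea in Python. The only ipfs command it assumes is `ipfs pin add HASH` as used in this report. The names `ipfs_pin_add` and `pin_each` and the `max_consecutive_failures` cutoff are hypothetical additions; the cutoff stops the loop once errors start repeating, for example after the daemon has gone away.

```python
import subprocess
from typing import Callable, Iterable, List, Tuple


def ipfs_pin_add(hash_: str) -> Tuple[bool, str]:
    """Run `ipfs pin add <hash>` and return (succeeded, combined output)."""
    proc = subprocess.run(
        ["ipfs", "pin", "add", hash_], capture_output=True, text=True
    )
    return proc.returncode == 0, (proc.stdout + proc.stderr).strip()


def pin_each(
    hashes: Iterable[str],
    pin: Callable[[str], Tuple[bool, str]] = ipfs_pin_add,
    max_consecutive_failures: int = 3,
) -> Tuple[List[str], List[str]]:
    """Pin hashes one at a time; give up after repeated failures in a row."""
    pinned: List[str] = []
    failed: List[str] = []
    streak = 0
    for raw in hashes:
        h = raw.strip()
        if not h:
            continue  # skip blank lines in the hash list
        ok, output = pin(h)
        if ok:
            pinned.append(h)
            streak = 0
        else:
            failed.append(h)
            streak += 1
            print(f"failed to pin {h}: {output}")
            if streak >= max_consecutive_failures:
                break  # repeated errors; the daemon may no longer be running
    return pinned, failed


# Dry run with a stand-in pin function and placeholder hashes, so the
# example works without a running ipfs daemon.
def fake_pin(h: str) -> Tuple[bool, str]:
    return (h != "QmBad", "simulated error" if h == "QmBad" else "")


pinned, failed = pin_each(["QmA", "QmBad", "", "QmB"], pin=fake_pin)
print(pinned, failed)
```

With a real daemon, the hash list would come from the file captured during the original add (one hash per line), for example `pin_each(open("hashes.txt"))`, where `hashes.txt` is a placeholder filename.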