aschmahmann changed the topic of #ipfs to: Heads Up: To talk, you need to register your nick! Announcements: go-ipfs 0.7.0 and js-ipfs 0.52.3 are out! Get them from dist.ipfs.io and npm respectively! | Also: #libp2p #ipfs-cluster #filecoin #ipfs-dev | IPFS: https://github.com/ipfs/ipfs | Logs: https://view.matrix.org/room/!yhqiEdqNjyPbxtUjzm:matrix.org/ | Forums: https://discuss.ipfs.io | Code of Conduct: https://git.io/vVBS0
Arwalk has quit [Read error: Connection reset by peer]
JonOsterman has quit [Read error: Connection reset by peer]
JonOsterman has joined #ipfs
JonOsterman has quit [Max SendQ exceeded]
JonOsterman has joined #ipfs
unnamed55355 has joined #ipfs
JonOsterman has quit [Max SendQ exceeded]
royal_screwup213 has quit [Quit: Connection closed]
royal_screwup213 has joined #ipfs
JonOsterman has joined #ipfs
royal_screwup213 has quit [Ping timeout: 246 seconds]
lawid has quit [Remote host closed the connection]
lawid has joined #ipfs
jesse22_ has joined #ipfs
jesse22 has quit [Ping timeout: 276 seconds]
FayneAldan[m] has joined #ipfs
royal_screwup213 has joined #ipfs
JonOsterman has quit [Remote host closed the connection]
JonOsterman has joined #ipfs
<Discordian[m]>
Welcome Fayne Aldan, this room is significantly more active 😀.
<FayneAldan[m]>
Oh.
arcatech has joined #ipfs
Jad has joined #ipfs
<proletarius101>
`ipfs files ls` lists the directories. But it seems `ipfs filestore ls` only lists the files? Any ways to retrieve the cid of the directory?
hsn has quit [Remote host closed the connection]
hsn has joined #ipfs
<Discordian[m]>
Filestore has no concept of directories; it's the layer that redirects block reads to the original files on disk instead of stored copies. To get the CID of a directory, either make note of it when you add it, or add it to MFS afterwards for easy human-readable tracking, proletarius101.
mowcat has quit [Remote host closed the connection]
<Discordian[m]>
You could also just re-add the directory again, IPFS will deduplicate it anyways.
<proletarius101>
<Discordian[m] "You could also just re-add the d"> it's huge (in the google cloud as I asked yesterday) so costly...
<Discordian[m]>
Ah ... yeah, you have to take note of CIDs; they're the addresses of the content
<Discordian[m]>
MFS is useful for giving CIDs a human readable name/path
<proletarius101>
<Discordian[m] "Filestore has no concept of dire"> So it seems not possible to add them in-batch? in `ipfs filestore ls` there are the filestore dir tagged
<proletarius101>
it seems I have to add them using a script... right? but that's fine
<Discordian[m]>
Dir tagged? I don't think I've noticed that. Filestore lists blocks, the index 0 ones are the "first block".
<Discordian[m]>
You can add a directory using `ipfs add <dir>`
<Discordian[m]>
* You can add a directory using `ipfs add --nocopy <dir>`
<Discordian[m]>
After the add, it'll give you a CID of the dir
<Discordian[m]>
<proletarius101 "it seems I have to add them usin"> Oh to MFS after? Yeah
royal_screwup213 has quit [Quit: Connection closed]
<Discordian[m]>
`ipfs-sync` does that automatically. You point it to a dir to sync, it adds it to IPFS, adds it to MFS, syncs with IPNS, then pins it.
royal_screwup213 has joined #ipfs
<proletarius101>
<Discordian[m] "Dir tagged? I don't think I've n"> e.g. `bafkreibhuhvbsmgieiix*** 262*** dir/filename.file 135***`
<Discordian[m]>
Oh yeah, that's just the local path so it can find the block data there
<proletarius101>
<Discordian[m] "You can add a directory using `i"> that effectively reads all file to compute the hashes, which cost something, and is so slow...
<Discordian[m]>
Yeah, still in batch though, it's faster than 1 at a time
JonOsterman has quit [Remote host closed the connection]
<Discordian[m]>
Trust me on that one, ipfs-sync is one at a time, and it's much slower as a result.
<Discordian[m]>
First sync should be batch, at least
<proletarius101>
<Discordian[m] "Yeah, still in batch though, it'"> Not in batch afaik? and for 50GB it takes 1h
<Discordian[m]>
add does it in batch if you point it to a dir. 50GB in 1hr tbh is pretty good
JonOsterman has joined #ipfs
<proletarius101>
<Discordian[m] "add does it in batch if you poin"> well, then probably i should adjust my expectation lol
<Discordian[m]>
FWIW the default DB is slow, BadgerDB is much faster, but RAM hungry
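A sketch of opting into Badger, assuming a brand new repo (an existing repo would need a datastore conversion, e.g. with ipfs-ds-convert):

```sh
# Initialize a new repo with the Badger datastore profile instead of the
# default flatfs datastore.
ipfs init --profile=badgerds

# Inspect the resulting datastore configuration.
ipfs config show
```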
<CSDUMMI[m]>
Just a little idea that I wanted to know if it would be possible, useful and practical to implement:
<CSDUMMI[m]>
* Just a little idea about how to do documentation on IPFS; I wanted to know if it would be possible, useful and practical to implement:
<kallisti5[m]>
<Discordian[m] "FWIW the default DB *is* slow, B"> After my experiences... I think the default flat is better for "sources of big data", and badger is better for nodes simply pinning content
<kallisti5[m]>
BadgerDB keeping so much in memory, I had some hard-to-solve corruption which resulted in me pretty much having to start over
<Discordian[m]>
Ah makes sense
<Discordian[m]>
The DB backend could use some love
<Discordian[m]>
MongoDB when?
<jwh>
stopppp
<proletarius101>
<kallisti5[m] "After my experiences... I think "> How are they different? They all store copies, right?
<jwh>
but if we're suggesting backends, s3 :D
<Discordian[m]>
<jwh "but if we're suggesting backends"> S3 has a backend :D Totally an option
<kallisti5[m]>
<proletarius101 "How are they different? They all"> badgerdb keeps a bunch of data in ram. If you suffer a sudden powerloss or crash, it can corrupt your on-disk database of CID's
<jwh>
would actually be nice if I didn't have to duplicate content that I already have in my minio cluster, much duplication everywhere
<jwh>
ohhhh
<jwh>
nice, thanks!
<proletarius101>
<kallisti5[m] "badgerdb keeps a bunch of data i"> That's unexpected. Thanks for that info
<kallisti5[m]>
What made it really painful was that IPFS was chugging along happily. Essentially I had CIDs on disk which didn't actually contain the data they said they did.
<kallisti5[m]>
If you tried to pin my content, stuff just hung indefinitely once it hit those CIDs. Externally IPFS said it didn't have those chunks... while internally my IPFS node said it did
<Discordian[m]>
You can break filestore for some interesting errors too, I've been meaning to dive in and see if I can make it a bit easier to use
<kallisti5[m]>
One big issue is progress reporting on pinning. --progress gives some really unhelpful information
<kallisti5[m]>
just Fetched/processed: XXXX nodes
<kallisti5[m]>
if it runs into a missing chunk, it just hangs indefinitely with that message.
<jwh>
Discordian: oh hm, that isn't a "raw" store right? being able to serve up a fixed addr, probably ipns, with a given bucket would be nice, then it's pretty easy to serve up existing content - would probably have to index it or something though to calculate CIDs
<Discordian[m]>
<jwh "Discordian: oh hm, that isn't a "> Not 100% sure what you mean by raw, but it'll store the blocks in there. I'm not 100% sure if sharing works ... might need to experiment ;)
<Discordian[m]>
<kallisti5[m] "if it runs into a missing chunk,"> Ak, I hate it when pinning fails. I'm thinking of turning off pinning as a default for ipfs-sync. Adding to MFS works fine
<kallisti5[m]>
It would be nice to show "[XXXX of XXXX] Searching for chunk (CID)..." , "[XXXX of XXXX] Getting chunk (CID)...", etc
<kallisti5[m]>
that would give you some input into what's happening. (I know it gets multiple chunks at a time... but all solvable in a cli)
<Discordian[m]>
A super verbose option, basically
<Discordian[m]>
I have a super verbose option for ipfs-sync
<Discordian[m]>
It's nice to get the data dump during debugging
Arwalk has quit [Read error: Connection reset by peer]
<kallisti5[m]>
eh. I'd argue folks want to see more about what IPFS is doing... not less
<kallisti5[m]>
I mean... look at docker pull's ui
<jwh>
well like, serve up an existing store (kinda like how the raw leaves stuff works), so if you have an existing bucket it would be possible to index the objects and serve those up under a given key, in theory of course, in reality it's probably a nightmare
<Discordian[m]>
True, there does seem to be some confusion as to exactly what certain things do
<Discordian[m]>
Lol imagine if IPFS made a MASSIVE list of every single block it was going to find, then assaulted your terminal refreshing it
<kallisti5[m]>
ipfs doesn't even need to show the progress... just that it's doing it
<Discordian[m]>
I see what you mean though, could be pretty nice
<kallisti5[m]>
i've never seen more than 10 or 15 chunks under "want"
<Discordian[m]>
<jwh "well like, serve up an existing "> Unfortunately I don't understand buckets too well (never actually used them yet)
<Discordian[m]>
Last time we were going to, we used MFS
<kallisti5[m]>
so... theoretically there's a limit to how parallel it is?
<Discordian[m]>
I'm not sure how it'd determine it tbh
<Discordian[m]>
I know in terms of CPU resources, it'll automatically scale to your CPUs
Arwalk has joined #ipfs
<Discordian[m]>
As for IO, not sure how it decides how much to do at once
<jwh>
<Discordian[m] "Unfortunately I don't understand"> really, buckets are just a directory under which some files live, except they're referenced by object ids because its an http api rather than a real filesystem, simplistic view of them, but thats really what they are on the surface
<Discordian[m]>
Are the IDs random?
<Discordian[m]>
Sounds like IPFS without CID indexing
<jwh>
yeah, but only in the same way an inode is
<Discordian[m]>
I see how that could get wonky
<jwh>
so like, if you request a CID that is actually a directory, ipfs knows what lives under that
<Discordian[m]>
That's basically what I thought, I guess I figured there'd be some easy way to have multiple IPFS nodes use the same data uh ... maybe ipfs-cluster could help out?
<jwh>
similar idea
<jwh>
a tree based layout is obviously a logical choice, which is why most things do that
<Discordian[m]>
It is duplication
<jwh>
yeah, but still kinda need to duplicate it, I'm trying to find the right word
<Discordian[m]>
So not wrong hmm
<Discordian[m]>
A little redundancy isn't bad ;p
<jwh>
stateless sharing of content I guess
<Discordian[m]>
Like check if content was already indexed?
<jwh>
you'd just need to make sure that CIDs etc were generated for the content via some indexing method
<Discordian[m]>
Err
<Discordian[m]>
Added to IPFS?
<Discordian[m]>
Yeah on ipfs-sync I do that with LevelDB
<jwh>
well more in the same way you'd let a web server generate an index for an existing directory, but you'd have to index the content
<Discordian[m]>
But it's not setup to coordinate with other nodes (yet)
<Discordian[m]>
Oh like MFS?
<jwh>
sort of
<Discordian[m]>
Oh like ... route to the node with the content, each node only has the content they're serving, other nodes don't have the same pieces?
<jwh>
I guess maybe an s3 backed mfs sort of thing
<Discordian[m]>
<jwh "I guess maybe an s3 backed mfs s"> Ah yeah
<Discordian[m]>
Could put all data on a networked storage drive. Use filestore, and then your blocks would be small.
<Discordian[m]>
Doesn't work for pins
<Discordian[m]>
That way you only have 1 copy of the data you want to serve
<Discordian[m]>
Only duplicating blocks/hashes
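A sketch of that setup, assuming the data sits on a shared mount at a hypothetical /mnt/shared and the experimental filestore is acceptable:

```sh
# Enable the experimental filestore so blocks reference the original files
# instead of copying them into the repo.
ipfs config --json Experimental.FilestoreEnabled true

# Add the shared data by reference; only block metadata lands in the repo.
ipfs add -r --nocopy /mnt/shared/data

# Confirm the filestore entries point back at the shared path.
ipfs filestore ls | head
```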
<jwh>
like, if I have a bunch of data already on an s3 bucket, would be nice to be able to point ipfs at it and say "watch this bucket and generate and update CIDs and ipns key for it"
<Discordian[m]>
So basically, if filestore worked with S3...
<jwh>
hm I guess
<Discordian[m]>
URLstore can store URLs, can S3 be used with an HTTP API?
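If the objects are publicly readable over plain HTTP, the experimental urlstore can reference them by URL; a sketch with a hypothetical object URL:

```sh
# Enable the experimental urlstore, which stores blocks as references to
# HTTP(S) URLs instead of local data.
ipfs config --json Experimental.UrlstoreEnabled true

# Add an object by its URL (hypothetical bucket/object shown).
ipfs urlstore add https://example-bucket.s3.amazonaws.com/packages/foo.pkg
```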
<jwh>
I mean, you can mount s3 buckets as a "filesystem" using fuse these days
<Discordian[m]>
Ah, fuse mount might be "best"
<jwh>
yeah but it needs to be aware of the s3 api
<jwh>
it's a bit weird, because Amazon came up with it
<Discordian[m]>
Oh man, using Amazon's things is so weird
<Discordian[m]>
Weird but powerful
<kallisti5[m]>
I'm FUSE-mounting Haiku's S3 buckets to `ipfs add` them
<kallisti5[m]>
It's slow as shit, but works
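Roughly what that looks like with s3fs-fuse, using a hypothetical bucket name and credentials file:

```sh
# Mount the bucket as a filesystem (slow, but it works), then add the
# mounted tree by reference so the data isn't copied into the repo.
s3fs my-bucket /mnt/my-bucket -o passwd_file=${HOME}/.passwd-s3fs
ipfs add -r --nocopy /mnt/my-bucket
```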
<Discordian[m]>
Filestore, or duplicating?
<jwh>
guess something to watch the directory and add changed/new files might work, or does ipfs already have support for that now?
<kallisti5[m]>
I'm just using dumb shell scripts to look for builds, and add them if ipfs doesn't have them yet
<jwh>
lets see
<Discordian[m]>
Actually, I'll look at it a bit tonight, maybe tomorrow. See if I can push an update or 2 out.
<kallisti5[m]>
all of our CI/CD uploads to S3. Then I pull down and add to ipfs
<Discordian[m]>
* Actually, I'll look at it a bit tonight, maybe tomorrow. See if I can push a commit or 2 out, maybe an update.
<Discordian[m]>
Okay neat
<kallisti5[m]>
then re-publish using our private key and ipfs files flush
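A rough sketch of that pipeline with hypothetical bucket, paths and key name (the pull step is shown with `aws s3 sync` as one possibility):

```sh
# Pull the latest builds down from S3.
aws s3 sync s3://example-builds /srv/builds

# Add by reference; -Q prints only the root CID.
CID=$(ipfs add -Q -r --nocopy /srv/builds)

# Refresh the MFS entry and flush it.
ipfs files rm -r /builds 2>/dev/null || true
ipfs files cp /ipfs/$CID /builds
ipfs files flush /builds

# Re-publish under the project's IPNS key.
ipfs name publish --key=builds-key /ipfs/$CID
```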
<Discordian[m]>
Stable, lots of storage needed tho
<kallisti5[m]>
Discordian: it would be *NICE* if ipfs-sync could accept S3 credentials / endpoints and do something similar :-D
<jwh>
yeah, that was going to be my test case: publish our existing minio-backed package repos with ipfs, without duplicating the whole repo everywhere
<Discordian[m]>
Hey I'm going to make an issue for that
JonOsterman has quit [Remote host closed the connection]
<Discordian[m]>
I wonder if just `go install` would work fine
<jwh>
it could be a bit more useful in its output, by hinting at what actually needs to be done (if that's building some go binary locally and telling it where to look etc)
<jwh>
presumably somewhere in PATH will do
<Discordian[m]>
But yeah I agree, does lack instructions
<jwh>
should be pretty easy for that issue to be resolved by just adding an alpine builder (not using alpine here but musl is musl)
<Discordian[m]>
Might be worth showing your interest on that issue, or opening one about the error message.
<Discordian[m]>
Ah alright
<Discordian[m]>
Could submit a patch ;p
<Discordian[m]>
I'm not Alpine experienced, and heard of musl today
<kallisti5[m]>
tl;dr: musl is a much smaller libc. Generally, applications (even Go applications, despite their static nature) have to be compiled on a musl platform to work on musl platforms.
<kallisti5[m]>
I think there is a go cross target for musl
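The two usual approaches, sketched with a generic `go build` (project name and layout assumed):

```sh
# Pure-Go programs: with cgo disabled the binary doesn't link against glibc,
# so it also runs on musl systems such as Alpine.
CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build .

# If the program needs cgo, build against musl explicitly instead
# (requires musl-gcc to be installed).
CC=musl-gcc CGO_ENABLED=1 go build .
```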
<jwh>
judging from the comments it's all in hand, just needs effort, which a patch probably won't really help with as they'll need to set up build infra
<Discordian[m]>
<jwh "judging from the comments its al"> There's talk on there for adding a bounty though, must be able to be opened to the public
<Discordian[m]>
tbh I agree, full Go binaries for this seems overkill
<Discordian[m]>
But whatever works I guess (at the cost of issues like exactly this tho)
yoav_ has joined #ipfs
yoav_ has quit [Remote host closed the connection]
<jwh>
anyway yeah, had to install all the binaries and it's ok now (except `make install` isn't a valid target anymore)
<Discordian[m]>
Well at least that's a success :D
koo555 has joined #ipfs
<jwh>
yeah
<Discordian[m]>
So, great question ... if you remove the file, blocks are removed on GC, yes. IF the CID isn't referenced in MFS or pins.
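A quick way to check whether a root CID is still referenced before a GC run, using a hypothetical CID and MFS path:

```sh
# A block survives `ipfs repo gc` only while something still references it:
# a pin, or an entry reachable from MFS.
ipfs pin ls --type=recursive | grep <root-cid>   # is it pinned?
ipfs files stat --hash /my-dataset               # is it reachable from MFS?

# If neither holds, the next GC run drops the blocks.
ipfs repo gc
```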
<jwh>
ok now that's done, lets see how well filestore works
<jwh>
what happens when the underlying file disappears though, does ipfs do garbage collection and remove missing content, or does it just return not found
<jwh>
which might suck if it is in the list
<jwh>
~20 mins to hash these packages :(
<jwh>
hm
<Discordian[m]>
This is really only a problem if you delete the file, and then add it somewhere else immediately, unchanged.
<Discordian[m]>
You can get wonky errors, otherwise should be totally fine
<Discordian[m]>
The errors are also fixable, just annoying
kiltzman has quit [Ping timeout: 248 seconds]
<jwh>
yeah, guess I'll find out soon enough
<Discordian[m]>
Soon ipfs-sync will be able to hunt down bad blocks selectively and nuke them. Right now it can do a full scan for them, but it takes a long time on certain setups
<Discordian[m]>
I think I have most of that work done IIRC, been a month since I've looked at it though
kiltzman has joined #ipfs
rodolf0 has joined #ipfs
rodolf0 has quit [Client Quit]
safe has joined #ipfs
JonOsterman has quit [Remote host closed the connection]
JonOsterman has joined #ipfs
JonOsterman has quit [Remote host closed the connection]
JonOsterman has joined #ipfs
Arwalk has quit [Read error: Connection reset by peer]
Arwalk has joined #ipfs
drathir_tor has quit [Ping timeout: 240 seconds]
Jad has quit [Quit: Benefits I derive from freedom are largely the result of the uses of freedom by others, and mostly of those uses of freedom that I could never avail myself of.]
drathir_tor has joined #ipfs
JonOsterman has quit [Remote host closed the connection]
JonOsterman has joined #ipfs
<kallisti5[m]>
ugh. Think I'm hitting the "no directory sharding" issue
<kallisti5[m]>
trying to enter a directory with 6261 files times out
Arwalk has quit [Read error: Connection reset by peer]
<Discordian[m]>
I think you can enable it
<Discordian[m]>
Lemme check my "big dir"
<Discordian[m]>
Mine has 19.3k files, no sharding
<kallisti5[m]>
wha?
<kallisti5[m]>
so sharding no longer required?
<Discordian[m]>
It is, I just don't hit the limit at 19.3k yet
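In go-ipfs of this era, sharded directories were an experimental, repo-wide opt-in (later releases shard large directories automatically past a threshold); a sketch assuming that flag:

```sh
# Opt in to HAMT-sharded directories.
ipfs config --json Experimental.ShardingEnabled true

# Restart the daemon, then re-add the large directory so it gets sharded.
ipfs add -r --nocopy /data/huge-dir
```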