<TimMc>
so capped at about 10 vCPU and 46.5 GB RAM (but not *using* all of that)
<TimMc>
and adding vCPU across machine types, well...
<kentonv>
TimMc, almost all the VMs are in single-digit percent utilization of CPU. The system was designed to be a lot more scalable than it needed to be, I guess. >_>
<kentonv>
in fact self-hosted (single-machine) sandstorm on a beefy instance would probably have handled the load fine. Oops.
<simpson>
On GCE, it doesn't matter quite as much. I suppose it depends on what's on each machine.
<kentonv>
the gateway is a g1-small, the workers are n1-highmem-2, and the rest are n1-standard-1
<kentonv>
master could probably be reduced to g1-small and probably storage could too.
<kentonv>
but I worry about subtle performance loss
<kentonv>
we could also probably go to just one shell
<simpson>
Mm. Are you running full systemd? As I've containerized, I've found that that's actually one of the costs, and that there's been a modest savings from running more stuff on k8s.
<kentonv>
these are full VMs. Some of the things could maybe run in containers but the workers definitely can't since they do a lot of root-only stuff, like setting up nbd devices.
<simpson>
Mm, makes sense. It was only recently that I was able to get my Tahoe-LAFS storage servers off of VMs, and for similar reasons: Wiring up storage is non-trivial.
<mokomull>
ooh, nbd? that's kind of my life these days :)
<kentonv>
mokomull, yeah Blackrock makes heavy use of nbd in order to give each grain its own virtual volume that's actually maintained on the remote storage server.
<kentonv>
it's my favorite crazy systems hack
<mokomull>
haha you're in good company, though ... ISTR someone big was using Ceph via a userspace NBD translator too.
<kentonv>
nbd is basically fuse at the block layer
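A minimal sketch of that analogy in code, assuming the legacy ioctl interface from <linux/nbd.h> (illustrative only, not Blackrock's actual setup code; the function name, device path argument, and omitted error handling are all made up here):

    // Minimal sketch, not Blackrock's code: wire one end of a socketpair into
    // /dev/nbdN via the legacy ioctl interface in <linux/nbd.h>. A userspace
    // process holding the other end answers NBD read/write requests for the
    // block device, the same way a FUSE daemon answers filesystem calls.
    #include <fcntl.h>
    #include <stdint.h>
    #include <sys/ioctl.h>
    #include <sys/socket.h>
    #include <unistd.h>
    #include <linux/nbd.h>

    // Attaches devPath to a fresh socketpair; returns the server-side fd
    // (to be driven by whatever process serves the block data), or -1.
    int attachNbd(const char* devPath, uint64_t sizeBytes) {
      int sv[2];
      if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) return -1;

      int dev = open(devPath, O_RDWR);
      if (dev < 0) { close(sv[0]); close(sv[1]); return -1; }

      ioctl(dev, NBD_SET_SIZE, sizeBytes);   // advertise the device size
      ioctl(dev, NBD_SET_SOCK, sv[0]);       // hand the kernel its end

      // NBD_DO_IT blocks for the life of the connection, so it normally runs
      // in a dedicated thread or child process:
      //   ioctl(dev, NBD_DO_IT);

      return sv[1];
    }

The process holding the returned fd plays the FUSE-daemon role: it reads NBD request headers off the socket and answers them from wherever the bytes actually live, such as a remote storage server.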
<mokomull>
Do you preallocate a gajiggaton of /dev/nbd* devices, or are you using the netlink API?
<kentonv>
gajiggaton of devices
<kentonv>
didn't know you could use netlink for this
<mokomull>
it's relatively new
<kentonv>
I think I create 4096 devices at startup and then I have some code for locking them to grains.
<kentonv>
and it's really easy for devices to get permanently stuck, so I have some logic to route around stuck ones; it's gross
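A hedged sketch of what that pool-plus-routing scheme could look like (the 4096 figure comes from the conversation above; the modprobe parameter, the flock-based claim, and the stuck-device set are assumptions about how the locking and routing might be done, not the actual Blackrock code):

    // Sketch of a device pool: the nbd module is loaded with a large pool
    // (e.g. `modprobe nbd nbds_max=4096`), and each grain claims a free
    // device by taking an exclusive advisory lock on its node. Devices that
    // have gotten permanently stuck are remembered and routed around.
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/file.h>
    #include <unistd.h>
    #include <set>

    static std::set<int> stuckDevices;  // indices we've given up on

    // Returns an open, exclusively-locked fd for a free /dev/nbdN, or -1 if
    // the pool is exhausted. The flock is released automatically when the
    // holder exits, so a crashed worker doesn't leak the device forever.
    int claimNbdDevice() {
      for (int i = 0; i < 4096; i++) {
        if (stuckDevices.count(i)) continue;  // route around known-bad devices

        char path[32];
        snprintf(path, sizeof(path), "/dev/nbd%d", i);
        int fd = open(path, O_RDWR);
        if (fd < 0) continue;

        // LOCK_NB: skip devices already claimed by another grain.
        if (flock(fd, LOCK_EX | LOCK_NB) == 0) return fd;
        close(fd);
      }
      return -1;
    }

    // Called when teardown of a device fails; future claims will skip it.
    void markStuck(int index) { stuckDevices.insert(index); }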
<kentonv>
hahaha, before I even clicked I was wondering if the patch comes from Facebook
<kentonv>
sure enough
<mokomull>
kentonv: if you still end up with devices permanently stuck, I would absolutely love to hear about it. We've hit some of that after the blk-mq migration, because the kyber and deadline schedulers somehow manage to mess with request IDs enough to confuse nbd.
<kentonv>
(I talked to some people there who seemed excited about nbd recently)
<mokomull>
I am one of those people there :)
<kentonv>
oh hah
<mokomull>
although my crazy patchset hasn't progressed beyond the "rewrite it before you publish this or you're gonna get skewered" stage
<kentonv>
it's been years since I wrote the code, I'm sure it has gotten better
<mokomull>
There was quite the onslaught of XFS fixes when we started this, that's for sure
<kentonv>
mokomull, I've been using ext4 and it's been remarkably solid. I don't think I ever saw an instance of an unrecoverable volume or data loss caused by ext4, even though we disconnect mid-stream all the time.
<mokomull>
that might say some things about our design choices :)
<kentonv>
I'm sure you push a hell of a lot more bits though
<mokomull>
I don't even know how many bits anymore. It's kind of mindblowing.