azonenberg changed the topic of #scopehal to: libscopehal, libscopeprotocols, and glscopeclient development and testing | https://github.com/azonenberg/scopehal-apps, https://github.com/azonenberg/scopehal, https://github.com/azonenberg/scopehal-docs | Logs: https://freenode.irclog.whitequark.org/scopehal
<azonenberg> ooook so this is gonna be an adventure
<azonenberg> I'm going to try and record a second or more of activity on this DUT
<azonenberg> by booting it up with a trigger on the power input
<azonenberg> then sweeping trigger offset to grab 200ms or so at a time of activity
<azonenberg> If i'm lucky it will be a reproducible enough boot process i can reconstruct everything
<azonenberg> Gonna be multiple gigs of data i think
<miek> "use your sampling scope on any waveform with this one weird trick!"
juli965 has quit [Quit: Nettalk6 - www.ntalk.de]
<azonenberg> Also having signal integrity problems i need to work on...
<azonenberg> I didnt realize the buses were this fast
<azonenberg> i might have to switch to something else instead of the LA pod i have now
<azonenberg> or at least shorten my probes by a lot
<azonenberg> I would love a solder-in AKL-PT1 on maxwell for this
<azonenberg> lain: soooo i now have glscopeclient using 119.9 GB of RAM
<miek> welp
<azonenberg> Eleven waveforms, each 64M points on 7 channels
<azonenberg> A total of 2.8 *seconds* of data at 250 Msps
<azonenberg> That's 704M points per channel or nearly 5 G points of raw data
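(A quick sanity check of those figures; plain arithmetic, nothing glscopeclient-specific, using the sample counts quoted above:)

// Capture size quoted above: 11 waveforms of 64M points on 7 channels at 250 Msps
#include <cstdio>

int main()
{
    const double waveforms          = 11;
    const double points_per_wfm     = 64e6;
    const double channels           = 7;
    const double sample_rate        = 250e6;                             // samples/sec

    const double points_per_channel = waveforms * points_per_wfm;        // 704M
    const double total_points       = points_per_channel * channels;     // ~4.93G
    const double seconds_captured   = points_per_channel / sample_rate;  // ~2.8 s

    printf("%.0fM points/channel, %.2fG points total, %.2f s of data\n",
        points_per_channel / 1e6, total_points / 1e9, seconds_captured);
    return 0;
}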
<azonenberg> i'm now in the process of saving it to my NAS lol. The gig-e pipe has been saturated for quite a while
<azonenberg> In case you were wondering why i was planning to build a 10G-attached ceph SAN :p
<azonenberg> also RLE at the capture side would greatly reduce the overhead, since all the idle time between transactions would take far less memory if samples weren't emitted at regular intervals
<azonenberg> I might actually implement that even for the e.g. lecroy driver at some point
<azonenberg> only emit samples on the LA if the data has changed
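(A minimal sketch of that capture-side RLE idea, assuming boolean LA samples; the struct and function names here are illustrative, not the actual scopehal driver API:)

// Emit a (timestamp, value) pair only when the line toggles, instead of
// storing one sample per sample clock.
#include <cstdint>
#include <vector>

struct RleSample
{
    uint64_t timestamp;   // in sample clocks since the trigger
    bool     value;
};

std::vector<RleSample> CompressDigital(const std::vector<bool>& raw)
{
    std::vector<RleSample> out;
    for(uint64_t i = 0; i < raw.size(); i++)
    {
        // Only store a sample at t=0 or when the value changes
        if(out.empty() || raw[i] != out.back().value)
            out.push_back(RleSample{i, raw[i]});
    }
    return out;
}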
<lain> azonenberg: lol
<lain> :D
<azonenberg> (and it's still storing the file)
<azonenberg> The save file is 80 GB
<lain> now gzip it
<azonenberg> lolno
<lain> :D
<lain> I wonder if lz4 would do any good, it's quite fast
<azonenberg> I'm also wondering whether the original seagate SSD i was looking at is the best choice for this kind of work
<azonenberg> Seagate XP1920LE30002
<azonenberg> M.2 22110, 1920 GB, PCIe 3.0 x4, has power loss protection, but only rated for 0.3 DWPD
<azonenberg> Which is, to be fair, still 576 GB/day of writes for 5 years
<azonenberg> Also not suuuper fast, they claim sequential read/write of 2000 / 1200 MB/s
<azonenberg> But it's cheap for the capacity, $200 at newegg
<azonenberg> Something like the samsung 983 DCT would give me the same capacity and form factor but cost $412. For that price i get 0.8 DWPD endurance and 3000 / 1430 MB/s read/write
<azonenberg> or if i go all out, the intel dc P3608 is $1060, 3 DWPD, 5000/2000 MB/s read/write, in a hh/hl pcie form factor
<azonenberg> Assuming i have a single 10GbE pipe to the box, the aggregate bandwidth available from all four SSDs in one server will only be 1250 MB/s to the outside world so i feel like paying more for higher performance is unnecessary
<azonenberg> And endurance probably isn't a HUGE concern because even if i'm writing lots of big scope captures they'll be spread across multiple drives
<azonenberg> Ignoring write amplification issues (probably not a huge deal for massive linear file writes like storing waveform data), the seagate can do 576 GB/day of writes and if i split that across four drives (assuming 4 drives per node, 3x replication, 3 nodes)
<azonenberg> my actual write capacity over 5 years is 2304 GB/day across the array
Degi has quit [Ping timeout: 246 seconds]
<azonenberg> that's 28 of these giant datasets
<azonenberg> each day, every day
<azonenberg> lain: what do you think? i feel like there's little chance of me exceeding that
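(The endurance math spelled out; all assumptions are from the discussion above: 1920 GB drives rated 0.3 DWPD, 4 OSD drives per node, 3 nodes, 3x replication, ~80 GB per waveform dataset:)

#include <cstdio>

int main()
{
    const double per_drive_gb_day = 1920 * 0.3;        // 576 GB/day per drive
    const double total_drives     = 4 * 3;             // 12 drives in the cluster
    const double replication      = 3;                 // each logical write lands on 3 drives
    const double cluster_gb_day   = per_drive_gb_day * total_drives / replication;  // 2304
    const double dataset_gb       = 80;

    printf("%.0f GB/day per drive, %.0f GB/day usable across the cluster, ~%.1f datasets/day\n",
        per_drive_gb_day, cluster_gb_day, cluster_gb_day / dataset_gb);
    return 0;
}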
<sorear> do you actually need long term storage of oscilloscope traces and by extension do you actually need to raid >0 them? is losing all of your traces going to cost you more than an hour of work?
<azonenberg> sorear: so the issue is mainly, this isn't JUST for scope waveforms
<azonenberg> My plan is to set up a single storage cluster to serve all of my needs
Degi has joined #scopehal
<azonenberg> I'm going to have a bunch of ceph RBD block devices for virtual machine hard drives (replacing the hard wired M.2's in my current VM server which is getting full and also a bit old)
<azonenberg> as well as a CephFS filesystem for storing home directories, scope traces, all of my photos and media files,
<azonenberg> the plan is to consolidate all of my storage so each machine has a boot drive and a 1G/10G/40G (depending on throughput requirements) ethernet link to the ceph cluster
<azonenberg> and the boot drive will be essentially disposable, if it fails i reimage and copy over a few dotfiles
<azonenberg> the main storage array will be backed up nightly to my existing offsite backup server (6x 4TB spinning-rust in raid6)
<azonenberg> But it's located in another city and restoring that much data over VPN to the other location would be super time consuming. So i want to minimize the chances of downtime
<azonenberg> also, ceph's replication isn't just for drive failure tolerance. it can do scrubbing etc to ensure data integrity
<azonenberg> and having multiple copies of data means reads can be serviced from any copy
<azonenberg> so it's faster
<sorear> not gonna try to netboot? :p
<azonenberg> No
<azonenberg> PXE is a giant pain in the butt to set up and maintain
<azonenberg> I've done it
<kc8apf> Ceph has a fairly high maintenance burden
<azonenberg> kc8apf: oh?
<kc8apf> And it really doesn't like small clusters
<azonenberg> i'm still open to ideas. And from what folks are telling me 3 nodes is the smallest that is reasonable to go, but it should work fine on that
<kc8apf> I ran a 3 node setup for a while and it mostly worked
<azonenberg> basically i want something that scales to more capacity and bandwidth than just a couple of drives with linux mdraid served over nfs
<azonenberg> if not ceph it would be lustre or pvfs or something like that
<kc8apf> Colocating the ceph control plane with an osd causes odd, hard to debug problems when the osd load gets high
<kc8apf> Basically none of the existing options are particularly pleasant to use
<azonenberg> What do you mean "colocating"
<azonenberg> same drive or same cpu?
<kc8apf> CPU
<azonenberg> I figured if i had six cores and four OSDs i'd be OK
<kc8apf> Certain events can cause a fairly high cpu load doing reconstruction or verification
<azonenberg> and how high load are we talking? i'm only going to have 10GbE to each node
<azonenberg> no 40/100G although my workstation will have 40G to the network core
<azonenberg> my hope is to be able to hit 30G throughput from my workstation to the three ceph nodes
<kc8apf> I had 1G and 1 osd per node
<azonenberg> what kind of cpu?
<kc8apf> You can get bandwidth from ceph but it requires some planning and careful reading of the tuning guides
<azonenberg> My proposed build right now in a newegg wishlist has a 6 core 1.7 GHz skylake xeon (scalable bronze 3104)
<azonenberg> and the way i see it is, almost anything i do with ceph is likely to outperform my current mdraid 2x 7200rpm 4tb NFS over gig-e
<azonenberg> it's just a question of how much better i can make it
<kc8apf> I was using fairly low end stuff. HDDs and older AMD boxes
<azonenberg> was ram bandwidth/capacity an issue for you?
<azonenberg> i'm looking at 24GB of 6 channel ddr4 2666 per node
<kc8apf> Plan for 1-2GB of RAM per TB of storage
<kc8apf> Ceph and bluestore like to cache a lot
<azonenberg> Yeah. I'm looking at 4x 1.92 TB OSDs per node
<azonenberg> so 8TB total and 24GB of RAM
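(Checking that plan against the 1-2 GB per TB rule of thumb; simple arithmetic, assuming the rule applies per node:)

#include <cstdio>

int main()
{
    const double tb_per_node = 4 * 1.92;   // four 1.92 TB OSDs = 7.68 TB per node
    printf("Rule of thumb: %.1f - %.1f GB RAM per node (planned: 24 GB)\n",
        1.0 * tb_per_node, 2.0 * tb_per_node);
    return 0;
}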
<sorear> osd?
<kc8apf> OSD is the ceph per-disk process
<sorear> ceph's high level design seems vastly preferable to every other network filesystem but some of the things I've heard about administering it are alarming
<kc8apf> That's pretty much spot on
<azonenberg> lol
<azonenberg> well, i figure i'll give it a try and keep the old nfs server around for a little while
<azonenberg> and if it proves to be too annoying i'll move all my data back to the old server
<azonenberg> then wipe the ceph nodes and just run nfs on them
<azonenberg> But i won't have budget for a while as i just bought the 4 GHz scope and now that my pocket has recovered from that, i need to save up for some repairs around the house before i invest in more lab infrastructure
<azonenberg> And i also have to upgrade sonnet still
<azonenberg> oh, and build MAXWELL lol
azonenberg_work has quit [Ping timeout: 256 seconds]
azonenberg_work has joined #scopehal
maartenBE has quit [Ping timeout: 256 seconds]
maartenBE has joined #scopehal
<azonenberg> Yaaaay
<azonenberg> I just segfaulted glscopeclient with an 80GB dataset in RAM
<azonenberg> I saved it to disk previously, thankfully, but loading it will take 10+ minutes
<azonenberg> This sounds like a good excuse to implement the file load progress dialog i've wanted to have :p
<azonenberg> After i fix another bug that is
<monochroma> XD
<azonenberg> Also i need that storage cluster sooner rather than later
<azonenberg> at 30 Gbps, that would be a 20 second load time
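(The load-time arithmetic, assuming the network link is the bottleneck, which it roughly is here with ~1 Gbps to the NAS today:)

#include <cstdio>

int main()
{
    const double dataset_bits = 80e9 * 8;          // 80 GB dataset
    const double rates_gbps[] = {1, 10, 30};
    for(double r : rates_gbps)
        printf("%5.0f Gbps: %6.1f seconds\n", r, dataset_bits / (r * 1e9));
    return 0;
}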
<electronic_eel> azonenberg: so you are planning to have just the os on local disks connected to your workstations and the full homedir will be on ceph?
<electronic_eel> I'm doing something similar, but with nfs. but when I introduced this, I ran into massive latency problems on the workstations: many programs tend to do lots of accesses to ~/.local and ~/.config - and all these small accesses always have the full network latency
<electronic_eel> so basically the affected programs became slow as hell
<sorear> ceph has a cache/lease mechanism that's not pure yolo like nfs's
<electronic_eel> what I ended up doing is introducing a cache partition which rsyncs .config and .local on login and syncs it back on a clean logout
<electronic_eel> sorear: ah, good to hear. I'm running this setup for like 8+ years now, maybe the cache on nfs got better over the years, I haven't investigated
<_whitenotifier-f> [scopehal] azonenberg pushed 1 commit to master [+0/-0/±1] https://git.io/JUZnQ
<_whitenotifier-f> [scopehal] azonenberg 7f6da2a - Removed stray newline
<azonenberg> electronic_eel: so what i do right now is i have my homedir with .local and .config be stored on the OS drive
<azonenberg> but /nfs4/home/azonenberg/ has all of my data
<azonenberg> i consider it kinda my pseudo-home
<azonenberg> /home/azonenberg/ has very little on it
<electronic_eel> do you sometimes sync /home/az to your file server for backup?
<azonenberg> i expect /ceph/home/azonenberg/ will also be used for pretty much all of my bulk data and /home/azonenberg will contain preference settings like i do now
<azonenberg> No. I consider everything there fairly expendable
<azonenberg> the preferences i care more about are things like browsing history
<azonenberg> Which are stored in the web browsing VM
<azonenberg> Which is not backed up, but is on raid1 on the xen server
<azonenberg> and will eventually be stored on ceph
<electronic_eel> hmm, setting up all the programs takes quite some time for me, so I really want it in the backup
<azonenberg> Yeah makes sense
<azonenberg> Copying it to a backup would not be unreasonable, or just running a clientside backup
<azonenberg> Right now my NAS is the only machine that actually has backups on it, as everything i really care about lives there
<_whitenotifier-f> [scopehal-apps] azonenberg pushed 1 commit to master [+0/-0/±4] https://git.io/JUZnx
<_whitenotifier-f> [scopehal-apps] azonenberg 67ea814 - Fixed bug where clicking a trace in the protocol analyzer from a historical waveform moved to the new history point but did not properly reload cached waveforms for display
<electronic_eel> do you compile on your /nfs4/home/azonenberg/?
<electronic_eel> compiling also has a really noticeable latency for me, so I usually compile on a local disk
<azonenberg> electronic_eel: yes i do
<azonenberg> when i "make -j" kicad or glscopeclient i can max out all 32 cores without getting network bound already
<azonenberg> i hide latency by just having another gcc instance steal the cpu while the first one is blocked lol
<electronic_eel> hmm, strange. depending on file count the compile times can be twice to 10 times worse for me
<azonenberg> last time i checked i could compile all of kicad in something like... 90 seconds?
<azonenberg> on nfs
<azonenberg> what's the latency from you to the nas? is it wifi?
<electronic_eel> nonono
<electronic_eel> 10gbe over direct attach cable
<azonenberg> I have something on the order of 100-150 μs latency to it
<azonenberg> 10Gbase-SR from me to the core switch, 1000base-SX from there to the edge switch, then 1000base-T from the edge switch to the NAS
<electronic_eel> maybe it is the kernel version on the server that is getting old, it is running centos 7
<azonenberg> My proposed ceph cluster will connect to the core switch via 10Gbase-SR from each node
<electronic_eel> plan to migrate to centos 8 soon
<azonenberg> and the connection to my desk will be upgraded to 40gbase-SR4
<azonenberg> I have the cable in the walls already
<azonenberg> but got screwed over by MPO connectors
<azonenberg> turns out MPO keystone modules do not contain alignment pins
<azonenberg> i.e. you cannot mate two female MPOs with one
<azonenberg> you need a male on one cable
<azonenberg> Aaaand all of my MTP/MPO patch cords have female on both sides
<azonenberg> So now i have some MTP M-F 1.5m patch cords on the way from FS but they won't be here until october 1st or thereabouts
<azonenberg> At which point i should have 40GbE to the core
<azonenberg> I'm used to 10G stuff where all cables have male LC ends and all couplers contain the necessary alignment features to mate two male cables end to end
<electronic_eel> ok, note to myself that I have to look this stuff up in detail before introducing 40g
<azonenberg> There are also 3 different polarities for MPO/MTP connectors (to clarify, MTP is a brand-name connector which is compatible with the generic MPO but has some nice features)
<azonenberg> or well for cables
<azonenberg> A is straight through
<azonenberg> B is L-R crossover, C is pairwise crossover and i think is pretty rare
<azonenberg> My standard is to use type B cables for everything, which aligns with how normal LC cables are crossover
<azonenberg> So if you have two patch cords and a plant cable, all type B, you end up with a net of one tx-rx inversion across the whole thing (each type B cable flips the fiber order, and three flips, being an odd number, leave a single net inversion)
<azonenberg> which is what you want
<azonenberg> QSFP modules normally have male connections (alignment pins are the only thing that distinguishes an M from an F; the rest of the connector body is the same, and an external bracket is needed to mate an M to an F and hold them together)
<azonenberg> So a QSFP contains both the bracket that the connector latches to as well as a male MPO compatible mating receptacle that you plug a female cable into
<azonenberg> then your patch cords are normally all female at each end, and your plant cables are normally male at both ends
<azonenberg> Which i didn't know, since i was used to plant and patch cords all being the same gender and having couplers between them
<azonenberg> but it turns out MPO couplers are just the bracket and you need the cables to be gender compatible
<azonenberg> There is apparently a tool that lets you insert pins into a female MTP connector, one of the improvements in the name brand MTP vs generic MPO is that it allows field polarity changes
<azonenberg> but this was an expensive cable and i dont want to risk screwing it up and i have no cheap cables to practice on, and the tool isnt cheap either
<azonenberg> i only have two misgendered plant cables so i'm just going to get male ends on patch cords for this one link
<azonenberg> and not make the mistake again
<azonenberg> if this was a larger rollout i'd put pins on the plant cables to avoid problems
<electronic_eel> hmm, is it common to have the MPO stuff in the patch panels and so on? that would mean a dedicated fibre installation just for this. why not use regular lc and a 4 lc to mpo patch cable?
<electronic_eel> I usually want my wiring in the walls and so on be compatible for several generations of connections
<electronic_eel> the cat-7 cables I used in the old office at work were used with 100 mbit first and then were upgraded to 1 gbe. could also be used with 2.5gbe or if I really wanted, 10gbase-t
<electronic_eel> since splicing the fibre stuff tends to be more expensive than pulling & connecting cat7, I really want this to be usable for some years to come
<azonenberg> regular LC for the plant cable you mean?
<azonenberg> i was concerned about skew from manufacturing tolerances between cables
<azonenberg> with a MPO you know every fiber in the cable is the same length
<electronic_eel> "plant cable" is the one that runs in the walls, right?
<azonenberg> yes
<azonenberg> or rack to rack, or generally "infrastructure" vs a patch cord
<electronic_eel> ok
<azonenberg> I'm running 40G from my lab to my desk
<azonenberg> there are three ways to do it
<azonenberg> MPO plant cable, which is what i did
<azonenberg> 4x LC plant cable, which takes up more space on patch panels and in tray/conduit and might have skew concerns
<electronic_eel> do the 4 lanes need to be phase stable and have tight length tolerances?
<azonenberg> I'm not sure, i didnt look into it
<azonenberg> but i havent found anybody talking about using four LC cables for 40G
<azonenberg> i have no idea what the 40gbase-sr4 lane to lane skew budget is
<azonenberg> all of the MPO to 4x LC breakouts i've seen were for 4x 10G
<azonenberg> anyway, the final option which i did consider was using 1x LC plant cable and a CWDM QSFP+ that runs four wavelengths over a single fiber each way
<azonenberg> But those are around $200 vs $30 per optic
<azonenberg> MPO cables cost a bit more than LC cables but it was still cheaper than using WDM optics
<azonenberg> that only really makes sense IMO if you have a major investment in existing plant cables that arent practical to reinstall
<electronic_eel> yeah, you just do CWDM if you'd have to pull completely new cables otherwise
<azonenberg> well i had originally looked into it because i couldnt find MPO keystones
<azonenberg> i only was able to find a single vendor
<_whitenotifier-f> [scopehal-apps] azonenberg pushed 1 commit to master [+2/-0/±3] https://git.io/JUZWj
<_whitenotifier-f> [scopehal-apps] azonenberg 5474836 - Implemented progress dialog for waveform loading. Not displayed for saving yet. Fixes #165.
<_whitenotifier-f> [scopehal-apps] azonenberg closed issue #165: Add progress dialog when loading files - https://git.io/JUGvL
<_whitenotifier-f> [scopehal-apps] azonenberg opened issue #166: Add progress dialog when saving files - https://git.io/JUZle
<_whitenotifier-f> [scopehal-apps] azonenberg labeled issue #166: Add progress dialog when saving files - https://git.io/JUZle
<azonenberg> well this is nice, lol. I'm watching a youtube video on a browser running in a VM in my xen box, with sound over RTP and video over SSH+VNC, with no noticeable lag
<azonenberg> *while* glscopeclient is saturating the gigabit pipe to my NAS loading an 80GB saved waveform dataset
<azonenberg> About 1.2 Gbps inbound. I think this is the first time the 10G pipe from my desk to the rack has actually run at >1gbps for a sustained period of time
<azonenberg> i've had trouble fully using the 10G pipe because i don't have enough things at the far side of the pipe that can keep up yet
<azonenberg> the VMs rarely push more than a few hundred Mbps although they are on a 10G pipe, and the NAS is slooow
<azonenberg> of course as soon as i build maxwell that will be a different story
<azonenberg> ooh 1.4 Gbps sustained now
<_whitenotifier-f> [scopehal] azonenberg pushed 1 commit to master [+0/-0/±1] https://git.io/JUZ81
<_whitenotifier-f> [scopehal] azonenberg 0f093a6 - Accept both upper and lowercase "k" as prefix for "kilo"
juli965 has joined #scopehal
<_whitenotifier-f> [scopehal] azonenberg pushed 1 commit to master [+0/-0/±1] https://git.io/JUZ41
<_whitenotifier-f> [scopehal] azonenberg 6a554ef - SPIFlashDecoder: implemented 0x0b fast read
jn__ has joined #scopehal
<electronic_eel> about 40gbase and using LC breakouts - this paper claims you can have 15 meters length difference between the lanes without hitting the limits
<electronic_eel> so it seems to me that using lc for your infrastructure cabling and just have mpo to 4x lc breakout cables seems to be no problem
bvernoux has joined #scopehal
<azonenberg> electronic_eel: assuming you have enough patch panel space, yes
<azonenberg> that would be an option
<_whitenotifier-f> [scopehal-apps] azonenberg opened issue #167: Protocol analyzer: color code rows based on type of packet - https://git.io/JUZbu
<_whitenotifier-f> [scopehal-apps] azonenberg labeled issue #167: Protocol analyzer: color code rows based on type of packet - https://git.io/JUZbu
<_whitenotifier-f> [scopehal-apps] azonenberg pushed 1 commit to master [+0/-0/±3] https://git.io/JUZNF
<_whitenotifier-f> [scopehal-apps] azonenberg 19e8b03 - ProtocolAnalyzerWindow: now display per packet background colors. Fixes #167.
<_whitenotifier-f> [scopehal-apps] azonenberg closed issue #167: Protocol analyzer: color code rows based on type of packet - https://git.io/JUZbu
<_whitenotifier-f> [scopehal-apps] azonenberg opened issue #168: Add filter support to protocol analyzer to show/hide packets matching certain properties (f.ex hide status register polling) - https://git.io/JUZxC
<_whitenotifier-f> [scopehal-apps] azonenberg labeled issue #168: Add filter support to protocol analyzer to show/hide packets matching certain properties (f.ex hide status register polling) - https://git.io/JUZxC
<bvernoux> azonenberg, do you plan to fully replace cairo with OpenGL?
<azonenberg> bvernoux: No. Non performance critical stuff like cursors, axis labels, etc will remain cairo for the indefinite future
<azonenberg> but when you have tens of thousands of protocol decode packets in a view cairo is slow
<bvernoux> ok
<azonenberg> most likely what i will move to near-term is opengl for the colored outlines of protocol events
<bvernoux> I imagine it is not a simple task to rewrite cairo stuff in OpenGL ...
<azonenberg> then cairo for the text inside them
<azonenberg> text in GL is a huge pain and i hide text when the box is too small to fit it
<bvernoux> especially for text ...
<azonenberg> So GL-accelerating the text seems unnecessary
<azonenberg> but the boxes are drawn even when tiny
<azonenberg> So i figure accelerate the boxes then software render text if it fits
<azonenberg> all analog and digital waveform rendering is already done in shaders
<bvernoux> nice
<bvernoux> I have a friend who has done a lot of OpenGL stuff in the past and he told me too that text is a huge pain in GL ...
<bvernoux> So I understand now why you mix cairo and GL
<azonenberg> Yeah
<azonenberg> cairo makes antialiasing etc really nice, it produces beautiful output, it's just slow
<azonenberg> and with GL composited rendering is super easy
<azonenberg> So i render some stuff with cairo and some in compute shaders (not using the GL graphics pipeline, just parallel compute)
<azonenberg> splat them all out into textures then render a handful of actual GL triangles and use a fragment shader to merge them
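(Roughly that idea as a minimal sketch, not the actual glscopeclient shader: one fragment shader blends a waveform texture produced by the compute-shader rasterizer with a Cairo-rendered overlay texture; the uniform and variable names are illustrative:)

// "Over" compositing of a Cairo overlay on top of a compute-shader waveform layer
static const char* g_compositeFragmentShader = R"(
#version 430
in vec2 texcoord;
out vec4 finalColor;
uniform sampler2D waveformTex;   // output of the compute-shader rasterizer
uniform sampler2D overlayTex;    // Cairo-rendered cursors, labels, text

void main()
{
    vec4 wave    = texture(waveformTex, texcoord);
    vec4 overlay = texture(overlayTex, texcoord);

    // Standard "over" blend: overlay on top of the waveform layer
    finalColor = overlay * overlay.a + wave * (1.0 - overlay.a);
}
)";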
<bvernoux> need to test the Rigol version ;)
<bvernoux> on my old DS1102E just to see how glscope works
<bvernoux> IIRC it should be compatible even if the DS1102E is very slow to send data over USB
<_whitenotifier-f> [scopehal-apps] azonenberg pushed 1 commit to master [+0/-0/±2] https://git.io/JUZpZ
<_whitenotifier-f> [scopehal-apps] azonenberg 11453b8 - ProtocolAnalyzerWindow: laid initial groundwork for display filters (see #168). No actual filtering is performed, but the m_visible bit now controls row visibility.
bvernoux has quit [Quit: Leaving]
juli965 has quit [Quit: Nettalk6 - www.ntalk.de]