azonenberg changed the topic of #scopehal to: libscopehal, libscopeprotocols, and glscopeclient development and testing | | Logs:
Degi_ has joined #scopehal
Degi has quit [Ping timeout: 240 seconds]
Degi_ is now known as Degi
* GenTooMan sighs a bit frustrated.
<GenTooMan> hmm the big issue / difference is the reliance on the waveform descriptors in the code. I've written down the execution sequence in the acquisition code so as I can try to make something similar.
<azonenberg> Does the other scope not have a wavedesc? or is it a different format? or what
<GenTooMan> wave descriptors aren't supported, so it's not a matter of format.
<azonenberg> So do you have to manually query all of the settings then?
<azonenberg> there's no header along with the data?
<GenTooMan> the example they give (returned data) sent "C1:WF? DAT2" response "C1:WF DAT2,#9000000070"<binary data>"\n\n" which indicates 70 bytes of data sent
<GenTooMan> fortunately you just need to give 1 command to request wavedata (good), but it's not all good: you have to read voltage per division, offset, time division, and sample rate separately (each a single command)
<azonenberg> you should be able to cache at least
<GenTooMan> so yes cache is probably OK, also those commands can just be banged out and read later. So it's not the same but workable.
<azonenberg> yeah so AcquireData() will need some major retooling
Famine- has joined #scopehal
Famine has quit [Ping timeout: 260 seconds]
<d1b2> <mubes> That request response looks very similar to the request response on the that's something.
<azonenberg> I mean, #9000000070 is just the standard SCPI block header
<azonenberg> you'll see that anywhere
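
A minimal sketch of decoding that block header, assuming the reply follows the usual IEEE 488.2 definite-length form (`#`, one digit giving the number of length digits, then the length itself). The function name `parse_block_header` is made up for illustration and is not part of libscopehal.

```python
def parse_block_header(buf: bytes):
    """Return (payload_offset, payload_length) for a definite-length block.

    Layout assumed: '#', one digit N, then N decimal digits giving the
    payload length in bytes, then the payload itself.
    """
    start = buf.index(b"#")
    ndigits = int(chr(buf[start + 1]))
    if ndigits == 0:
        raise ValueError("indefinite-length block (#0) is not handled in this sketch")
    length_field = buf[start + 2 : start + 2 + ndigits]
    if len(length_field) != ndigits or not length_field.isdigit():
        raise ValueError("truncated or malformed block header")
    return start + 2 + ndigits, int(length_field)


# The reply quoted in the log, with 70 placeholder bytes standing in for the data.
reply = b"C1:WF DAT2,#9000000070" + bytes(70) + b"\n\n"
offset, length = parse_block_header(reply)
payload = reply[offset : offset + length]
```

Here the `9` says nine length digits follow, and `000000070` decodes to 70 bytes, matching the description above.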
<_whitenotifier-3> [scopehal-apps] azonenberg pushed 3 commits to master [+0/-0/±6]
<_whitenotifier-3> [scopehal-apps] azonenberg 486d8cc - OscilloscopeWindow: memory map files during loading rather than wasting time with redundant copies
<_whitenotifier-3> [scopehal-apps] azonenberg f3920ed - Identify waveforms that are dense packed even if stored as sparse
<_whitenotifier-3> [scopehal-apps] azonenberg 571d3d3 - Dense packed waveforms are now saved in a new format that doesn't waste space on offset or duration. Fixes #92. Fixes #312.
<_whitenotifier-3> [scopehal-apps] azonenberg closed issue #312: Save/restore dense pack flag in waveform metadata -
<_whitenotifier-3> [scopehal-apps] azonenberg closed issue #92: Add support for "dense" waveform storage format (no explicit offset/len) to reduce size of analog captures -
<azonenberg> Well that's nice progress
<azonenberg> ~5x reduction in disk usage and ~2x reduction in file load/save time for non-sparse waveform captures
Tost has joined #scopehal
<d1b2> <theorbtwo> Nice.
<azonenberg> And i'm now seeing an 82% speedup in waveform rendering with some new (not yet pushed) improvements
<d1b2> <theorbtwo> I take it "dense" means that the spacing between samples, format of samples, etc, does not change during the length of the capture?
<_whitenotifier-3> [scopehal-apps] azonenberg pushed 3 commits to master [+2/-0/±13]
<_whitenotifier-3> [scopehal-apps] azonenberg 8918279 - Initial implementation of dense pack optimizations for waveform rendering. About 26% speedup for a dense 128M point waveform. Fixes #328 but probably still more room to tweak.
<_whitenotifier-3> [scopehal-apps] azonenberg 4a83a23 - Dense packed waveform shader now computes indexes locally
<d1b2> <theorbtwo> Doesn't seem too restrictive.
<_whitenotifier-3> [scopehal-apps] azonenberg 4fc6932 - Switched local structure in compute shader to 1x32 instead of 2x16 for significant (44%) speedups due to better GPU occupancy
<_whitenotifier-3> [scopehal-apps] azonenberg closed issue #328: Add support for dense packed waveforms to rendering shaders -
<azonenberg> More specifically it means that the sample offsets are all 0... N-1 timebase units and durations are all 1
<azonenberg> None of this breaks sparse waveform rendering, which is critical for things like measurements, math functions, CSV import, and protocol decodes
<azonenberg> But the common case of looking at a waveform right off a scope, or basic math where each sample maps 1:1 from input to output, is a lot faster now
<azonenberg> And uses less disk space
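
The dense-pack condition described above (offsets are exactly 0...N-1 and every duration is 1) can be checked with a few lines. This is only an illustrative sketch; `is_dense_packed` is a hypothetical helper, not the actual check in scopehal-apps.

```python
def is_dense_packed(offsets, durations):
    """True if offsets are 0, 1, ..., N-1 and every duration is 1 timebase unit."""
    if len(offsets) != len(durations):
        raise ValueError("offsets and durations must have the same length")
    return (
        all(off == i for i, off in enumerate(offsets))
        and all(d == 1 for d in durations)
    )


uniform = is_dense_packed([0, 1, 2, 3], [1, 1, 1, 1])   # stored as sparse, but dense
decoded = is_dense_packed([0, 4, 9], [4, 5, 2])         # protocol-style samples
```

A waveform that passes this test carries no information in its offset and duration arrays, which is why they can be left out of the on-disk format.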
<azonenberg> I'm still at only 23.1% of theoretical performance on the texture cache and 21.7% on the SM according to NSight though. So i can probably squeeze a lot more performance out of the GPU still
<azonenberg> But over the course of tonight I turned the same data from 9.7 to 2.0 GB on disk
<azonenberg> and i can render a 128 million point waveform in 154 ms instead of 281
<azonenberg> With full intensity grading and all points on screen at once
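
As a quick sanity check, the numbers quoted in this stretch of the log agree with each other. The sketch below assumes the 26% and 44% speedups from the two commit messages compound multiplicatively, which the log does not state explicitly.

```python
# Disk usage: 9.7 GB before, 2.0 GB after (quoted as "~5x").
size_ratio = 9.7 / 2.0

# Render time for the 128M point waveform: 281 ms before, 154 ms after.
render_speedup = 281 / 154

# Assumption: the per-commit speedups (26%, then 44%) multiply.
compounded = 1.26 * 1.44

print(f"disk: {size_ratio:.2f}x smaller")
print(f"render: {render_speedup:.2f}x faster")
print(f"compounded commits: {compounded:.2f}x")
```

Both routes land at roughly 1.8x, in line with the 82% speedup mentioned earlier.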
<d1b2> <theorbtwo> ...and it is smart enough to notice when things didn't "have to" be regularly spaced, but "just happen that way"? Csv imports that have even spacing, protocols that work in constant-sized units?
<d1b2> <theorbtwo> Does it pack to the nearest byte of data length? Is there an equivalent for binary forms that is bit-packed?
<azonenberg> Length for all samples is in timebase units
<azonenberg> the Waveform format is vector<int64> offset, vector<int64> duration, vector<T> samples
<azonenberg> where T=float for analog, bool for digital, or arbitrary class type for protocol data
<azonenberg> There is a flag for "dense packed" in the waveform that I set for anything coming off a scope, loaded from files under some conditions, or processed from dense packed inputs using functions that don't resample
<d1b2> <theorbtwo> Yow. Either the time or space savings would be great. Both at once is awesome!
<azonenberg> as far as CSV imports, that gets tricky because of rounding errors
<azonenberg> right now i actually do some histogram magic to guess whether the sample rate was intended to be uniform
<azonenberg> so i allow a slight variance in sample rate and forcibly normalize it
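
The log does not show the actual histogram logic, so the following is only one plausible way to do it: bin the intervals between timestamps, take the most populated bin as the nominal sample interval, and accept the waveform as uniform if every interval is within a small tolerance of it. The function name and the 1% default tolerance are assumptions.

```python
from collections import Counter
import statistics


def guess_uniform_interval(timestamps, tolerance=0.01):
    """Return the nominal sample interval if spacing is uniform within
    `tolerance` (relative), otherwise None."""
    if len(timestamps) < 2:
        return None
    deltas = [b - a for a, b in zip(timestamps, timestamps[1:])]
    median = statistics.median(deltas)
    if median <= 0:
        return None

    # Histogram the intervals with bins one tolerance-width wide.
    bin_width = median * tolerance
    histogram = Counter(round(d / bin_width) for d in deltas)
    mode_bin, _ = histogram.most_common(1)[0]
    nominal = mode_bin * bin_width

    if all(abs(d - nominal) <= tolerance * nominal for d in deltas):
        return nominal
    return None


# Timestamps with small rounding errors, as a CSV export might contain.
noisy = [0.0, 1.0e-9, 2.0000001e-9, 2.9999999e-9, 4.0e-9]
interval = guess_uniform_interval(noisy)

# A genuinely non-uniform capture is rejected.
gappy = guess_uniform_interval([0, 1, 2, 5, 6])
```

When an interval is returned, the importer could replace the timestamps with offsets 0...N-1 at that interval and mark the waveform as dense packed.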
<d1b2> <theorbtwo> Aha, I thought T was more arbitrary than that.
<azonenberg> Well for protocol data it can be whatever
<azonenberg> but for analog waveforms it's floating point volts (or other Y axis unit like Hz for a FFT, etc)
<azonenberg> for, say, an ethernet frame a T is an EthernetSample
<azonenberg> which contains a type field and possibly a byte of data
<azonenberg> or several i think
<azonenberg> iirc source/dest mac, ethertype, etc are all single EthernetSample objects
<azonenberg> generally speaking there's a 1:1 mapping from scopehal sample objects to the little boxes you see in the protocol decode view
<azonenberg> with offset=left edge and offset+duration = right edge
<d1b2> <theorbtwo> I thought it was more like waveforms had a unit field, a multiplier field, and the data would be the narrowest type that fits the number of bits we have from the input device.
<azonenberg> Ok let me back up a bit
<azonenberg> There's a bit of other metadata involved
<azonenberg> first, there's a 128-bit timestamp
<azonenberg> a 64-bit time_t (seconds since jan 1 1970, midnight UTC) plus 64-bit femtoseconds since the last second
<azonenberg> There's a boolean flag indicating the waveform is dense packed. This is an optimization only, you still have to have timestamp and duration values present since not every filter includes these optimizations
<azonenberg> But for example if you're a filter whose output is currently 16M samples long and dense packed, and your next output is also 16M samples long and dense packed, you never have to touch them during the update cycle
<azonenberg> Then there's a time scale, which is femtoseconds per time tick
<d1b2> <theorbtwo> Ah, so memory usage doesn't go down. Shame, but not a huge one.
<azonenberg> so offset=5 means the sample begins at 5*timescale fs
<azonenberg> And then you have the vectors of offsets, durations, and sample data
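
Putting the pieces described so far together, a waveform might be modeled as below. This is a sketch in Python rather than the real C++ class, and every field and method name here is hypothetical; only the overall shape (timestamp, dense-pack flag, timescale, and three parallel vectors) comes from the discussion.

```python
from dataclasses import dataclass, field


@dataclass
class Waveform:
    timescale_fs: int          # femtoseconds per time tick
    start_seconds: int         # seconds since 1970-01-01 00:00 UTC
    start_femtoseconds: int    # femtoseconds since the last whole second
    dense_packed: bool = False # optimization hint only
    offsets: list = field(default_factory=list)    # start of each sample, in ticks
    durations: list = field(default_factory=list)  # length of each sample, in ticks
    samples: list = field(default_factory=list)    # float, bool, or protocol objects

    def sample_edges_fs(self, i):
        """Left and right edge of sample i, in femtoseconds from the waveform start."""
        left = self.offsets[i] * self.timescale_fs
        right = (self.offsets[i] + self.durations[i]) * self.timescale_fs
        return left, right


# 10 GS/s capture: one tick is 100 ps = 100,000 fs.
wfm = Waveform(
    timescale_fs=100_000,
    start_seconds=0,
    start_femtoseconds=0,
    dense_packed=True,
    offsets=list(range(6)),
    durations=[1] * 6,
    samples=[0.0] * 6,
)
edges = wfm.sample_edges_fs(5)
```

With offset=5 and a 100,000 fs timescale, the sample begins 500,000 fs after the start of the waveform, as in the offset=5 example above.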
<azonenberg> On disk, the metadata is serialized as a YAML file called scope_%d_metadata.yml
<azonenberg> then the samples and, if not dense packed, offset and duration are serialized to scope_%d_waveforms/waveform_%d/channel_%d.bin
<azonenberg> or channel_%d_stream%d.bin if you have a multi-stream channel like an I/Q channel
<azonenberg> Right now for sparse waveforms offset, duration, and sample data are interleaved
<azonenberg> this dates back to earlier versions of scopehal which used an array-of-structs representation in ram rather than struct-of-arrays
<azonenberg> the latter being more cache/SIMD friendly
<azonenberg> So my plan is to transition the on-disk sparse format to be all offsets, all durations, and all samples in consecutive blocks
<azonenberg> this will let me mmap the file and memcpy directly into a Waveform object
<azonenberg> or hypothetically even have a disk-backed Waveform that can be paged out
<azonenberg> but that's not currently supported
<azonenberg> That format will be called sparsev2, to distinguish it from the current sparsev1 format which will continue to be readable by future glscopeclient versions
<azonenberg> but once sparsev2 is implemented for writing, writing of sparsev1 will be removed
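
To illustrate why consecutive blocks suit memory mapping, here is a rough sketch of writing and reading such a layout. The real sparsev2 format is not specified in this log, so the details are assumptions: no header, little-endian byte order, int64 offsets and durations, float32 samples, and a sample count known from elsewhere (for example the metadata file).

```python
import mmap
import os
import struct
import tempfile


def write_blocks(path, offsets, durations, samples):
    """Write all offsets, then all durations, then all samples."""
    n = len(samples)
    with open(path, "wb") as f:
        f.write(struct.pack(f"<{n}q", *offsets))
        f.write(struct.pack(f"<{n}q", *durations))
        f.write(struct.pack(f"<{n}f", *samples))


def read_blocks(path, n):
    """Map the file and unpack each block from its known byte offset (n > 0)."""
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        offsets = list(struct.unpack_from(f"<{n}q", m, 0))
        durations = list(struct.unpack_from(f"<{n}q", m, 8 * n))
        samples = list(struct.unpack_from(f"<{n}f", m, 16 * n))
    return offsets, durations, samples


path = os.path.join(tempfile.mkdtemp(), "channel_0.bin")
write_blocks(path, [0, 4, 9], [4, 5, 2], [0.5, -1.25, 2.0])
result = read_blocks(path, 3)
```

Because each array sits in one contiguous region, a C++ reader could copy each region straight into the corresponding vector, which an interleaved layout does not allow.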
<d1b2> <theorbtwo> Sounds lovely. Maybe make it so you can have pointers into the mmap area so you can let the os page?
<azonenberg> Generally speaking i maintain strict forward compatibility, a file generated years ago should still open and process fine in the latest version
<azonenberg> but there's no expectation of the reverse being true
<azonenberg> and my main use case for paging would be handling deep history
<d1b2> <theorbtwo> Oh. That would need the os to handle gpu fetching info from disc via mmap, which seems...fragile.
<azonenberg> And no
<azonenberg> the Waveform object is explicitly copied to GPU memory at render time
<d1b2> <theorbtwo> Seems a good policy.
<azonenberg> Longer term I plan to support simultaneous CPU and GPU side copies of the data
<azonenberg> for when i start doing more OpenCL stuff
<azonenberg> right now when doing OpenCL waveform processing data is copied to GPU at the start of each filter operation then back off
<azonenberg> which can lead to inefficiency and more pcie traffic than necessary
<_whitenotifier-3> [scopehal] azonenberg opened issue #462: Refactor Convert*BitSamples() out into base Oscilloscope class -
<_whitenotifier-3> [scopehal] azonenberg labeled issue #462: Refactor Convert*BitSamples() out into base Oscilloscope class -
<_whitenotifier-3> [scopehal] azonenberg labeled issue #462: Refactor Convert*BitSamples() out into base Oscilloscope class -
<_whitenotifier-3> [scopehal] azonenberg opened issue #463: Pico: Optimize waveform conversion -
<_whitenotifier-3> [scopehal] azonenberg labeled issue #463: Pico: Optimize waveform conversion -
<noopwafel> the pico bridge isn't a submodule of the repo right?
<azonenberg> Correct, the bridge is fully separate at the moment
<azonenberg> the intent was to avoid any pico sdk dependencies on libscopehal and glscopeclient at compile time
<azonenberg> you can build libscopehal on any machine you want and it knows how to talk to the bridge
<noopwafel> Successfully opened instrument Driver version: PS3000A Linux Driver,
<noopwafel> so definitely no problems with linking a bunch of different pico libs in
<noopwafel> <- so it's in the same kind of state as my previous bridge was now, which is to say it's totally broken but at least you can see data
<noopwafel> and I didn't even ifdef the code, I just added a g_pico_type, so I guess that is a perfectly fine approach
<noopwafel> how is zooming meant to work btw? I'm on laptop here which is maybe not best, but if I zoom in/out by using mousewheel on the header, then I quickly end up at an impossible point (like, +10000s)
<noopwafel> I had this problem previously too, but I don't think I asked here
juli9610 has joined #scopehal
Tost has quit [Ping timeout: 240 seconds]
bvernoux has joined #scopehal
<azonenberg> noopwafel: Clamping offset to "approximately around the waveform" is probably not a bad idea when zooming, file a ticket?
<azonenberg> It's supposed to center wherever your cursor is
<azonenberg> but in general the touchscreen/touchpad friendliness has room for improvement
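
One way the suggested clamping could look, as a sketch only: zoom around the time under the cursor, then keep the view offset within some margin of the waveform. The parameter names, the pixels-per-time-unit convention, and the half-span margin are all assumptions and do not reflect how glscopeclient stores its view state.

```python
def zoom_about_cursor(offset, scale, cursor_px, factor,
                      wfm_start, wfm_end, margin=0.5):
    """Zoom by `factor` keeping the time under the cursor fixed, then clamp.

    offset    : time at the left edge of the view
    scale     : pixels per time unit
    cursor_px : cursor x position in pixels from the left edge
    margin    : how far past the waveform the view may start, as a
                fraction of the waveform span
    """
    time_under_cursor = offset + cursor_px / scale
    new_scale = scale * factor
    new_offset = time_under_cursor - cursor_px / new_scale

    span = wfm_end - wfm_start
    lowest = wfm_start - margin * span
    highest = wfm_end + margin * span
    return max(lowest, min(highest, new_offset)), new_scale


# Normal case: the point under the cursor (t=100) stays put.
normal = zoom_about_cursor(0, 1.0, 100, 2.0, 0, 1000)

# Runaway case: a view far off to the right is pulled back near the waveform.
runaway = zoom_about_cursor(1e7, 1.0, 0, 2.0, 0, 1000)
```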
<_whitenotifier-3> [scopehal-pico-bridge] noopwafel forked the repository -
<noopwafel> is the PoC diff on the bridge side, I only made minimal changes on the scopehal side
<azonenberg> Are 3001/6001 supposed to mean anything? would probably be cleaner to use enums
<noopwafel> yes it should be an enum :)
<noopwafel> there's a bunch of things to think about in there
<azonenberg> also the general project coding style convention is curly braces on their own line
<noopwafel> yeah, to be clear: not a PR yet :D
<azonenberg> Yeah I know, just giving you feedback now before you write too much code that you have to rework :)
<azonenberg> So what's "totally broken"?
<azonenberg> And how's performance?
<azonenberg> BTW i've seen a failure mode where sometimes the 6000 series falls back to much slower data transfer rates, acting like it's on usb2
<azonenberg> i dont know what causes it and if it's host or device side
<azonenberg> but the workaround is to unplug and replug the usb cable
<azonenberg> then it starts using usb3 again
<noopwafel> so it quickly gets unusably slow for me
<noopwafel> but I think might be unrelated
<noopwafel> it's just running on my laptop with integrated graphics and that is Not Good Enough
<noopwafel> I was going to say it kept dying, but actually it is no longer doing so, I think fixed
<noopwafel> right now, looking at trigger level
<noopwafel> my old bridge hack set the trigger to 'triggerlevel / voltage * 0x7f00'
<noopwafel> and I thought your ps6000a code did the same, but then it is not working. hm.
<noopwafel> will force-push updates to for now
<azonenberg> are you changing offset?
<noopwafel> current status, terminate called after throwing an instance of 'std::bad_alloc', but I grab dinner before I grab gdb :)
<azonenberg> trigger level is in raw adc codes
<azonenberg> so if you add an offset you have to correct in the trigger level
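
A hedged sketch of the conversion being discussed, built on the `triggerlevel / voltage * 0x7f00` expression quoted above. The sign with which the channel offset enters is an assumption (it depends on the driver's convention), and the function name is made up.

```python
ADC_FULL_SCALE = 0x7F00  # full-scale code used in the expression quoted above


def trigger_level_to_adc(level_v, range_v, offset_v=0.0):
    """Convert a trigger level in volts to a raw ADC code.

    Assumption: the offset is added to the signal before the ADC, so it is
    added to the level here. Flip the sign if the driver subtracts it.
    """
    code = round((level_v + offset_v) / range_v * ADC_FULL_SCALE)
    # Keep the result inside the representable code range.
    return max(-ADC_FULL_SCALE, min(ADC_FULL_SCALE, code))


no_offset = trigger_level_to_adc(0.5, 1.0)           # half of full scale
with_offset = trigger_level_to_adc(0.5, 1.0, 0.25)   # shifted by the offset
clamped = trigger_level_to_adc(2.0, 1.0)             # beyond the range
```

Forgetting the offset term gives a trigger that is off by a fixed number of codes, which would look like a trigger level that is "not working" whenever the offset is nonzero.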
<noopwafel> trigger seems to work now
<noopwafel> I .. don't know what was going wrong.
<noopwafel> design-wise: I am tempted to drop the g_range_3000a because it matches the API enum values for 6000a
<azonenberg> if they're the same, go for it
<noopwafel> just casting and assume they don't change the API is kind of ugly, but I don't think it makes sense to keep them separate for all the scopes
<azonenberg> as far as slow graphics did you pull my latest rendering code from last night?
<azonenberg> There are *major* speedups in there
<azonenberg> like almost double
<noopwafel> I did. It does seem to be GPU-limited, at least it's spending most of the time in a DRM ioctl(). I'll probably just move to a machine with an actual GPU later.
<azonenberg> Improving performance is definitely still a to-do item, i've spent (as you can see) a fair bit of time on rendering performance already and there's still room to improve
<azonenberg> If you zoom in on the time scale rather than looking at the whole waveform at once does it get faster?
<azonenberg> the shaders get a lot faster if they have to process fewer samples but everything else still works on the full sample dataset
<azonenberg> So if it gets faster when you zoom you know it's the rendering shader at fault
<noopwafel> yes, it is the rendering shader :)
<noopwafel> or, it seems so! I'll investigate after food
<_whitenotifier-3> [scopehal-apps] azonenberg labeled issue #338: Add speed optimized rendering shader with no intensity grading for low end GPUs -
<_whitenotifier-3> [scopehal-apps] azonenberg opened issue #338: Add speed optimized rendering shader with no intensity grading for low end GPUs -
<_whitenotifier-3> [scopehal-apps] azonenberg labeled issue #338: Add speed optimized rendering shader with no intensity grading for low end GPUs -
elms has quit [Ping timeout: 260 seconds]
esden has quit [Ping timeout: 260 seconds]
elms has joined #scopehal
esden has joined #scopehal
* xzcvczx didn't realise azonenberg was a dirty decapper
<GenTooMan> I'm sure he bathes xzcvczx so he's a CLEAN decapper
<xzcvczx> with the nitric acid he uses to bathe it sure as hell aint gonna be clean
<GenTooMan> nitric is weird acid. It has to be diluted from its concentrated form to .. actually work.
<GenTooMan> still I'm sure he himself isn't dirty.
<xzcvczx> it is rather funny looking at stuff and finding people you know (of) randomly being called out
ericonr has quit [Ping timeout: 240 seconds]
ericonr has joined #scopehal
<GenTooMan> no person is an island as they say.
<GenTooMan> you mean funny as in "hah I heard that pseudonym before!" not as in odd I presume also.
<GenTooMan> erstwhile which scope are you attempting to get to work?
<xzcvczx> GenTooMan: indeed
juli9610 has quit [Quit: Nettalk6 -]
<azonenberg> GenTooMan: actually it's interesting, when nitric is really strong it behaves more like an oxidizer than an acid
<azonenberg> The oxidizing nature is what's used in decapping
<azonenberg> if it gets diluted it starts acting acid-like and corroding metal
<azonenberg> if really strong it will oxidize the organics in the chip package but passivate the metal
<sorear> rfna territory?
<azonenberg> Yeah
<azonenberg> RFNA/WFNA are predominantly oxidizers, 70% is right on the line where the acidic nature begins to dominate the oxidizing nature in this type of situation
<_whitenotifier-3> [scopehal-apps] fsedano commented on issue #325: GPU hang on iris Plus driver -
<d1b2> <fsedano> @azonenberg GPU crash on iris happened again - We need to reopen it, something else is going on
<_whitenotifier-3> [scopehal] azonenberg opened issue #464: CUDA / cuFFT support -
<_whitenotifier-3> [scopehal] azonenberg labeled issue #464: CUDA / cuFFT support -
<_whitenotifier-3> [scopehal-apps] azonenberg reopened issue #325: GPU hang on iris Plus driver -
<GenTooMan> azonenberg, they used to use Aniline and RFNA for fuel in Nike Rockets as it was a hypergolic mixture.
<sorear> is there anyone here who _hasn't_ read the relevant chapter of _Ignition_ because there may not be a point in discussing things that are already common knowledge
<GenTooMan> sorear, hmm probably not in this area of IRC
<azonenberg> fsedano: lovely
<azonenberg> Did you try my code from last night btw? is it faster at least?