#scopehal on 2021-05-07 — irc logs at freenode.irclog.whitequark.org

2020-12-26 14:28 azonenberg changed the topic of #scopehal to: libscopehal, libscopeprotocols, and glscopeclient development and testing | https://github.com/azonenberg/scopehal-apps | Logs: https://freenode.irclog.whitequark.org/scopehal

00:07 Degi_ has joined #scopehal

00:08 Degi has quit [Ping timeout: 240 seconds]

00:08 Degi_ is now known as Degi

02:38 * GenTooMan sighs a bit frustrated.

02:40 <GenTooMan> hmm the big issue / difference is the reliance on the waveform descriptors in the code. I've written down the execution sequence in the acquisition code so as I can try to make something similar.

03:03 <azonenberg> Does the other scope not have a wavedesc? or is it a different format? or what

03:27 <GenTooMan> wave descriptors aren't supported, so it's not a matter of format.

03:29 <azonenberg> So do you have to manually query all of the settings then?

03:29 <azonenberg> there's no header along with the data?

03:32 <GenTooMan> the example they give (returned data): sent "C1:WF? DAT2", response "C1:WF DAT2,#9000000070<binary data>\n\n", which indicates 70 bytes of data sent

03:34 <GenTooMan> fortunately you just need to give 1 command to request wavedata (good), but it's not all good: you have to read voltage per division, offset, time division, and sample rate separately (each a single command)

03:35 <azonenberg> you should be able to cache at least

03:42 <GenTooMan> so yes cache is probably OK, also those commands can just be banged out and read later. So it's not the same but workable.

03:44 <azonenberg> yeah so AcquireData() will need some major retooling

04:12 Famine- has joined #scopehal

04:15 Famine has quit [Ping timeout: 260 seconds]

06:17 <d1b2> <mubes> That request response looks very similar to the request response on the sds2k+...so that's something.

06:21 <azonenberg> I mean, #9000000070 is just the standard SCPI block header

06:21 <azonenberg> you'll see that anywhere
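
A minimal sketch of parsing the block header discussed above, assuming the standard IEEE 488.2 definite-length form: `#`, one digit giving the number of length digits, then the length itself. `ParseBlockHeader` is a hypothetical helper for illustration, not a libscopehal function.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of libscopehal): parses a definite-length
// block header such as "#9000000070" at the start of buf.
// On success, payloadLength is the number of data bytes that follow and
// headerLength is the number of bytes the header itself occupies.
bool ParseBlockHeader(const std::string& buf, size_t& payloadLength, size_t& headerLength)
{
	// Need at least '#' and the digit-count character
	if(buf.size() < 2 || buf[0] != '#')
		return false;

	// Digit count must be 1-9 for a definite-length block
	if(buf[1] < '1' || buf[1] > '9')
		return false;
	size_t ndigits = static_cast<size_t>(buf[1] - '0');

	if(buf.size() < 2 + ndigits)
		return false;

	// Accumulate the decimal length field
	size_t len = 0;
	for(size_t i=0; i<ndigits; i++)
	{
		char c = buf[2 + i];
		if(c < '0' || c > '9')
			return false;
		len = len*10 + static_cast<size_t>(c - '0');
	}

	payloadLength = len;
	headerLength = 2 + ndigits;
	return true;
}
```

For the response quoted earlier, the caller would first strip the "C1:WF DAT2," prefix, then pass the remainder to this function and read `payloadLength` bytes starting at `headerLength`.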

07:03 <_whitenotifier-3> [scopehal-apps] azonenberg pushed 3 commits to master [+0/-0/±6] https://git.io/J3Dqr

07:03 <_whitenotifier-3> [scopehal-apps] azonenberg 486d8cc - OscilloscopeWindow: memory map files during loading rather than wasting time with redundant copies

07:03 <_whitenotifier-3> [scopehal-apps] azonenberg f3920ed - Identify waveforms that are dense packed even if stored as sparse

07:03 <_whitenotifier-3> [scopehal-apps] azonenberg 571d3d3 - Dense packed waveforms are now saved in a new format that doesn't waste space on offset or duration. Fixes #92. Fixes #312.

07:03 <_whitenotifier-3> [scopehal-apps] azonenberg closed issue #312: Save/restore dense pack flag in waveform metadata - https://git.io/JOeBG

07:03 <_whitenotifier-3> [scopehal-apps] azonenberg closed issue #92: Add support for "dense" waveform storage format (no explicit offset/len) to reduce size of analog captures - https://git.io/JflHo

07:03 <azonenberg> Well that's nice progress

07:03 <azonenberg> ~5x reduction in disk usage and ~2x reduction in file load/save time for non-sparse waveform captures

07:07 Tost has joined #scopehal

08:31 <d1b2> <theorbtwo> Nice.

08:37 <azonenberg> And i'm now seeing an 82% speedup in waveform rendering with some new (not yet pushed) improvements

08:38 <d1b2> <theorbtwo> I take it "dense" means that the spacing between samples, format of samples, etc, does not change during the length of the capture?

08:39 <_whitenotifier-3> [scopehal-apps] azonenberg pushed 3 commits to master [+2/-0/±13] https://git.io/J3Dui

08:39 <_whitenotifier-3> [scopehal-apps] azonenberg 8918279 - Initial implementation of dense pack optimizations for waveform rendering. About 26% speedup for a dense 128M point waveform. Fixes #328 but probably still more room to tweak.

08:39 <_whitenotifier-3> [scopehal-apps] azonenberg 4a83a23 - Dense packed waveform shader now computes indexes locally

08:39 <d1b2> <theorbtwo> Doesn't seem too restrictive.

08:39 <_whitenotifier-3> [scopehal-apps] azonenberg 4fc6932 - Switched local structure in compute shader to 1x32 instead of 2x16 for significant (44%) speedups due to better GPU occupancy

08:39 <_whitenotifier-3> [scopehal-apps] azonenberg closed issue #328: Add support for dense packed waveforms to rendering shaders - https://git.io/J3OMm

08:40 <azonenberg> More specifically it means that the sample offsets are all 0... N-1 timebase units and durations are all 1

08:40 <azonenberg> None of this breaks sparse waveform rendering, which is critical for things like measurements, math functions, CSV import, and protocol decodes

08:41 <azonenberg> But the common case of looking at a waveform right off a scope, or basic math where each sample maps 1:1 from input to output, is a lot faster now

08:41 <azonenberg> And uses less disk space

08:41 <azonenberg> I'm still at only 23.1% of theoretical performance on the texture cache and 21.7% on the SM according to NSight though. So i can probably squeeze a lot more performance out of the GPU still

08:42 <azonenberg> But over the course of tonight I turned the same data from 9.7 to 2.0 GB on disk

08:42 <azonenberg> and i can render a 128 million point waveform in 154 ms instead of 281

08:43 <azonenberg> With full intensity grading and all points on screen at once

08:43 <d1b2> <theorbtwo> ...and it is smart enough to notice when things didn't "have to" be regularly spaced, but "just happen that way"? Csv imports that have even spacing, protocols that work in constant-sized units?

08:45 <d1b2> <theorbtwo> Does it pack to the nearest byte of data length? Is there an equivalent for binary forms that is bit-packed?

08:45 <azonenberg> Length for all samples is in timebase units

08:45 <azonenberg> the Waveform format is vector<int64> offset, vector<int64> duration, vector<T> samples

08:46 <azonenberg> where T=float for analog, bool for digital, or arbitrary class type for protocol data

08:46 <azonenberg> There is a flag for "dense packed" in the waveform that I set for anything coming off a scope, loaded from files under some conditions, or processed from dense packed inputs using functions that don't resample
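
A rough sketch of the layout and the dense-pack condition described above (offsets 0...N-1, durations all 1). `SketchWaveform` and `IsDensePacked` are illustrative names, not the actual libscopehal classes, and the real code may detect dense packing differently.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified stand-in for the struct-of-arrays waveform described above.
// Member names are assumptions made for this sketch.
template<class T>
struct SketchWaveform
{
	std::vector<int64_t> offsets;	// start of each sample, in timebase units
	std::vector<int64_t> durations;	// length of each sample, in timebase units
	std::vector<T> samples;		// float for analog, bool for digital, etc.
	bool densePacked = false;	// optimization hint only
};

// Returns true if every sample i starts at offset i and lasts exactly one
// timebase unit, i.e. the waveform could be flagged as dense packed.
template<class T>
bool IsDensePacked(const SketchWaveform<T>& wfm)
{
	size_t n = wfm.samples.size();
	if(wfm.offsets.size() != n || wfm.durations.size() != n)
		return false;

	for(size_t i=0; i<n; i++)
	{
		if(wfm.offsets[i] != static_cast<int64_t>(i) || wfm.durations[i] != 1)
			return false;
	}
	return true;
}
```

A check like this is linear in the sample count, which is one reason to carry the result around as a flag instead of recomputing it in every filter.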

08:46 <d1b2> <theorbtwo> Yow. Either the time or space savings would be great. Both at once is awesome!

08:46 <azonenberg> as far as CSV imports, that gets tricky because of rounding errors

08:47 <azonenberg> right now i actually do some histogram magic to guess whether the sample rate was intended to be uniform

08:47 <azonenberg> so i allow a slight variance in sample rate and forcibly normalize it
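
One way the idea above could look, as a sketch only: build a histogram of the gaps between consecutive timestamps, take the most common gap as the intended sample period, and snap everything to a uniform grid if all gaps fall within a tolerance. This is not the actual glscopeclient import code; `NormalizeIfUniform` and its `tolerance` parameter are assumptions for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical helper: if the spacing between timestamps is nearly uniform,
// rewrite them onto an exactly uniform grid and return true.
// tolerance is the allowed fractional deviation from the guessed period.
bool NormalizeIfUniform(std::vector<int64_t>& timestamps, double tolerance)
{
	if(timestamps.size() < 3)
		return false;

	// Histogram of gaps between consecutive timestamps
	std::map<int64_t, size_t> hist;
	for(size_t i=1; i<timestamps.size(); i++)
		hist[timestamps[i] - timestamps[i-1]] ++;

	// The most common gap is our guess at the intended sample period
	int64_t period = 0;
	size_t best = 0;
	for(const auto& it : hist)
	{
		if(it.second > best)
		{
			best = it.second;
			period = it.first;
		}
	}
	if(period <= 0)
		return false;

	// Every observed gap must be close to the guessed period
	for(const auto& it : hist)
	{
		double err = std::fabs(static_cast<double>(it.first - period)) / static_cast<double>(period);
		if(err > tolerance)
			return false;
	}

	// Forcibly normalize onto a uniform grid starting at the first timestamp
	int64_t start = timestamps[0];
	for(size_t i=0; i<timestamps.size(); i++)
		timestamps[i] = start + static_cast<int64_t>(i) * period;
	return true;
}
```

After a successful normalization the offsets are exact multiples of the period, so the waveform can be rescaled and treated as dense packed.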

08:47 <d1b2> <theorbtwo> Aha, I thought T was more arbitrary than that.

08:48 <azonenberg> Well for protocol data it can be whatever

08:48 <azonenberg> but for analog waveforms it's floating point volts (or other Y axis unit like Hz for a FFT, etc)

08:48 <azonenberg> for, say, an ethernet frame a T is an EthernetSample

08:48 <azonenberg> which contains a type field and possibly a byte of data

08:48 <azonenberg> or several i think

08:49 <azonenberg> iirc source/dest mac, ethertype, etc are all single EthernetSample objects

08:49 <azonenberg> generally speaking there's a 1:1 mapping from scopehal sample objects to the little boxes you see in the protocol decode view

08:49 <azonenberg> with offset=left edge and offset+duration = right edge

08:55 <d1b2> <theorbtwo> I thought it was more like waveforms had a unit field, a multiplier field, and the data would be the narrowest type that fits the number of bits we have from the input device.

08:55 <azonenberg> Ok let me back up a bit

08:55 <azonenberg> There's a bit of other metadata involved

08:55 <azonenberg> first, there's a 128-bit timestamp

08:56 <azonenberg> a 64-bit time_t (seconds since jan 1 1970, midnight UTC) plus 64-bit femtoseconds since the last second

08:56 <azonenberg> There's a boolean flag indicating the waveform is dense packed. This is an optimization only, you still have to have timestamp and duration values present since not every filter includes these optimizations

08:57 <azonenberg> But for example if you're a filter whose output is currently 16M samples long and dense packed, and your next output is also 16M samples long and dense packed, you never have to touch them during the update cycle

08:57 <azonenberg> Then there's a time scale, which is femtoseconds per time tick

08:58 <d1b2> <theorbtwo> Ah, so memory usage doesn't go down. Shame, but not a huge one.

08:58 <azonenberg> so offset=5 means the sample begins at 5*timescale fs

08:58 <azonenberg> And then you have the vectors of offsets, durations, and sample data
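
A small worked sketch of the timing arithmetic above, assuming offsets are measured from the waveform timestamp and are non-negative. `SketchTimestamp`, `SampleStartFs`, and `AbsoluteSampleTime` are hypothetical names, not libscopehal APIs.

```cpp
#include <cstdint>

static const int64_t FS_PER_SECOND = 1000000000000000LL;	// 1e15 fs in one second

// Stand-in for the 128-bit timestamp: whole seconds since the Unix epoch
// plus femtoseconds since the start of that second.
struct SketchTimestamp
{
	int64_t seconds;
	int64_t femtoseconds;
};

// Start of a sample relative to the waveform timestamp, in femtoseconds.
// offset is in timebase units, timescaleFs is femtoseconds per timebase unit.
int64_t SampleStartFs(int64_t offset, int64_t timescaleFs)
{
	return offset * timescaleFs;
}

// Absolute time of a sample, carrying whole seconds out of the fs field.
// Assumes a non-negative offset and a relative time small enough to fit in
// 64 bits (about 9000 seconds of femtoseconds).
SketchTimestamp AbsoluteSampleTime(SketchTimestamp start, int64_t offset, int64_t timescaleFs)
{
	int64_t fs = start.femtoseconds + SampleStartFs(offset, timescaleFs);

	SketchTimestamp ret;
	ret.seconds = start.seconds + fs / FS_PER_SECOND;
	ret.femtoseconds = fs % FS_PER_SECOND;
	return ret;
}
```

With a timescale of 1000 fs per tick, offset=5 gives a sample starting 5000 fs after the waveform timestamp.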

08:59 <azonenberg> On disk, the metadata is serialized as a YAML file called scope_%d_metadata.yml

08:59 <azonenberg> then the samples and, if not dense packed, offset and duration are serialized to scope_%d_waveforms/waveform_%d/channel_%d.bin

08:59 <azonenberg> or channel_%d_stream%d.bin if you have a multi-stream channel like an I/Q channel

09:00 <azonenberg> Right now for sparse waveforms offset, duration, and sample data are interleaved

09:00 <azonenberg> this dates back to earlier versions of scopehal which used an array-of-structs representation in ram rather than struct-of-arrays

09:00 <azonenberg> the latter being more cache/SIMD friendly

09:01 <azonenberg> So my plan is to transition the on-disk sparse format to be all offsets, all durations, and all samples in consecutive blocks

09:01 <azonenberg> this will let me mmap the file and memcpy directly into a Waveform object
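
To illustrate why the interleaved layout needs a per-sample loop while consecutive blocks would not, here is a sketch of unpacking interleaved records into struct-of-arrays form. The record layout used (int64 offset, int64 duration, float sample, tightly packed, native byte order) is an assumption for this example and may not match the real sparsev1 file format; `SketchAnalogWaveform` and `Deinterleave` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Struct-of-arrays destination, mirroring the in-memory layout described above
struct SketchAnalogWaveform
{
	std::vector<int64_t> offsets;
	std::vector<int64_t> durations;
	std::vector<float> samples;
};

// Unpacks interleaved (offset, duration, sample) records into separate arrays.
// Returns false if the buffer is not a whole number of records.
bool Deinterleave(const std::vector<uint8_t>& raw, SketchAnalogWaveform& out)
{
	// Assumed record layout: 8 + 8 + 4 bytes with no padding
	const size_t recordSize = 2*sizeof(int64_t) + sizeof(float);
	if(raw.size() % recordSize != 0)
		return false;

	size_t n = raw.size() / recordSize;
	out.offsets.resize(n);
	out.durations.resize(n);
	out.samples.resize(n);

	// One small copy per field per sample: this is the cost that storing
	// each array as a single consecutive block would avoid
	for(size_t i=0; i<n; i++)
	{
		const uint8_t* p = raw.data() + i*recordSize;
		std::memcpy(&out.offsets[i], p, sizeof(int64_t));
		std::memcpy(&out.durations[i], p + sizeof(int64_t), sizeof(int64_t));
		std::memcpy(&out.samples[i], p + 2*sizeof(int64_t), sizeof(float));
	}
	return true;
}
```

With all offsets, then all durations, then all samples stored back to back, the same load reduces to three bulk copies out of the mapped file.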

09:01 <azonenberg> or hypothetically even have a disk-backed Waveform that can be paged out

09:01 <azonenberg> but that's not currently supported

09:02 <azonenberg> That format will be called sparsev2, to distinguish it from the current sparsev1 format which will continue to be readable by future glscopeclient versions

09:02 <azonenberg> but once sparsev2 is implemented for writing, writing of sparsev1 will be removed

09:02 <d1b2> <theorbtwo> Sounds lovely. Maybe make it so you can have pointers into the mmap area so you can let the os page?

09:02 <azonenberg> Generally speaking i maintain strict forward compatibility, a file generated years ago should still open and process fine in the latest version

09:02 <azonenberg> but there's no expectation of the reverse being true

09:03 <azonenberg> and my main use case for paging would be handling deep history

09:03 <d1b2> <theorbtwo> Oh. That would need the os to handle gpu fetching info from disc via mmap, which seems...fragile.

09:03 <azonenberg> And no

09:04 <azonenberg> the Waveform object is explicitly copied to GPU memory at render time

09:04 <d1b2> <theorbtwo> Seems a good policy.

09:04 <azonenberg> Longer term I plan to support simultaneous CPU and GPU side copies of the data

09:04 <azonenberg> for when i start doing more OpenCL stuff

09:04 <azonenberg> right now when doing OpenCL waveform processing data is copied to GPU at the start of each filter operation then back off

09:04 <azonenberg> which can lead to inefficiency and more pcie traffic than necessary

10:43 <_whitenotifier-3> [scopehal] azonenberg opened issue #462: Refactor Convert*BitSamples() out into base Oscilloscope class - https://git.io/J3DFz

10:43 <_whitenotifier-3> [scopehal] azonenberg labeled issue #462: Refactor Convert*BitSamples() out into base Oscilloscope class - https://git.io/J3DFz

10:43 <_whitenotifier-3> [scopehal] azonenberg opened issue #463: Pico: Optimize waveform conversion - https://git.io/J3DFK

10:43 <_whitenotifier-3> [scopehal] azonenberg labeled issue #463: Pico: Optimize waveform conversion - https://git.io/J3DFK

10:58 <noopwafel> the pico bridge isn't a submodule of the repo right?

10:59 <azonenberg> Correct, the bridge is fully separate at the moment

10:59 <azonenberg> the intent was to avoid any pico sdk dependencies on libscopehal and glscopeclient at compile time

10:59 <azonenberg> you can build libscopehal on any machine you want and it knows how to talk to the bridge

12:37 <noopwafel> Successfully opened instrument Driver version: PS3000A Linux Driver, 2.1.40.2131

12:37 <noopwafel> so definitely no problems with linking a bunch of different pico libs in

14:14 <noopwafel> http://noopwafel.net/picopicogl.png <- so it's in the same kind of state as my previous bridge was now, which is to say it's totally broken but at least you can see data

14:14 <noopwafel> and I didn't even ifdef the code, I just added a g_pico_type, so I guess that is a perfectly fine approach

14:18 <noopwafel> how is zooming meant to work btw? I'm on laptop here which is maybe not best, but if I zoom in/out by using mousewheel on the header, then I quickly end up at an impossible point (like, +10000s)

14:18 <noopwafel> I had this problem previously too, but I don't think I asked here

16:09 juli9610 has joined #scopehal

16:26 Tost has quit [Ping timeout: 240 seconds]

16:35 bvernoux has joined #scopehal

16:39 <azonenberg> noopwafel: Clamping offset to "approximately around the waveform" is probably not a bad idea when zooming, file a ticket?

16:40 <azonenberg> It's supposed to center wherever your cursor is

16:40 <azonenberg> but in general the touchscreen/touchpad friendliness has room for improvement

16:50 <_whitenotifier-3> [scopehal-pico-bridge] noopwafel forked the repository - https://git.io/JJ0I1

16:52 <noopwafel> https://github.com/noopwafel/scopehal-pico-bridge/commit/058e056fcdf9e4d is the PoC diff on the bridge side, I only made minimal changes on the scopehal side

16:53 <azonenberg> Are 3001/6001 supposed to mean anything? would probably be cleaner to use enums

16:53 <noopwafel> yes it should be an enum :)

16:54 <noopwafel> there's a bunch of things to think about in there

16:54 <azonenberg> also the general project coding style convention is curly braces on their own line

16:55 <noopwafel> yeah, to be clear: not a PR yet :D

16:55 <azonenberg> Yeah I know, just giving you feedback now before you write too much code that you have to rework :)

16:55 <azonenberg> So what's "totally broken"?

16:56 <azonenberg> And how's performance?

16:56 <azonenberg> BTW i've seen a failure mode where sometimes the 6000 series falls back to much slower data transfer rates, acting like it's on usb2

16:57 <azonenberg> i dont know what causes it and if it's host or device side

16:57 <azonenberg> but the workaround is to unplug and replug the usb cable

16:57 <azonenberg> then it starts using usb3 again

17:00 <noopwafel> so it quickly gets unusably slow for me

17:00 <noopwafel> but I think might be unrelated

17:02 <noopwafel> it's just running on my laptop with integrated graphics and that is Not Good Enough

17:02 <noopwafel> I was going to say it kept dying, but actually it is no longer doing so, I think fixed

17:03 <noopwafel> right now, looking at trigger level

17:04 <noopwafel> my old bridge hack set the trigger to 'triggerlevel / voltage * 0x7f00'

17:05 <noopwafel> and I thought your ps6000a code did the same, but then it is not working. hm.

17:31 <noopwafel> will force-push updates to https://github.com/noopwafel/scopehal-pico-bridge/commit/HEAD for now

17:32 <azonenberg> are you changing offset?

17:32 <noopwafel> current status, terminate called after throwing an instance of 'std::bad_alloc', but I grab dinner before I grab gdb :)

17:32 <azonenberg> trigger level is in raw adc codes

17:32 <azonenberg> so if you add an offset you have to correct in the trigger level
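
A sketch of the correction described above, combining noopwafel's 'triggerlevel / voltage * 0x7f00' scaling with the channel offset. The sign convention (offset is added to the input before the ADC) and the meaning of `rangeVolts` (full-scale voltage of the selected range) are assumptions; `TriggerVoltsToAdcCode` is a hypothetical helper, not code from the bridge.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper: converts a trigger level in volts to a raw ADC code.
// Assumes the hardware adds offsetVolts to the signal before digitizing, so
// the same offset has to be added to the trigger level before scaling.
int16_t TriggerVoltsToAdcCode(double triggerVolts, double offsetVolts, double rangeVolts)
{
	const double fullScale = 0x7f00;	// ADC code at full-scale input, per the chat above

	double code = (triggerVolts + offsetVolts) / rangeVolts * fullScale;

	// Clamp so an out-of-range level cannot wrap around
	if(code > fullScale)
		code = fullScale;
	if(code < -fullScale)
		code = -fullScale;

	return static_cast<int16_t>(std::lround(code));
}
```

If the driver's offset turns out to have the opposite sign, the addition becomes a subtraction; the point is only that the trigger comparison happens after the offset is applied.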

17:32 <noopwafel> trigger seems to work now

17:32 <noopwafel> I .. don't know what was going wrong.

17:35 <noopwafel> design-wise: I am tempted to drop the g_range_3000a because it matches the API enum values for 6000a

17:36 <azonenberg> if they're the same, go for it

17:36 <noopwafel> just casting and assume they don't change the API is kind of ugly, but I don't think it makes sense to keep them separate for all the scopes

17:36 <azonenberg> as far as slow graphics did you pull my latest rendering code from last night?

17:36 <azonenberg> There are *major* speedups in there

17:36 <azonenberg> like almost double

17:43 <noopwafel> I did. It does seem to be GPU-limited, at least it's spending most of the time in a DRM ioctl(). I'll probably just move to a machine with an actual GPU later.

17:47 <azonenberg> Improving performance is definitely still a to-do item, i've spent (as you can see) a fair bit of time on rendering performance already and there's still room to improve

17:48 <azonenberg> If you zoom in on the time scale rather than looking at the whole waveform at once does it get faster?

17:48 <azonenberg> the shaders get a lot faster if they have to process fewer samples but everything else still works on the full sample dataset

17:48 <azonenberg> So if it gets faster when you zoom you know it's the rendering shader at fault

17:49 <noopwafel> yes, it is the rendering shader :)

17:50 <noopwafel> or, it seems so! I'll investigate after food

17:54 <_whitenotifier-3> [scopehal-apps] azonenberg labeled issue #338: Add speed optimized rendering shader with no intensity grading for low end GPUs - https://git.io/J3SaJ

17:54 <_whitenotifier-3> [scopehal-apps] azonenberg opened issue #338: Add speed optimized rendering shader with no intensity grading for low end GPUs - https://git.io/J3SaJ

18:32 elms has quit [Ping timeout: 260 seconds]

18:32 esden has quit [Ping timeout: 260 seconds]

18:34 elms has joined #scopehal

18:36 esden has joined #scopehal

20:10 * xzcvczx didn't realise azonenberg was a dirty decapper

20:26 <GenTooMan> I'm sure he bathes xzcvczx so he's a CLEAN decapper

20:31 <xzcvczx> with the nitric acid he uses to bathe it sure as hell aint gonna be clean

20:48 <GenTooMan> nitric is a weird acid. It has to be diluted from its concentrated form to .. actually work.

20:49 <GenTooMan> still I'm sure he himself isn't dirty.

21:00 <xzcvczx> it is rather funny looking at stuff and finding people you know (of) randomly being called out

21:00 ericonr has quit [Ping timeout: 240 seconds]

21:01 ericonr has joined #scopehal

21:04 <GenTooMan> no person is an island as they say.

21:11 <GenTooMan> you mean funny as in "hah I heard that pseudonym before!" not as in odd I presume also.

21:15 <GenTooMan> erstwhile which scope are you attempting to get to work?

21:40 <xzcvczx> GenTooMan: indeed

22:41 juli9610 has quit [Quit: Nettalk6 - www.ntalk.de]

23:04 <azonenberg> GenTooMan: actually it's interesting, when nitric is really strong it behaves more like an oxidizer than an acid

23:04 <azonenberg> The oxidizing nature is what's used in decapping

23:05 <azonenberg> if it gets diluted it starts acting acid-like and corroding metal

23:05 <azonenberg> if really strong it will oxidize the organics in the chip package but passivate the metal

23:15 <sorear> rfna territory?

23:21 <azonenberg> Yeah

23:22 <azonenberg> RFNA/WFNA are predominantly oxidizers, 70% is right on the line where the acidic nature begins to dominate the oxidizing nature in this type of situation

23:35 <_whitenotifier-3> [scopehal-apps] fsedano commented on issue #325: GPU hang on iris Plus driver - https://git.io/J39z1

23:36 <d1b2> <fsedano> @azonenberg GPU crash on iris happened again - We need to reopen it, something else is going on

23:37 <_whitenotifier-3> [scopehal] azonenberg opened issue #464: CUDA / cuFFT support - https://git.io/J39gk

23:37 <_whitenotifier-3> [scopehal] azonenberg labeled issue #464: CUDA / cuFFT support - https://git.io/J39gk

23:39 <_whitenotifier-3> [scopehal-apps] azonenberg reopened issue #325: GPU hang on iris Plus driver - https://git.io/J3k5U

23:45 <GenTooMan> azonenberg, they used to use Aniline and RFNA for fuel in Nike Rockets as it was a hypergolic mixture.

23:48 <sorear> is there anyone here who _hasn't_ read the relevant chapter of _Ignition_ because there may not be a point in discussing things that are already common knowledge

23:49 <GenTooMan> sorear, hmm probably not in this area of IRC

23:53 <azonenberg> fsedano: lovely

23:53 <azonenberg> Did you try my code from last night btw? is it faster at least?