#scopehal on 2020-12-19 — irc logs at freenode.irclog.whitequark.org

2020-12-13 22:53 azonenberg changed the topic of #scopehal to: libscopehal, libscopeprotocols, and glscopeclient development and testing | Online hackathon December 19th all day | https://github.com/azonenberg/scopehal-apps | Logs: https://freenode.irclog.whitequark.org/scopehal

01:04 Degi_ has joined #scopehal

01:07 Degi has quit [Ping timeout: 268 seconds]

01:07 Degi_ is now known as Degi

02:30 <azonenberg> P-525 silicone arrived. Mixed up a batch, now curing in the test mold

02:30 <azonenberg> It's significantly more viscous and actually degasses slightly easier as the bubbles are larger

02:31 <azonenberg> it injection molds fine under syringe pressure but sucking it up from the jar with a pipette is a bit slow. Might switch to a plastic spoon or something

02:57 <azonenberg> Got good results with 5 grams of total silicone... 2.5 grams each of base and catalyst. It filled the mold nicely with a comfortable amount left over

03:42 electronic_eel has quit [Ping timeout: 265 seconds]

03:42 electronic_eel has joined #scopehal

03:43 <d1b2> <theorbtwo> Oooh, just saw the scopehal/glscopeclient hackathon thing. When is it starting? It's already been the 19th here for almost 4 hours!

03:59 <azonenberg> Whenever you wanna start working on stuff :p

03:59 <azonenberg> Got a specific feature/ticket you wanted to work on?

04:00 <azonenberg> It's still the 18th here in PST but that won't stop me from coding on stuff

04:01 <azonenberg> theorbtwo: I can get you test waveforms of lots of stuff if you don't (yet) have a supported scope

04:01 <d1b2> <theorbtwo> Still thinking on it. I've not actually used it yet. I might have to start by struggling with build systems... or am I blind and the "actions" thing will give me artifacts? I don't see them.

04:02 <azonenberg> I would start by getting it to build locally. I think we have some build artifacts from the CI builds but if you're going to be doing any sort of development you need to be able to compile locally

04:02 <d1b2> <theorbtwo> ...I am blind, yes.

04:02 <azonenberg> Section 3.4 of the manual is a good starting point to get it building on your system

04:16 <d1b2> <theorbtwo> So far, fwiw, my thoughts are that it would be nice to have a bit on the manual for quick getting started with a few different scopes, preferably with both usb and ethernet. A packaged-up windows binary. A way of integrating / feeding wireshark or ... blast, open source logic analyzer that I forget the name of, for when you want to go extreme full stack. Heck, while I'm thinking of it, a way of taking your scope session and compiling it to a library, or fpga, or something like that ... OK, that last one might be too crazy.

04:16 <azonenberg> We already have a way of streaming to wireshark for all Ethernet protocol decodes

04:16 <azonenberg> USB is i believe the only other protocol that both glscopeclient and wireshark have decodes for

04:17 <azonenberg> there is an open ticket for adding the bridge

04:17 <d1b2> <theorbtwo> Ah, cool.

04:17 <azonenberg> The tl;dr is that it's just pcap export

04:17 <azonenberg> except you mkfifo a pipe and export to that

04:17 <azonenberg> then have wireshark stream from it

04:17 <d1b2> <theorbtwo> I figured it was something like that.

04:17 <azonenberg> so basically it's something the user does

04:18 <azonenberg> from scopehal's perspective it's pcap export
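The FIFO approach described above can be sketched in Python. This is a minimal illustration of the classic pcap file layout, not scopehal's exporter: the function names are made up for the example, and it assumes Ethernet frames (link type 1) with microsecond timestamps.

```python
import struct

PCAP_MAGIC = 0xA1B2C3D4      # classic pcap with microsecond timestamps
LINKTYPE_ETHERNET = 1        # assumption: the decode being exported is Ethernet


def pcap_global_header(snaplen=65535):
    """Build the 24-byte file header that must precede all packet records."""
    # magic, version major/minor, timezone offset, sigfigs, snaplen, link type
    return struct.pack("<IHHiIII", PCAP_MAGIC, 2, 4, 0, 0, snaplen, LINKTYPE_ETHERNET)


def pcap_record(ts_sec, ts_usec, frame):
    """Build one packet record: a 16-byte header followed by the frame bytes."""
    # timestamp seconds, timestamp microseconds, captured length, original length
    return struct.pack("<IIII", ts_sec, ts_usec, len(frame), len(frame)) + frame


def export_frames(stream, frames):
    """Write (ts_sec, ts_usec, frame_bytes) tuples to a binary stream as pcap."""
    stream.write(pcap_global_header())
    for ts_sec, ts_usec, frame in frames:
        stream.write(pcap_record(ts_sec, ts_usec, frame))
        stream.flush()   # push each packet out promptly so a live reader sees it
```

On Linux, one would create the pipe with `os.mkfifo(path)`, start `wireshark -k -i path`, then open the path for binary writing and pass the file object to `export_frames`. The open call blocks until Wireshark attaches as the reader.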

04:19 <d1b2> <theorbtwo> Is mostly a matter of documenting that, and showing it off a bit, maybe.

04:20 <azonenberg> Makes sense. But I think before we do that, it's more important to have complete documentation of all the decode blocks used in the chain

04:20 <d1b2> <theorbtwo> Fair.

04:20 <azonenberg> i believe right now 9 of the 90 filter classes actually have complete documentation

04:20 <azonenberg> I would *love* if you wanted to work on that. I can provide you with test waveforms for using most of them on

04:21 <azonenberg> so you can fool around, get an idea for how they work

04:21 <azonenberg> there should be enough templates in the documentation that you can easily write docs for additional filters

04:21 <azonenberg> honestly that is one of my highest priorities for an initial release

04:21 <azonenberg> Have all major features documented

04:22 <azonenberg> the core of the UI is decently well documented now, the filters are not

04:22 <d1b2> <theorbtwo> Hmm. Tex. I keep thinking I ought to learn some of that. No time like the present, I suppose.

04:23 <azonenberg> The syntax used in the filter documentation is pretty simple. you can mostly cut and paste from other filters to figure out the stuff we're using

04:27 <d1b2> <theorbtwo> I figured, though I suspect I will want to add some other bits that aren't templated yet, and maybe another feature or two.

04:28 <azonenberg> anyway first step is to get it building and running locally

04:28 <azonenberg> you can use the demo driver to start

04:28 <d1b2> <theorbtwo> It would be great to replace the hand-curated pngs in the docs with a session file + a wav file and build the images.

04:28 <azonenberg> So the challenge there is composition

04:28 <azonenberg> cropping, do you show just the input channel or several for context

04:29 <d1b2> <theorbtwo> Yep, I'm working on it while I read the manual and chat here.

04:29 <azonenberg> does the timeline or y axis need to be visible to understand what the filter does, or should it be removed to focus attention on the decode

04:29 <azonenberg> I think hand cropped screenshots is the better choice for now

04:29 <azonenberg> If we do major refactoring of a single decode that changes the appearance drastically, it's not hard to grab a new screenshot

04:29 <d1b2> <theorbtwo> If all of that isn't specifiable in a session file, it probably should be anyway, but yeah, not a good thing to do right now.

04:30 <azonenberg> and massive changes to the UI as a whole aren't super likely

04:30 <azonenberg> also you wouldnt need a wav

04:30 <azonenberg> a saved session can include waveform data

04:30 <d1b2> <theorbtwo> Oh, good.

04:30 <azonenberg> the foo.scopesession is just yaml metadata but there's an accompanying foo_data directory with waveforms and another yaml metadata file for each one

04:31 <azonenberg> you can choose to export a light session with only the UI/instrument config, or include the data

04:31 <azonenberg> depending on what you want to do with it later

04:31 <azonenberg> you can also decide to only import data, ui settings, or both

04:31 <azonenberg> a scopesession does include info about filter config, window size, position and zoom of each waveform group, etc

04:32 <azonenberg> But it doesn't include the kind of crop info you'd need for autogenerating screenshots

04:32 <azonenberg> as far as what portion of the app window is of interest for the documentation

04:32 <d1b2> <theorbtwo> I wasn't thinking of crop so much as hiding UI elements.

04:32 <azonenberg> Oh. Not as of now. Generally these are critical things that need to be there to interact with the scope

04:33 <azonenberg> like, if you remove the Y axis you have no way to change gain/offset

04:33 <azonenberg> as that is simultaneously the display and the control

04:33 <d1b2> <theorbtwo> Yeah, it's the sort of function that would be useful in very limited circumstances ... like generating screenshots for documentation.

04:33 <d1b2> <theorbtwo> ...but also full-stack tests of your UI.

04:34 <azonenberg> I mean veeeery long term having scripted firefox-style pixel checking tests to find rendering regressions would be great

04:34 <azonenberg> We're a long ways from being stable enough that we *need* such techniques to find bugs :p

04:34 <d1b2> <theorbtwo> Not great tests, because validating them is a matter of "is the output of this pixel-identical to what it was before", but it's better than nothing.

04:34 <azonenberg> It's a regression testing tool, not more

04:34 <azonenberg> you define X as "good" then change "good" as the UI evolves

04:36 <d1b2> <theorbtwo> I love "...to prevent causality violations", BTW. It wouldn't do at all to accidentally destroy the universe with your oscilloscope.

04:36 <azonenberg> lol

04:39 <d1b2> <theorbtwo> Is there no "arbitrary arithmetic" filter? Are the various ethernet filters similar enough to each-other that they should have their own sub-tree of documentation, with a long section on what they have in common followed by a small section each on what they have different?

04:39 <azonenberg> There is no "arbitrary arithmetic" filter, no. Right now i believe we have "subtract one waveform from another", "add scalar to a waveform", and "multiply one waveform by another"

04:39 <azonenberg> an arbitrary math filter would be nice to have; the trick would be making it fast. We might want to consider some sort of JIT

04:40 <azonenberg> because the existing filters are all hand tuned AVX

04:40 <d1b2> <theorbtwo> Yow.

04:40 <azonenberg> i mean there's generic C++ versions too

04:40 <azonenberg> but the ones that are used most of the time are heavy AVX2 or AVX512 intrinsics

04:41 <d1b2> <theorbtwo> I'm surprised there's not computation-on-GPU on the mix as well.

04:41 <azonenberg> The AVX512 version of the FIR filter block for example is unrolled to do 64 samples per iteration

04:41 <azonenberg> 16 way SIMD * 4-way unrolling

04:41 <azonenberg> And is about eight times faster than the generic C++ version with -O3

04:42 <azonenberg> Pushing compute to GPU is on the long term roadmap. It always has been

04:42 <azonenberg> There's just challenges around logistics especially marshaling data to and from the gpu and figuring out when it makes sense to do so

04:42 <azonenberg> there are latency costs involved with jumping to the gpu

04:42 <azonenberg> ideally i'd like to do some profiling at startup or even silently in the background as the app runs

04:43 <azonenberg> and figure out if the throughput gains from going from AVX to GPU on a given filter exceed the latency costs to give a shorter end to end run time

04:43 <azonenberg> The tradeoff will likely vary with each specific system, from one filter to another, and also depending on the number of samples in the waveform

04:44 <d1b2> <theorbtwo> Hm. Other than the one you just mentioned, the other thing would be average of two signals. I've been thinking about gvif lately, which is a bidirectional signal over a single shielded twisted pair. One direction is common, the other direction is differential.

04:44 <azonenberg> So you mean extracting the common mode from a differential signal, essentially?

04:44 <d1b2> <theorbtwo> Yeah.

04:45 <azonenberg> Yeah that sounds good. File a ticket and we'll figure out what to call it and how generic to make it

04:45 <azonenberg> for example should it support arbitrarily many inputs or only two

04:45 <d1b2> <theorbtwo> Potentially also useful for things like "do I need to put shielding on this twisted pair or not".

04:45 <azonenberg> Yes i agree

04:45 <azonenberg> It's good to have

04:46 <azonenberg> I've just never needed it yet
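The "average of two signals" idea can be sketched as below. This is only an illustration on plain Python lists, assuming all channels share the same sample times. The function names are hypothetical and not part of libscopehal, and accepting any number of inputs is just one answer to the "how generic" question raised above.

```python
def common_mode(*channels):
    """Per-sample average of any number of equally sampled channels."""
    if not channels:
        raise ValueError("need at least one channel")
    length = len(channels[0])
    if any(len(ch) != length for ch in channels):
        raise ValueError("channels must have the same number of samples")
    # zip(*channels) yields one tuple of simultaneous samples per time step
    return [sum(samples) / len(channels) for samples in zip(*channels)]


def differential(p, n):
    """Per-sample difference of the two legs of a differential pair."""
    return [a - b for a, b in zip(p, n)]
```

For a pair with legs P and N, `common_mode(p, n)` gives (P + N) / 2 and `differential(p, n)` gives P - N, so both directions of a link like the one described could be separated from the same two probes.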

04:46 <azonenberg> Anyway, there are a few other issues around pushing compute to gpu

04:46 <azonenberg> for example. right now all of the gpu compute is rendering related and is using opengl compute shaders

04:46 <d1b2> <theorbtwo> Yeah, that was more a random thought than an actual feature request.

04:46 <azonenberg> Which make it super easy to exchange data with opengl rendering

04:47 <azonenberg> But on the flip side, it requires you have an active opengl context

04:47 <azonenberg> So the question is, what happens if we want to use libscopehal headless? On a system that may or may not have a GPU or an X server?

04:47 <azonenberg> Do we silently fall back to software compute? refuse to run at all?

04:47 <azonenberg> create a GL window in the background and use it?

04:48 <azonenberg> none of these are impossible or even ultra difficult problems, they've just never been top of the priority list

04:48 <d1b2> <theorbtwo> Yeah, it becomes yet another arch to write a different impl of the same algo in to optimize it.

04:48 <azonenberg> oh also, some filters just don't parallelize well. Like clock recovery

04:48 <azonenberg> so that adds another wrinkle to the choice of whether to GPU or not to GPU

04:48 <azonenberg> ideally we'd know what the latency and bandwidth of pushing data between the two is

04:49 <d1b2> <theorbtwo> Especially since clock recovery will often be an early stage that you are going to be doing more on top of.

04:49 <azonenberg> and perhaps dynamically decide which version of a filter to use based on where the input data is currently located and what filters are going to use our output

04:49 <azonenberg> in order to minimize the number of copies

04:49 <azonenberg> And well often preprocessing like de-embeds, subtracting halves of a diffpair, etc comes first

04:49 <azonenberg> Those are very data heavy operations that would benefit from going to GPU. FFT is actually one of the first things that going to GPU would likely benefit from

04:49 <Bird|otherbox> theorbtwo: learning LaTeX is definitely worthwhile, if nothing else for the quality of printed output you can produce with it -- it's not a substitute for a DTP tool, but it's still bloody awesome nonetheless

04:50 <azonenberg> Which is used by channel emulation and de-embedding

04:50 <azonenberg> anyway, so ideally i would like to be able to look at a complex filter graph

04:50 <azonenberg> maybe run one or two iterations of it with different configs

04:50 <azonenberg> and dynamically determine which to run on CPU and which on GPU

04:50 <azonenberg> Considering that some blocks may only exist for one or the other

04:51 <azonenberg> e.g. right now, protocol decodes that output complex samples ("ethernet frame segment") are rendered in Cairo in software

04:51 <azonenberg> so you have less latency if that final decode happens on the CPU

04:51 <azonenberg> while anything that outputs an analog or digital waveform is rendered in shaders

04:51 <azonenberg> so it's most efficient if the last filter in the chain is on the GPU and we don't have to copy the data off

04:51 <azonenberg> What happens if we have a filter whose output needs to go to both? how do we maintain coherency?

04:52 <azonenberg> I think pushing compute to GPU is going to be a post v1.0 feature. It's going to be a massive game changer in terms of performance we can get, and it's going to be crucial for keeping up with the FPGA accelerated scopes I'm planning that will push out many Gbps of waveform

04:52 <azonenberg> But it's also going to be a LOT of work

04:52 <azonenberg> to do it right

04:55 <azonenberg> I think OpenCL is probably also going to be the way to go as far as gpu compute is concerned

05:08 _whitelogger has joined #scopehal

05:28 <d1b2> <theorbtwo> The win32 install instructions don't include installing git. Am tempted to suggest that the documentation for filters can be partly generated by the code. Finished writing the bottom part of "Clock Recovery (D-PHY HS Mode)".

05:29 <azonenberg> And yes we definitely could script part of it. But again, unlikely to be worth the time given that once the initial build-out is done, we won't be doing any more mass documentation

05:30 <azonenberg> the hope is that all new PRs with filters include a matching PR for the docs

05:30 <d1b2> <theorbtwo> It will keep the documentation automatically up to date.

05:30 <azonenberg> Up to a point, but there's still going to be things like detailed descriptions of filter theory of operation, what the inputs do, etc that has to be done by hand

05:31 <azonenberg> and realistically once a filter is created, what are the odds the inputs will change significantly?

05:31 <azonenberg> we might optimize or fix bugs in the decoding

05:31 <azonenberg> but the interface is likely to remain stable

05:31 <d1b2> <theorbtwo> Inputs, maybe not. Parameters seems more likely, or adding extra outputs.

05:34 <azonenberg> i would not expect to add outputs after the fact either. Parameters i could see

05:34 <azonenberg> but again if you add a new parameter you need to add a description as that's not going to be present in the source anyway

05:34 <azonenberg> is adding one line of text to the docs really so hard in that case?

05:34 <azonenberg> If anything i think what would make more sense is an automated checker that reports undocumented stuff

05:35 <d1b2> <theorbtwo> OK, I can see that.

05:48 <azonenberg> pepijndevos: ping

06:02 <azonenberg> https://www.antikernel.net/temp/IMG_20201218_215855.jpg ok so this is the first test cast made with the new silicone

06:02 <azonenberg> A few air bubbles, in large part due to me screwing up and pushing the syringe plunger down all the way and injecting some air at the very end of the run

06:02 <azonenberg> (that huge bubble is right next to the sprue)

06:03 <azonenberg> but easily solved with better technique and some practice. I have a procedure now that seems to work well for removing mold flash so that worked out nicely

06:03 <azonenberg> The A25 rubber is much better feeling than the A8 for this purpose. I think i'm going to stick with it

06:42 <azonenberg> theorbtwo: how's it going? I'm probably going to go to sleep since it's almost 23:00 Friday for me, then resume first thing in the morning

06:43 <azonenberg> Which would be mid afternoon for you probably

06:43 <azonenberg> pepijndevos: I have a preliminary implementation of write queueing in the Tek driver. Reads still are blocking

06:44 <azonenberg> I'd like to do some wider scale testing of it and see how things perform

06:44 <azonenberg> Would you be interested in experimenting with adding this write queueing API to the Rigol driver and seeing how it works?

06:44 <d1b2> <theorbtwo> I wrote some docs, but have failed to compile either under linux or windows, so they are untested even as far as compiling.

06:45 <azonenberg> Failed to compile the docs or everything?

06:46 <d1b2> <theorbtwo> I'm taking a break at present, and will probably go to bed myself as soon as I deliver this load in-game. See you tomorrow.

06:46 <azonenberg> What failed in particular? Did you follow the linux build procedure?

06:46 <azonenberg> ah ok

06:47 <d1b2> <theorbtwo> Linux problem isn't the build instructions, it's my damn system deciding it doesn't want to install texlive-fonts-extra for some reason.

06:47 <azonenberg> Oh

06:47 <d1b2> <theorbtwo> I hate build systems. And sysadmin. Don't mind me.

07:32 __dre has joined #scopehal

07:33 <azonenberg> o/ __dre

07:34 <__dre> Hello! This is @CyberpunkDre on twitter, setting up my IRC for tomorrow

07:35 <azonenberg> Great

07:35 <azonenberg> Just making sure everything is good before I go to bed

07:35 <azonenberg> Also FYI theorbtwo has been doing a bit of documentation work too, so please coordinate with them to make sure you don't duplicate effort

07:36 <__dre> Sounds good, will do! Still reading through the documentation which seems to be building daily?

07:37 <azonenberg> I try to update the version in /temp/ fairly frequently

07:37 <azonenberg> there isn't an official release yet, that's a dev draft

07:38 <__dre> Yeah I saw, honestly the scope of the protocols is quite large. I am also interested in looking at Tektronix drivers, I have a model that's not on your supported list

07:38 <__dre> TDS3054

07:40 <azonenberg> And yes there are a lot of protocols. It should be easy to avoid colliding

07:40 <azonenberg> just don't start alphabetically and pick the next undocumented one without making sure nobody else is working on it :p

07:41 <azonenberg> As far as Tek stuff, great. I have a MSO64 on loan for the holidays and have been doing a lot of work on that driver

07:41 <azonenberg> Unsure how much of the protocol is shared with older Tek models

07:44 <__dre> Nice, hopefully shares enough to make it easy but should be fun part to add that seems unclaimed

07:46 <__dre> I'll probably try focusing a bit more tomorrow when more people are on and I see what's being worked on though :P

07:47 <__dre> Need to go get more stuff ready here, but glad I caught you on :) nite!

07:47 <azonenberg> Yeah to my knowledge nobody else is working on Tek stuff. there is an open ticket for i believe DPO7000 series but nobody is currently working on that

07:48 <azonenberg> there is a guy in another channel who has one and can probably arrange VPN access for dev

07:48 <azonenberg> But definitely best to start with what you have in front of you

08:01 __dre has quit [Quit: Dre has quit the channel]

08:02 CyberDre has joined #scopehal

08:07 futarisIRCcloud has quit [Quit: Connection closed for inactivity]

10:33 __dre has joined #scopehal

10:33 CyberDre has quit [Ping timeout: 260 seconds]

14:56 <azonenberg> So i'm beginning the early stages of exploring OpenCL integration for filter computation

14:57 <_whitenotifier> [scopehal] azonenberg pushed 1 commit to master [+1/-0/±2] https://git.io/JLENh

14:57 <_whitenotifier> [scopehal] azonenberg 66971d5 - Initial CMake OpenCL detection. Not used for anything yet.

14:57 <_whitenotifier> [scopehal-apps] azonenberg pushed 1 commit to master [+0/-0/±2] https://git.io/JLEAf

14:57 <_whitenotifier> [scopehal-apps] azonenberg 59877f5 - Initial OpenCL CMake detection. Not used for anything yet.

15:05 <_whitenotifier> [scopehal] azonenberg pushed 1 commit to master [+0/-0/±2] https://git.io/JLExF

15:05 <_whitenotifier> [scopehal] azonenberg e83f65d - Set default paths if no OpenCL found

15:08 <_whitenotifier> [scopehal] azonenberg pushed 1 commit to master [+0/-0/±1] https://git.io/JLEpY

15:08 <_whitenotifier> [scopehal] azonenberg 29b52e9 - Set OpenCL version 1.2 since nvidia doesn't do 2.0

15:08 <_whitenotifier> [scopehal-apps] azonenberg pushed 1 commit to master [+0/-0/±2] https://git.io/JLEpG

15:08 <_whitenotifier> [scopehal-apps] azonenberg 20332e0 - Print OpenCL support present/absent message during startup

15:14 <_whitenotifier> [scopehal] azonenberg pushed 1 commit to master [+0/-0/±1] https://git.io/JLEhc

15:14 <_whitenotifier> [scopehal] azonenberg 5db4455 - Fixed HAVE_OPENCL being defined even if value is 0

15:15 <_whitenotifier> [scopehal-apps] azonenberg pushed 1 commit to master [+0/-0/±1] https://git.io/JLEhC

15:15 <_whitenotifier> [scopehal-apps] azonenberg 268f4ef - Updated submodules

15:46 juli966 has joined #scopehal

15:53 <_whitenotifier> [scopehal] azonenberg pushed 1 commit to master [+0/-0/±2] https://git.io/JLukj

15:53 <_whitenotifier> [scopehal] azonenberg 962b65d - Initial OpenCL context creation

16:24 <_whitenotifier> [scopehal] azonenberg pushed 1 commit to master [+1/-0/±5] https://git.io/JLuOD

16:24 <_whitenotifier> [scopehal] azonenberg cb4eb5b - Initial OpenCL kernel creation logic. Seems to work. Test kernel doesn't actually do anything yet and is never called.

16:26 <_whitenotifier> [scopehal-docs] azonenberg pushed 1 commit to master [+0/-0/±1] https://git.io/JLuOj

16:26 <_whitenotifier> [scopehal-docs] azonenberg 4386dcd - Initial mention of OpenCL in documentation

16:26 <_whitenotifier> [scopehal-apps] azonenberg pushed 1 commit to master [+0/-0/±4] https://git.io/JLu3k

16:26 <_whitenotifier> [scopehal-apps] azonenberg f513d99 - Copy kernels when building

16:55 <_whitenotifier> [scopehal] azonenberg pushed 1 commit to master [+0/-0/±1] https://git.io/JLuCU

16:55 <_whitenotifier> [scopehal] azonenberg f5187e9 - Fixed copy paste error

16:55 <_whitenotifier> [scopehal-apps] azonenberg pushed 1 commit to master [+0/-0/±1] https://git.io/JLuCL

16:55 <_whitenotifier> [scopehal-apps] azonenberg 3698d15 - Updated scopehal

17:53 <_whitenotifier> [scopehal] azonenberg pushed 1 commit to master [+0/-0/±6] https://git.io/JLuuS

17:53 <_whitenotifier> [scopehal] azonenberg e7d8e19 - Initial OpenCL FIR filter support

17:53 <_whitenotifier> [scopehal-apps] azonenberg pushed 1 commit to master [+0/-0/±1] https://git.io/JLuuH

17:53 <_whitenotifier> [scopehal-apps] azonenberg eb23906 - Updated submodules

17:53 <azonenberg> woooo

17:54 <azonenberg> lain, monochroma, marshallh: Very early stage OpenCL filter support is in. So far very manual buffer management

17:54 <azonenberg> Long term plan is to do some smart stuff and only copy buffers when necessary

17:54 <azonenberg> I'm also doing the final min/max reduction for display bounds calculation on the CPU still

17:55 <azonenberg> anyway... test system is 2x Xeon 6144 + RTX 2080 Ti

17:55 <azonenberg> 800K points at 40 Gsps, 511 tap FIR LPF

17:55 <azonenberg> Generic C++: 103 ms

17:55 <azonenberg> AVX2: 17.1 ms

17:55 <azonenberg> AVX512F: 11.1 ms

17:56 <azonenberg> Unoptimized OpenCL (allocating a new buffer each call, etc): 8.1 ms

17:57 <azonenberg> So basically this very naive implementation with blocking-everything and probably lots of unnecessary copying of data back and forth is about 30% faster than a Xeon with AVX512, and twice as fast as a less fancy CPU with only AVX2

17:58 <monochroma> :D

17:59 <azonenberg> Once *most* processing is done on the GPU i expect lots of the copy overhead to vanish

18:00 <azonenberg> ideally i would copy raw 8 bit ADC samples to the GPU, use an opencl kernel to convert that to fp32, and keep everything i can in GPU memory

18:00 <azonenberg> only copying to the CPU for filters that don't parallelize well like upper layer protocols

18:01 <azonenberg> I think long term what i will need to do is have each waveform contain both gpu and cpu output buffers

18:01 <azonenberg> and validity flags

18:01 <azonenberg> then have each filter set a gpu or cpu flag

18:01 <azonenberg> so if a gpu filter needs an input without a gpu output it will make a gpu buffer and copy it

18:01 <azonenberg> and if a cpu filter needs an input that is only resident on the gpu it will copy off

18:01 <azonenberg> etc
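The buffer scheme outlined above (a CPU and a GPU buffer per waveform, each with a validity flag, copied on demand) could look roughly like this. It is a toy model under stated assumptions: the "GPU buffer" is just another Python list, and the class and method names are invented for the example rather than taken from libscopehal.

```python
class Waveform:
    """Toy waveform holding one buffer per device plus a validity flag for each."""

    def __init__(self, samples):
        self.cpu_buf = list(samples)
        self.gpu_buf = None          # stand-in for a device-side buffer
        self.cpu_valid = True
        self.gpu_valid = False
        self.copies = 0              # number of host<->device transfers so far

    def get(self, device):
        """Return the buffer for 'cpu' or 'gpu', copying only if it is stale."""
        if device == "gpu":
            if not self.gpu_valid:
                self.gpu_buf = list(self.cpu_buf)   # models a host-to-device copy
                self.gpu_valid = True
                self.copies += 1
            return self.gpu_buf
        if device == "cpu":
            if not self.cpu_valid:
                self.cpu_buf = list(self.gpu_buf)   # models a device-to-host copy
                self.cpu_valid = True
                self.copies += 1
            return self.cpu_buf
        raise ValueError(f"unknown device: {device}")

    def set(self, device, samples):
        """Store a filter's output on one device and mark the other copy stale."""
        if device not in ("cpu", "gpu"):
            raise ValueError(f"unknown device: {device}")
        if device == "gpu":
            self.gpu_buf = list(samples)
        else:
            self.cpu_buf = list(samples)
        self.gpu_valid = device == "gpu"
        self.cpu_valid = device == "cpu"
```

A GPU filter would call `get("gpu")` on its inputs and `set("gpu", ...)` on its output. A chain of GPU filters then pays for one upload at the start, and a CPU-only decode at the end triggers a single copy back.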

18:16 <tnt> azonenberg: not sure about all your requirements, but you might also want to look into opengl compute if you ever need to actually display the data. Sharing CL/GL sucks (from experience) and there is also much better support for GL compute than CL.

18:18 <azonenberg> tnt: meanwhile i'm seeing the opposite

18:18 <azonenberg> CL seems to be more widely supported than GL, especially since it can potentially run in generic software rather than only GPUs

18:19 <azonenberg> right now rendering is all done in gl compute

18:19 <azonenberg> but i'm actually thinking of moving everything to CL and then using GL for the final compositing step

18:21 <tnt> lol ok, I guess the grass is always greener on the other side, I've been doing CL/GL and thinking of moving everything to GL compute because of all the issues.

18:23 <tnt> Nvidia still being at CL 1.1 with no update coming and no support ever coming for the embedded chips. AMD drivers fragmented in like 3 implementations, none of which I managed to get CL/GL buffer sharing working. My intel GPU support GL compute but no meaningful CL, etc ...

18:24 <_whitenotifier> [scopehal] azonenberg pushed 1 commit to master [+0/-0/±1] https://git.io/JLuwI

18:24 <_whitenotifier> [scopehal] azonenberg 925addb - FIRFilter: optimizations to buffer handling

18:25 <_whitenotifier> [scopehal] azonenberg pushed 1 commit to master [+0/-0/±1] https://git.io/JLuws

18:25 <_whitenotifier> [scopehal] azonenberg 45cb126 - Removed temporary benchmark code

18:25 <azonenberg> nvidia is at CL 1.2

18:25 <monochroma> a friend who does a lot of compute stuff was saying the same that CL is basically dead, in that none of the big "3" are actually keeping up and supporting it

18:25 <azonenberg> That's what i'm using

18:26 <tnt> yeah sorry 1.2, but I meant not CL 2.0

18:26 <azonenberg> Well, I'm still experimenting to see. also i got the CL version down to 5.9 ms with some very trivial changes to buffer handling

18:26 <tnt> Well AMD and Intel supposedly support it, but my experience with their official supported stuff has been worse than with NVidia trying to bury it.

18:27 <tnt> (which tbh sucks because I kind of like CL's model)

18:27 <azonenberg> So that's now a 17.4x speedup vs the generic C++

18:28 <azonenberg> And 1.8x vs reasonably well tuned, loop-unrolled AVX512F

18:28 <tnt> Nice.

18:29 <tnt> Not everyone has a RTX2080 Ti though :D

18:29 <azonenberg> honestly i dont think i am close to saturating the gpu. i showed 5-10% utilization

18:30 <azonenberg> i think most of the bottleneck was actually just pulling data off everything

18:30 <azonenberg> i.e. limited by pcie latency

18:30 <azonenberg> About to fire up nsight compute and see what it says

18:30 <d1b2> <daveshah> I couldn't get that working for OpenCL when I tried

18:31 <d1b2> <daveshah> but maybe I was doing something wrong

18:31 <azonenberg> Well the good news is worst case if i have to copy off

18:31 <azonenberg> if i move rendering all to opencl

18:31 <azonenberg> the only thing i have to pull off is essentially one fullscreen framebuffer worth of pixels

18:31 <azonenberg> then shove right back into a texture and do opengl compositing

18:31 <azonenberg> So it wouldnt be THAT much overhead

18:35 <azonenberg> also the massive advantage of opencl is it doesnt require a window on screen

18:35 <azonenberg> Which means it can be used in headless compute

18:36 <azonenberg> very important for some planned libscopehal use cases

18:39 <azonenberg> That was actually the #1 reason i wanted to look at CL in the first place

18:39 <azonenberg> getting compute shaders running headless is a pain

18:41 <ericonr> there's also vulkan, OpenCL 3.0 was going to be about inter operation there

18:42 <ericonr> no idea if anyone's implemented it

18:42 <azonenberg> GTK on debian stable does not have vulkan integration

18:42 <azonenberg> It might be worth considering in several more years

18:48 <azonenberg> Also it looks like AMD has a decent looking opencl fft library

19:26 <azonenberg> Now down to 4.4 ms on the same test

19:26 <azonenberg> notbad.jpg

19:29 jevinskie[m] has joined #scopehal

19:38 <_whitenotifier> [scopehal] azonenberg pushed 1 commit to master [+0/-0/±2] https://git.io/JLuyC

19:38 <_whitenotifier> [scopehal] azonenberg cfe08ff - FIRFilter: now do first stage of reduction on the GPU too

19:39 <azonenberg> and 4.1

19:40 <azonenberg> Ok i think i'm done tweaking. it's already 25x faster than the generic C++ and 2.7x faster than the AVX512F
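The speedup figures quoted through the afternoon follow directly from the run times posted in the channel (800K points, 511-tap FIR). The short script below reproduces the arithmetic. The dictionary keys are labels chosen for this example.

```python
# Run times in milliseconds, as reported in the log for the same FIR test
timings_ms = {
    "generic_cpp": 103.0,
    "avx2": 17.1,
    "avx512f": 11.1,
    "opencl_first": 8.1,      # unoptimized, new buffer each call
    "opencl_buffers": 5.9,    # after buffer-handling changes
    "opencl_final": 4.1,      # after moving the first reduction stage to the GPU
}


def speedup(baseline, candidate):
    """How many times faster `candidate` is than `baseline`."""
    return timings_ms[baseline] / timings_ms[candidate]


for cpu in ("generic_cpp", "avx2", "avx512f"):
    for gpu in ("opencl_first", "opencl_buffers", "opencl_final"):
        print(f"{gpu} vs {cpu}: {speedup(cpu, gpu):.2f}x")
```

The final numbers come out at about 25.1x over generic C++ and 2.7x over AVX512F, matching the log. The intermediate 17.4x and 1.8x figures are 17.46x and 1.88x before truncation.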

19:58 ericonr has quit [Ping timeout: 240 seconds]

20:03 ericonr has joined #scopehal

20:12 pd0wm has joined #scopehal

20:12 pd0wm has quit [Client Quit]

20:13 pd0wm has joined #scopehal

20:13 <azonenberg> o/ pd0wm

20:19 <_whitenotifier> [scopehal] azonenberg commented on issue #146: Support for HiSLIP - https://git.io/JLudJ

20:32 pd0wm has quit [Remote host closed the connection]

20:33 <_whitenotifier> [scopehal] azonenberg edited issue #341: Split Siglent driver off from LeCroyOscilloscope as they've diverged too much - https://git.io/JksiG

20:33 <_whitenotifier> [scopehal] azonenberg labeled issue #341: Split Siglent driver off from LeCroyOscilloscope as they've diverged too much - https://git.io/JksiG

20:35 <azonenberg> So i'm thinking it would be nice to add CSV export (scopehal-apps:#77)

20:36 <azonenberg> the question is how we want to handle the case of signals that aren't regularly sampled

20:36 <azonenberg> like protocol decodes

20:36 <azonenberg> Exporting a single channel to CSV is trivial

20:36 <azonenberg> but exporting *everything* is hard

20:37 <azonenberg> and it's not like i can just resample an ethernet frame to match whatever the base sample rate is :p

20:38 <azonenberg> I suppose one option would be to loop over all selected channels

20:38 <azonenberg> Emit a CSV line for every unique timestamp found in any channel

20:38 <azonenberg> but only put a value in the cell on a change?

20:39 <azonenberg> i don't want to repeat a sample for its entire duration because then you'd see repeated bytes in decodes

20:39 <azonenberg> although stretching might work for analog samples

20:39 <azonenberg> those might make more sense to resample?
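
The "one CSV line per unique timestamp, value only on a change" idea above could be sketched roughly as follows. This is an illustrative sketch only, not code from scopehal-apps: `SparseChannel` and `ExportSparseCsv` are hypothetical names, each sample is assumed to be a (start timestamp, already formatted text) pair, and no CSV quoting or escaping is attempted.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a channel: a name plus (start timestamp, text) samples.
// This is NOT the libscopehal waveform type.
struct SparseChannel
{
	std::string name;
	std::vector<std::pair<int64_t, std::string>> samples;
};

// Emits one row per unique timestamp found in any channel.
// A cell is filled only where that channel has a sample starting at that
// timestamp; otherwise it is left empty, so values are never repeated.
std::string ExportSparseCsv(const std::vector<SparseChannel>& channels)
{
	// std::map keeps the timestamps sorted and unique
	std::map<int64_t, std::vector<std::string>> rows;
	for(size_t i = 0; i < channels.size(); i++)
	{
		for(const auto& s : channels[i].samples)
		{
			auto& cells = rows[s.first];
			cells.resize(channels.size());	// one (possibly empty) cell per channel
			cells[i] = s.second;
		}
	}

	// Header row
	std::ostringstream out;
	out << "timestamp";
	for(const auto& c : channels)
		out << "," << c.name;
	out << "\n";

	// Data rows
	for(const auto& row : rows)
	{
		out << row.first;
		for(const auto& cell : row.second)
			out << "," << cell;
		out << "\n";
	}
	return out.str();
}
```

With an analog channel sampled at t=0 and t=10 and a decode with bytes at t=10 and t=25, this produces three data rows, and the decode column is empty at t=0 while the analog column is empty at t=25. Resampling or stretching the analog channel, as floated above, would be a separate step before the merge.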

20:59 __dre is now known as CyberDre

20:59 CyberDre has quit [Quit: Dre has quit the channel]

21:00 Cyber_Dre has joined #scopehal

21:37 <azonenberg> o/ Cyber_Dre

21:45 <Cyber_Dre> heyo, this is still CyberpunkDre, actually need to afk for a bit but will be back later

21:45 <azonenberg> ok

21:46 Cyber_Dre is now known as dre_afk

21:47 <_whitenotifier> [scopehal] pd0wm forked the repository - https://git.io/JLzfl

21:49 <_whitenotifier> [scopehal] pd0wm opened pull request #388: Verify CAN CRC (#333) - https://git.io/JLzfP

21:54 <_whitenotifier> [scopehal] pd0wm synchronize pull request #388: Verify CAN CRC (#333) - https://git.io/JLzfP

21:55 <_whitenotifier> [scopehal] pd0wm edited pull request #388: Verify CAN CRC (#333) - https://git.io/JLzfP

21:55 pd0wm has joined #scopehal

21:59 <_whitenotifier> [scopehal] pd0wm edited pull request #388: Verify CAN CRC (#333) - https://git.io/JLzfP

22:12 <_whitenotifier> [scopehal-apps] azonenberg pushed 1 commit to master [+0/-2/±8] https://git.io/JLzTy

22:12 <_whitenotifier> [scopehal-apps] azonenberg f1d95f5 - Fixed persistence rendering. Now using a preference for setting the decay rate. Fixes #56.

22:12 <_whitenotifier> [scopehal-apps] azonenberg closed issue #56: Persistence doesn't work with new rendering engine - https://git.io/JvRWv

22:13 <azonenberg> pd0wm: are you done editing that PR? is it ready to merge from your perspective?

22:13 <miek> azonenberg: do you have a trigger type that triggers on the nth edge after a timeout on any of your scopes?

22:14 <pd0wm> Yeah, it's ready to merge if you think it's good!

22:14 <azonenberg> pd0wm: gimme a min to look

22:14 <pd0wm> I didn't have any dumps of messages with a wrong CRC, so couldn't fully test it of course

22:15 <azonenberg> pd0wm: yeah i dont have any handy either

22:15 <pd0wm> But I did invert the check on crc and confirmed it showed up as red in the ui

22:15 <azonenberg> Ok good

22:15 <azonenberg> For line 435 i'd put the thing in parentheses

22:15 <azonenberg> foo = bar == baz is a little hard to read

22:16 <azonenberg> My general rule is that if you have to stop and think about precedence, add parentheses

22:16 <azonenberg> Fix that and it's good to merge

22:16 <pd0wm> Sure!

22:17 <pd0wm> Like: bool crc_ok = (current_field == (crc & 0x7fff)); ?

22:17 <azonenberg> Yes

22:17 <azonenberg> It's just a little more readable that way
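
A small sketch of why the inner parentheses in that line matter, beyond readability: in C++, `==` binds more tightly than `&`, so dropping them changes the result. The function names and the `uint32_t` types are assumptions for illustration; the actual types in the CAN decoder are not shown in the log.

```cpp
#include <cstdint>

// Form suggested in the chat: mask the computed CRC down to 15 bits, then compare.
bool CrcMatches(uint32_t current_field, uint32_t crc)
{
	return (current_field == (crc & 0x7fff));
}

// What "current_field == crc & 0x7fff" would mean without the inner parentheses:
// the comparison happens first, and its 0/1 result is then masked.
bool CrcMatchesAsParsedWithoutParens(uint32_t current_field, uint32_t crc)
{
	return ((current_field == crc) & 0x7fff) != 0;
}
```

The two only disagree when `crc` has bits set above bit 14, for example if the CRC register is not masked after each shift.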

22:18 <_whitenotifier> [scopehal] pd0wm synchronize pull request #388: Verify CAN CRC (#333) - https://git.io/JLzfP

22:18 <azonenberg> miek: My lecroy scopes have "qualified triggers"

22:18 <pd0wm> Done!

22:19 <azonenberg> So I can arm the trigger on either an edge on one channel, or a parallel digital pattern across some/all channels being present

22:20 <azonenberg> Then I can do an interval trigger after that

22:20 <azonenberg> not quite what you're lookin for

22:20 <azonenberg> The other thing i can do is cascaded triggering

22:22 <azonenberg> so i can arm on a dropout, then skip N edges, then actually trigger on an edge

22:22 <azonenberg> You can cascade up to four levels of triggering

22:22 <azonenberg> I've never used it, and the libscopehal API does not support it

22:22 <miek> ok, cheers. i'm gonna implement it for the keysight/agilent and just wanted to get naming right if it was gonna be shared, but it doesn't sound like there'll be any crossover

22:23 <azonenberg> https://www.antikernel.net/temp/multistage.png

22:23 <azonenberg> the way i would implement this in our object model is to have a single CascadedTrigger object which contains a vector<Trigger*>

22:23 <azonenberg> and some settings of its own like "number of events of trigger A before arming B"
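
A rough sketch of the object model described above, assuming a minimal stand-in `Trigger` base class. Nothing here is the real libscopehal API (which, per the log, does not support cascaded triggers): `CascadedTrigger`, `AddStage`, `EdgeTrigger`, `DropoutTrigger` and the per-stage event count are all hypothetical, and the four-stage cap simply mirrors the LeCroy limit mentioned in the chat.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical minimal base class; stands in for the real Trigger type.
class Trigger
{
public:
	virtual ~Trigger() = default;
};

// Placeholder simple trigger types used as stages
class EdgeTrigger : public Trigger {};
class DropoutTrigger : public Trigger {};

// One object wrapping an ordered list of simple triggers, plus a setting of
// its own: how many events of each stage must occur before the next is armed.
class CascadedTrigger : public Trigger
{
public:
	static constexpr size_t MAX_STAGES = 4;	// assumed limit, from the LeCroy description

	// Appends a stage. Returns false if the stage is invalid or the cascade is full.
	// The cascade does not take ownership of the trigger.
	bool AddStage(Trigger* trigger, size_t eventCount)
	{
		if(trigger == nullptr || eventCount == 0 || m_stages.size() >= MAX_STAGES)
			return false;
		m_stages.push_back(trigger);
		m_eventCounts.push_back(eventCount);
		return true;
	}

	size_t GetStageCount() const
	{ return m_stages.size(); }

	Trigger* GetStage(size_t i) const
	{ return m_stages.at(i); }

	size_t GetEventCount(size_t i) const
	{ return m_eventCounts.at(i); }

private:
	std::vector<Trigger*> m_stages;		// stages in arming order
	std::vector<size_t> m_eventCounts;	// events of stage i needed before arming stage i+1
};
```

The "arm on a dropout, skip N edges, then trigger on an edge" case would then be two or three `AddStage` calls. Restrictions such as disallowing serial protocol triggers as stages would need an extra validity check in `AddStage`, which this sketch leaves out.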

22:24 <azonenberg> the challenge will be a) designing the UI for each stage and b) imposing restrictions

22:24 <azonenberg> for example, i don't think you can normally use serial protocol triggers in a cascaded trigger on lecroy

22:24 <azonenberg> So you cannot have "trigger on the second press of this button after you see 'F' on the uart"

22:25 <_whitenotifier> [scopehal] azonenberg closed issue #333: CAN decode: Check packet CRCs - https://git.io/JTdix

22:26 <_whitenotifier> [scopehal] azonenberg pushed 4 commits to master [+0/-0/±4] https://git.io/JLzIE

22:26 <_whitenotifier> [scopehal] pd0wm b1ba421 - initial support for CAN checksums

22:26 <_whitenotifier> [scopehal] pd0wm c15bc13 - fix formatting

22:26 <_whitenotifier> [scopehal] pd0wm da5dd5a - add parentheses

22:26 <_whitenotifier> [scopehal] azonenberg 86f127f - Merge pull request #388 from pd0wm/master Verify CAN CRC (#333)

22:26 <_whitenotifier> [scopehal] azonenberg closed pull request #388: Verify CAN CRC (#333) - https://git.io/JLzfP

22:27 <azonenberg> pd0wm: Looks good, thanks and congrats on your first contribution :)

22:28 <miek> on the agilent (and some rigols i think) it's a standalone trigger type, so i'll just make it a simple class along the same lines as the existing triggers https://i.imgur.com/MZNd4XW.png

22:28 <azonenberg> pd0wm: I'm also going to add you to the list of contributors... would you prefer to be listed by your real name, your github handle, or something else?

22:29 <azonenberg> miek: yes that looks much less general and best done as its own trigger type

22:29 <azonenberg> with lecroy it's basically a state machine wrapping all of the simple trigger types

22:31 <azonenberg> MAXWELL's trigger engine is going to be similar to this too

22:54 <miek> btw, i'm not able to build with the CL changes - cmake sets OpenCL_INCLUDE_DIR but not OpenCL_INCLUDE_DIRS, are you definitely getting DIRS on your end?

22:57 <azonenberg> Yes i am getting dirs on my end. It just defaults to /usr/include

22:58 <azonenberg> oh wait

22:58 <azonenberg> no

22:58 <azonenberg> it's DIR not DIRS, that's a typo

22:58 <azonenberg> and i get away with it because /usr/include was already in my search path

22:59 <_whitenotifier> [scopehal] azonenberg pushed 1 commit to master [+0/-0/±1] https://git.io/JLzYL

22:59 <_whitenotifier> [scopehal] azonenberg 2a26351 - Fixed typo in variable name

22:59 <_whitenotifier> [scopehal-apps] azonenberg pushed 1 commit to master [+0/-0/±1] https://git.io/JLzYq

22:59 <_whitenotifier> [scopehal-apps] azonenberg 00d7d08 - Updated to latest scopehal

22:59 <azonenberg> miek: try that

23:02 <miek> cheers

23:02 <miek> ahh, i was also missing the opencl-clhpp-headers package

23:03 dre_afk is now known as Cyber_Dre

23:04 <pd0wm> azonenberg: github handle is fine!

23:06 <azonenberg> miek: but cmake detected CL anyway? so you have half an install and it got confused because no headers?

23:07 <_whitenotifier> [scopehal-apps] azonenberg pushed 1 commit to master [+0/-0/±1] https://git.io/JLzYj

23:07 <_whitenotifier> [scopehal-apps] azonenberg 8bb0b61 - OscilloscopeWindow: added another contributor to about dialog

23:07 <azonenberg> pd0wm: Great, done. Planning to work on anything else today?

23:07 <miek> yeah, it's a bit weirdly packaged

23:09 <pd0wm> It's already pretty late over here. I'll see if I can get my Rigol to work tomorrow.

23:09 <azonenberg> Ok great

23:12 <_whitenotifier> [scopehal] azonenberg pushed 1 commit to master [+0/-0/±1] https://git.io/JLzOS

23:12 <_whitenotifier> [scopehal] azonenberg d9772b6 - scopehal: fix bug in handling of no-OpenCL systems

23:12 <_whitenotifier> [scopehal-apps] azonenberg pushed 1 commit to master [+0/-0/±1] https://git.io/JLzO9

23:12 <_whitenotifier> [scopehal-apps] azonenberg a02844d - Updated submodules

23:16 <miek> also i get a bunch of deprecation warnings suggesting to use cl2.hpp instead

23:16 juli966 has quit [Quit: Nettalk6 - www.ntalk.de]

23:17 <azonenberg> miek: Yes. Except i believe that nvidia doesn't support CL2 (that may have changed)

23:17 <azonenberg> i'll look into that more

23:17 <miek> ahh right

23:18 <azonenberg> everything is a mess. basically for best results you need to use cuda on nvidia and opencl on amd

23:33 <azonenberg> anyway i just checked. as of 370.something drivers nv has beta support for cl 2.0 on some cards

23:33 <azonenberg> however only 1.2 is fully supported

23:52 <azonenberg> So basically depending on who you ask CL 2.0 is either obsolete or too new to support :p