ChanServ changed the topic of #nmigen to: nMigen hardware description language · code at https://github.com/nmigen · logs at https://freenode.irclog.whitequark.org/nmigen
Degi has quit [Ping timeout: 272 seconds]
Degi has joined #nmigen
<_whitenotifier-c> [nmigen/nmigen-yosys] whitequark pushed 1 commit to master [+0/-0/±2] https://git.io/JfaLa
<_whitenotifier-c> [nmigen/nmigen-yosys] whitequark 03c80dc - Add ccache support for parity with yosys-pypi.
<kbeckmann> whitequark: Are you planning on supporting xdr>2 where a second clock is needed? I tried adding support for ODDRX2F for the ECP5 but had to change a bit in io.py to add the second clock/eclk and it gets a bit ECP5-specific.
<whitequark> kbeckmann: yes, but this will require some thought
<whitequark> actually, if you're acutely interested in the topic, I have a suggestion for you
<whitequark> right now the Pin() object doesn't state what the phase of DDR output is, compared with the clock
<whitequark> we should expose that information
<whitequark> in fact, we should at least collect it, to begin with
<kbeckmann> oh I see, yeah that would be useful.
<whitequark> if you collect it for the supported platforms (ideally with hw verification) then we could expose it through some sort of attribute
<whitequark> I have all of our supported hardware platforms on hand so we can check it there
<whitequark> with the exception of, I think, ultrascale, but that's still easy to arrange
<kbeckmann> oh that's great. I'll implement it for the ECP5 first since that's what I have personally.
Guest30583 has joined #nmigen
<whitequark> cool!
Asu has joined #nmigen
Guest30583 has quit [Quit: Nettalk6 - www.ntalk.de]
chipmuenk has joined #nmigen
electronic_eel has quit [Ping timeout: 258 seconds]
electronic_eel has joined #nmigen
Asuu has joined #nmigen
Asu has quit [Ping timeout: 264 seconds]
<smkz> if i want to use the nmigen simulator and extract signal values from the simulated logic (for processing / evaluation in python, not for making a .vcd); the right way to do so is by including code in the function fed to add_sync_process, right?
<whitequark> correct
<smkz> perfect
<smkz> thank you
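A minimal sketch of the pattern discussed above, where a generator-based process pulls signal values into plain Python for later processing. This is a toy model in pure Python, not nMigen's simulator: `ToySignal`, `run`, and `tick` are hypothetical names, and real nMigen timing semantics are not reproduced. It only illustrates the protocol, in which a bare `yield` waits for a clock cycle and `yield signal` evaluates to the signal's current value.

```python
class ToySignal:
    """Hypothetical stand-in for a signal; not an nMigen class."""
    def __init__(self, name, value=0):
        self.name = name
        self.value = value


def run(process_fn, on_tick, cycles):
    """Drive a generator-based process for at most `cycles` clock cycles."""
    process = process_fn()
    elapsed = 0
    try:
        command = next(process)
        while True:
            if command is None:
                # Bare `yield`: the process waits for the next clock cycle.
                if elapsed == cycles:
                    break
                on_tick()
                elapsed += 1
                command = next(process)
            else:
                # `yield signal`: answer with the signal's current value.
                command = process.send(command.value)
    except StopIteration:
        # The process finished before the cycle budget ran out.
        pass


counter = ToySignal("counter")

def tick():
    # Stand-in for the simulated logic: a 4-bit counter.
    counter.value = (counter.value + 1) % 16

samples = []

def testbench():
    for _ in range(5):
        yield                           # wait one clock cycle
        samples.append((yield counter))  # read the value into Python

run(testbench, tick, cycles=10)
print(samples)
```

After the run, `samples` is an ordinary Python list that can be evaluated however is convenient.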
<whitequark> in case you want to analyze simulation traces passively (as opposed to interacting with the logic or checking it as it goes) there is also https://github.com/nmigen/nmigen/issues/327
<whitequark> it's not currently implemented but it wouldn't be hard to add that at all
* smkz nods
<smkz> also is there a way to introspect into a module's internal Signals (rather than just the signals which are self.whatever defined in the __init__) from a process?
<whitequark> unfortunately not since they live inside the closure and so are hidden from any normal introspection
<whitequark> there ought to be a good way to do this but currently isn't
<whitequark> I would for now recommend exposing them through attributes but it is inelegant and we'll improve it
* smkz nods
<smkz> (thanks for answering my nmigen questions, i appreciate it a lot (as well as the work you do on nmigen))
<whitequark> no problem! glad i can help
<smkz> oh other question; is there a way to access the contents of a Memory from the process function? (ideally to grab all its state at once)
<whitequark> at once no, but you can do `yield mem[0]` where 0 is any index
<smkz> ahhh
<smkz> perfect ^^;
<whitequark> I'll keep the desire to be able to grab the entire thing in mind
<whitequark> right now it's using a highly inefficient approach (https://github.com/nmigen/nmigen/issues/359) but with the experience from cxxrtl I know how to improve it significantly
<whitequark> it could even then use a dense array rather than a list of ints
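To make the "dense array rather than a list of ints" remark concrete, here is a small sketch using only the Python standard library. It is not how nMigen stores memory contents; `DEPTH` and the 64-bit word width are assumptions chosen for illustration (words wider than 64 bits would not fit in this representation).

```python
from array import array

DEPTH = 1024  # number of memory words (arbitrary for this sketch)

# A list holds one Python object reference per word.
as_list = [0] * DEPTH

# An array packs the words contiguously as unsigned 64-bit integers.
as_array = array("Q", [0]) * DEPTH

as_list[3] = 0xCAFE
as_array[3] = 0xCAFE

# With a dense representation, grabbing the whole state at once is one call.
snapshot = as_array.tolist()
print(len(as_array), snapshot[3] == as_list[3])
```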
<smkz> cxxrtl is the thing where you convert the nmigen design into a c++ file for simulation that way?
<whitequark> any design yosys can parse, actually!
<whitequark> you can cosimulate nmigen code, verilog code and vhdl code with it
<whitequark> (synthesizable verilog and vhdl)
<smkz> neat!!
<smkz> will there be a way to get the results from the cxxrtl simulation for processing in python?
<whitequark> yep, that's actually what i'm working on right now
<whitequark> well
<whitequark> technically right now i am fixing bugs in the wasm toolchain but that's just three yak shaving levels deep into that task
<whitequark> see, using cxxrtl would require yosys from master, and it's a pain to require everyone to install it
<whitequark> so i thought i'd ship yosys in pypi
<whitequark> using cxxrtl from python*
<whitequark> though i guess no version of cxxrtl is released yet so it's true in general too
<awygle> lol poor wq is incapable of ignoring a wooly yak
<whitequark> awygle: look. at least today i was fixing bugs in wasi-libc which at least tangentially relate to the task at hand
<whitequark> two days ago i was learning about windows x32 seh
<awygle> :)
<whitequark> four days ago i was looking into 32 bit x86 SIMD instructions
<whitequark> why 32 bit? because the only 64 bit shift 32-bit x86 has is a SIMD shift
<awygle> mhm
<awygle> 32-bit simd sucks even more than regular simd
<awygle> because you can't even assume SSE2
<whitequark> well
<whitequark> that's why cranelift's "i686" backend currently just crashes if you don't enable SSE4.1
<awygle> <_<
<whitequark> fun fact
<whitequark> psllq/psrlq are SSE2
<whitequark> but pinsr/pextr are SSE4.1
<whitequark> why the fuck?..
<awygle> yep
<whitequark> no but i mean why
<awygle> simd is a nightmare hellscape universally
<awygle> even Neon is bad
<whitequark> did they make it bad on purpose
<whitequark> did they seriously not consider people might want to extract elements from lanes
<awygle> which is a shame because simd is one of the coolest things
<whitequark> i never seriously looked into simd and tbh leaning towards keeping it that way
<whitequark> someone else with more tolerance for pain can do it, i'm sure
<ZirconiumX> "this time we'll do it right" - people who did not do it right
<awygle> i resolved the tension by just refusing to deal with anything that doesn't support AVX2
<whitequark> i know avx as "that thing i disable to get higher turbo frequency"
<whitequark> and measurable improvement in benchmarks (benchmark: 30 minutes of vivado junk) too
<awygle> AVX is horrifyingly non-orthogonal, but AVX2 fills in most of the gaps
<awygle> even if you only use it on 128-bit registers
<awygle> and therefore don't kill your turbo perf
<whitequark> hm
<whitequark> ok that's cool
<whitequark> who decided that glibc uses avx256 for memcpy?
<awygle> *shrug* RMS?
<awygle> what were you doing this benchmarking on? i thought Skylake was supposed to pay a lot less for AVX2
<whitequark> uh, i don't recall exactly
<whitequark> it's been two years or so
<awygle> mk
<awygle> Ryzen only just got 256-bit execution units so on Zen <2 you get really bad perf with 256-bit ops, no reason to use them, but AVX2 support is still useful for the 128-bit stuff
<sorear> I get the impression that there are a close to infinite number of simd instructions somebody wants
<sorear> and the glibc maintainers (rms not involved) can't turn down a new memcpy implementation that's x% faster on memcpy-only microbenchmarks
<awygle> do you _have_ to link to a libc to write a program for linux?
* awygle shakes away the terrible ideas brewing, gets back to work
<whitequark> awygle: nope!
<whitequark> now you're thinking with, uh, golang
<awygle> oh really? i didn't know go didn't use libc
<whitequark> they tried doing this on macos too and it broke *wonderfully*
<whitequark> absolute carnage
<awygle> that's actually kind of cool, it may be the only cool thing i've ever heard about go
<whitequark> so they had to choke on their pride and dynamically link libc on macos and windows after all
<awygle> yeah macos and the bsds are different
<awygle> unlike linux they don't consider the kernel the operating system
<whitequark> heh
<awygle> ... you know this why am i explaining it to you.
<whitequark> fun fact
<whitequark> you can make a 64-bit windows binary that does a far call into a 32-bit linux segment and then runs linux syscalls from it
<awygle> that's..... exactly what i want to do
<whitequark> in case you ever want me to install a rootkit that's probably how
<awygle> actually
<whitequark> what the actual fuck
<awygle> except for the 32-bit linux part
<awygle> 64-bit is fine
<whitequark> ok i mean it still bothers me substantially
<whitequark> what are you *doing*
<awygle> for posterity, here is my terrible idea - i want to write an N64 emulator, using mmap to emulate the TLB, which means i can't do it on Windows because of the 64kB segment size for MapViewOfFile. but in WSL 2, i can use mmap as god intended, but i need a way to make that seamless to the user. so my plan would be to run a WSL 2 process which does all the backend shit and do IPC to Windows for the GUI so the GUI can be native and i don't need X11.
<awygle> but i don't want to fuck around with distros or statically link musl, so.... raw syscalls, no libc, profit!
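The alignment constraint behind this plan can be seen from Python's standard `mmap` module, sketched below. File offsets passed to `mmap` must be multiples of `mmap.ALLOCATIONGRANULARITY`, which is the page size on Linux and 64 KiB on Windows. The backing file standing in for emulated memory is hypothetical, and Python's `mmap` cannot choose the virtual address of a mapping, so this shows only the granularity issue and not a working TLB emulation.

```python
import mmap
import os
import tempfile

# Offsets passed to mmap must be multiples of this value: the page size on
# Linux, the (larger) allocation granularity on Windows.
GRAN = mmap.ALLOCATIONGRANULARITY

# Backing file standing in for emulated physical memory (four chunks).
fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, "wb") as f:
    f.truncate(4 * GRAN)

with open(path, "r+b") as f:
    # Map only the third GRAN-sized chunk, as one translation entry would.
    window = mmap.mmap(f.fileno(), GRAN, offset=2 * GRAN)
    window[0:4] = b"\xde\xad\xbe\xef"
    window.flush()
    window.close()

# Read the bytes back through ordinary file I/O to confirm where they landed.
with open(path, "rb") as f:
    f.seek(2 * GRAN)
    stored = f.read(4)

os.remove(path)
print(GRAN, stored)
```

On a system where `GRAN` is 64 KiB, a 4 KiB emulated page cannot be mapped at an arbitrary 4 KiB-aligned file offset, which is the limitation described above.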
<whitequark> er
<whitequark> that doesn't apply to wsl. that applies to wine
<awygle> i figured lol. bummer
<whitequark> wsl (both 1 and 2) are lightweight virtualizations
<whitequark> i think the windows syscalls aren't even mapped inside the container
<awygle> 2 is significantly more virtual, is my understanding. 1 was more like Solaris LX branded Zones, where they wrote a syscall emulation table on top of NT. but 2 is just... a VM. at least this is my understanding.
<whitequark> not emulation
<whitequark> they added a new NT subsystem
<whitequark> it was basically like Interix
<awygle> right, ok
<whitequark> 2 is basically like CoLinux
<whitequark> in neither of these cases you could possibly use the GUI
<whitequark> in fact, you can't even use win32k from win32 console subsystem
<awygle> the thing you refer to with "basically like" should generally be _less_ obscure than the original thing :p
<whitequark> lol
<awygle> yeah hence the IPC bit. you can do an AF_UNIX socket from WSL 2 to Windows
<whitequark> oh hmm
<awygle> performance? :shrug: who knows
<whitequark> just map the same file from inside and outside of wsl
<whitequark> and use the socket as a mailbox
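A rough sketch of the "shared file plus mailbox" idea, using only the Python standard library. Both mappings live in one process here for simplicity, and the socket is replaced by a packed `(offset, length)` message, so this is an illustration of the data flow rather than a tested WSL-to-Windows setup. The file path, region size, and message layout are all assumptions.

```python
import mmap
import os
import struct
import tempfile

SIZE = 4096  # size of the shared region in bytes (arbitrary for this sketch)

# Create a backing file that both sides can open.
fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, "wb") as f:
    f.truncate(SIZE)

# "Producer" and "consumer" each open and map the file independently,
# the way two separate processes would.
producer_file = open(path, "r+b")
consumer_file = open(path, "r+b")
producer = mmap.mmap(producer_file.fileno(), SIZE)
consumer = mmap.mmap(consumer_file.fileno(), SIZE)

# The producer writes a payload into the shared region...
payload = b"frame data"
producer[8:8 + len(payload)] = payload

# ...and this small message is what would travel over the socket to tell
# the consumer where to look.
notice = struct.pack("<II", 8, len(payload))

# The consumer decodes the message and reads the payload from its own mapping.
offset, length = struct.unpack("<II", notice)
received = bytes(consumer[offset:offset + length])
print(received)

for m in (producer, consumer):
    m.close()
for f in (producer_file, consumer_file):
    f.close()
os.remove(path)
```

Only the small notice crosses the socket; the bulk data stays in the mapped file.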
<awygle> i wonder if you can pass a shm between the two....
<whitequark> not sure, but i don't see why not
<awygle> really i wonder if you can map a shm into a windows process' space
<awygle> idk how to do that
<awygle> maybe MapViewOfFile would be good enough idk
<awygle> i'd have to test it
<awygle> and uh... there are Many More Important Things i should be doing instead
<whitequark> haah
<whitequark> also
<awygle> (cr1901_modern do you have Cursed Knowledge here?)
<whitequark> is MapViewOfFile the reason wasm has a 64k page size?
<awygle> oh shit probably
<whitequark> I was wondering
<whitequark> I think it's *originally* related to NT on DEC Alpha?
<awygle> yeah something like that, there's a raymond chen blog post but i can't find it right now
<cr1901_modern> awygle: wasm's 64kB page size is the reason I haven't looked much further into using it as a VM for cursed vintage shit
<awygle> mm
<cr1901_modern> So no, I wouldn't have much to say about it
<whitequark> cr1901_modern: nah that's not really an issue afaik
<TD-Linux> awygle, fwiw intel *to this day* ships CPUs that don't support AVX2, so you can't generally make that assumption
<whitequark> people use wasm on microcontrollers that don't even have 64k of memory
<awygle> arright well time to pretend i'm useful and poke around LA design spaces for nmigen
<cr1901_modern> Hmmm, I do recall seeing that. But I was under the impression said applications don't allocate at all. Which I guess is fine for anything I want to do
<awygle> TD-Linux: yeah but nothing where i'd care about the performance gains of AVX2/manual SIMD would run well on those anyway.
<whitequark> cr1901_modern: ahh possible
<whitequark> i mean... isn't one wasm page exactly one COM segment?
<whitequark> that actually seems pretty reasonable to me
<awygle> 4k is too small for pages :p
<cr1901_modern> yes lmao. But I wanted to try it on 65xx, and other friends w/ 16-bit addr space
<whitequark> oh yeah no that's not happening
<whitequark> you can't even target C to them
<awygle> aw bummer, you can't do AF_UNIX/SOCK_DGRAM on windows. or at least you couldn't when this article was written, maybe it's different now
<TD-Linux> awygle, yeah maybe. I mean they are modern 4ghz comet lake CPUs, so e.g. for dav1d we have to worry about them
<awygle> fair
<awygle> that's like, general utility software, whereas this is an emulator
<awygle> so i'm not too sad if i cut off some potential users
<awygle> i'd have different priorities for something like dav1d
<cr1901_modern> whitequark: Indeed. Although fun fact: I've been reading about TinyBASIC, one of the earlier BASICs, and a... well, Tiny one. Was developed in the mid 70s. >>
<awygle> my kingdom for a reliable packet-oriented interface :/
<cr1901_modern> The TinyBASIC interpreter is actually implemented as a VM. You're expected to implement the VM on your target to get a "free" TinyBASIC interpreter.
<cr1901_modern> (stack machine VM, like WASM)
<cr1901_modern> so the idea certainly goes back that far :P
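For readers unfamiliar with the approach, here is a minimal stack machine interpreter in Python. The opcodes are invented for illustration and are not TinyBASIC's actual interpretive language; the point is only that porting the language reduces to implementing a small loop like `run` on each target.

```python
def run(program):
    """Execute a list of (opcode, operand) pairs on a tiny stack machine."""
    stack = []
    output = []
    for op, arg in program:
        if op == "push":
            stack.append(arg)
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "print":
            output.append(stack.pop())
        else:
            raise ValueError(f"unknown opcode: {op}")
    return output


# Equivalent of: PRINT (2 + 3) * 4
program = [
    ("push", 2), ("push", 3), ("add", None),
    ("push", 4), ("mul", None),
    ("print", None),
]
result = run(program)
print(result)
```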
<TD-Linux> pascal and pcode easily has that beat
<whitequark> cr1901_modern: wasm is uhhhh, you could call it a stack machine, sure
<whitequark> my understanding is that the two good reasons wasm is a stack machine is because it makes formal semantics tractable, and because the encoding is compact
<whitequark> but it's not a stack machine in the way forth or jvm are
<whitequark> here
<whitequark> that says it better than i can phrase right now
<cr1901_modern> Ahhh hrmmm
<whitequark> >This essentially makes WebAssembly a register machine without liveness analysis, but not only that, it’s a register machine that isn’t even in SSA form - both of the tools at our disposal to do optimisation are unavailable.
* cr1901_modern spends some time reading
<cr1901_modern> TD-Linux: I'm sure p-code is cool too, but I've not studied it (nor pascal) much.
<cr1901_modern> I went on a TinyBASIC diversion about a week ago b/c I read that even BASIC had trouble fitting into some home computers in the 70s. Which surprised me.
<cr1901_modern> And TinyBASIC was a compromise- trade speed to fit into small places via a carefully-designed stack machine VM
<awygle> whitequark: the difference between `lookup` and `request` is that lookup can be called multiple times, right? when is `lookup` intended to be used?
<whitequark> basically never, it's just there in case you somehow need it
<whitequark> and because it's needed internally
<sorear> whitequark: some ppc/mips/arm hardware strongly wants 64kb pages I think?
<awygle> ok
<whitequark> mhm
Cynthia has joined #nmigen
<awygle> so something like FFSynchronizer takes a string for the domain, and is a module, and creates the domain in its module object, and then after everything's elaborated the domains of the modules and submodules all the way up to the root are unified? is that approximately correct?
<whitequark> almost
<whitequark> FFSynchronizer doesn't create the domain
<whitequark> `m.d.foo += ...` doesn't *create* foo
<whitequark> it's like referencing an external symbol in C
<awygle> ah. so somebody somewhere is doing `m.domains += ClockDomain("foo")`
<awygle> and if not it'll be an error
<whitequark> yes.
<whitequark> the manual i'm not currently writing will explain that in painstaking detail
<awygle> and that somebody has to be above the FFSynchronizer in the hierarchy? or no
<whitequark> anywhere
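The "external symbol" analogy can be modelled in a few lines of pure Python. This is a toy, not nMigen's implementation: `ToyModule`, `walk`, and `unresolved_domains` are hypothetical names. It assumes, per the exchange above, that a domain used by name in one module is satisfied by a definition anywhere in the hierarchy, and is an error otherwise.

```python
class ToyModule:
    """Hypothetical stand-in for a module; not an nMigen class."""
    def __init__(self, name, defines=(), uses=(), submodules=()):
        self.name = name
        self.defines = set(defines)    # domains this module creates
        self.uses = set(uses)          # domains this module only references
        self.submodules = list(submodules)


def walk(module):
    """Yield a module and all of its submodules, depth first."""
    yield module
    for sub in module.submodules:
        yield from walk(sub)


def unresolved_domains(root):
    """Return {domain: [module names]} for uses with no definition anywhere."""
    defined = set()
    for module in walk(root):
        defined |= module.defines
    missing = {}
    for module in walk(root):
        for domain in sorted(module.uses - defined):
            missing.setdefault(domain, []).append(module.name)
    return missing


# A synchronizer-like module references "foo" but does not create it.
ff_sync = ToyModule("ff_sync", uses={"foo"})
# A sibling module, not a parent, is the one that defines "foo".
clocks = ToyModule("clocks", defines={"foo"})

ok = unresolved_domains(ToyModule("top", submodules=[ff_sync, clocks]))
broken = unresolved_domains(ToyModule("top", submodules=[ff_sync]))
print(ok, broken)
```

In the first hierarchy the reference resolves even though the definition sits in a sibling; in the second nothing defines `foo`, which corresponds to the error case.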
<awygle> hm ok
<whitequark> it's how domains work in migen
<whitequark> fairly reasonable overall, though lack of local domains in migen hurts it significantly
<awygle> you are in a maze of twisty little _ModuleBuilder*s, all slightly different
<awygle> oh god you tweeted the dumb thing i said and now i have 20 notifications lol
<awygle> (to be clear this is fine it's just amusing)
thinknok has joined #nmigen
chipmuenk has quit [Quit: chipmuenk]
Asuu has quit [Quit: Konversation terminated!]
thinknok has quit [Ping timeout: 246 seconds]
<smkz> i have additional question; if i have a memory i want to introspect into during simulation; i know how to do so if the memory is defined directly in the module i'm feeding to the simulator,
<smkz> but in this case the memory is defined inside a submodule of a submodule of the module i'm simulating
<smkz> how would i thread it through / how would i access it?
<smkz> i've tried a few things like doing "self.memory_introspection_port = my_memory" inside the elaborate() function where it's defined
<smkz> but that doesn't seem to create something that can be accessed by the module that has that module as its submodule ;;