ChanServ changed the topic of #nmigen to: nMigen hardware description language · code at https://github.com/nmigen · logs at https://freenode.irclog.whitequark.org/nmigen
Degi_ has joined #nmigen
Degi has quit [Ping timeout: 256 seconds]
Degi_ is now known as Degi
<Vinalon> it looks like forwarding all of the bus signals manually with If/Else or Mux(...)s works, but I feel like that might not be the 'right' way to do it
Degi has quit [Ping timeout: 265 seconds]
Degi has joined #nmigen
Vinalon has quit [Remote host closed the connection]
Vinalon has joined #nmigen
_whitelogger has joined #nmigen
chipmuenk has joined #nmigen
chipmuenk has quit [Client Quit]
chipmuenk has joined #nmigen
thinknok has joined #nmigen
<whitequark> Vinalon: well, the reason Decoder has that implementation is to conserve resources
<whitequark> if you don't want that for some reason (which I don't understand), then yes, If/Else is the way to do it
Asu has joined #nmigen
<_whitenotifier-3> [nmigen] sjolsen opened pull request #348: back.pysim performance improvements - https://git.io/JvA42
<_whitenotifier-3> [nmigen] whitequark commented on pull request #348: back.pysim performance improvements - https://git.io/JvA4d
<_whitenotifier-3> [nmigen] whitequark edited a comment on pull request #348: back.pysim performance improvements - https://git.io/JvA4d
<_whitenotifier-3> [nmigen] codecov[bot] commented on pull request #348: back.pysim performance improvements - https://git.io/JvA4F
<_whitenotifier-3> [nmigen] codecov[bot] edited a comment on pull request #348: back.pysim performance improvements - https://git.io/JvA4F
<_whitenotifier-3> [nmigen] whitequark reviewed pull request #348 commit - https://git.io/JvA4A
<_whitenotifier-3> [nmigen] sjolsen commented on pull request #348: back.pysim performance improvements - https://git.io/JvABR
____ has joined #nmigen
<_whitenotifier-3> [nmigen] whitequark commented on pull request #348: back.pysim performance improvements - https://git.io/JvABQ
<_whitenotifier-3> [nmigen] whitequark edited a comment on pull request #348: back.pysim performance improvements - https://git.io/JvABQ
hmn has joined #nmigen
hmn is now known as hhmmnn
<_whitenotifier-3> [nmigen] sjolsen synchronize pull request #348: back.pysim performance improvements - https://git.io/JvA42
Vinalon has quit [Ping timeout: 256 seconds]
<_whitenotifier-3> [nmigen] codecov[bot] edited a comment on pull request #348: back.pysim performance improvements - https://git.io/JvA4F
<_whitenotifier-3> [nmigen] sjolsen commented on pull request #348: back.pysim performance improvements - https://git.io/JvARr
<_whitenotifier-3> [nmigen] codecov[bot] edited a comment on pull request #348: back.pysim performance improvements - https://git.io/JvA4F
<_whitenotifier-3> [nmigen/nmigen] whitequark pushed 2 commits to master [+0/-0/±2] https://git.io/JvARy
<_whitenotifier-3> [nmigen/nmigen] sjolsen 2398b79 - back.pysim: Reuse clock simulation commands
<_whitenotifier-3> [nmigen/nmigen] sjolsen 1e74409 - back.pysim: Eliminate duplicate dict lookup in VCD update
<_whitenotifier-3> [nmigen] whitequark commented on pull request #348: back.pysim performance improvements - https://git.io/JvARx
<_whitenotifier-3> [nmigen] Failure. 82.41% (+-0.29%) compared to bb1bbcc - https://codecov.io/gh/nmigen/nmigen/commit/1e744097ab6f7fb37c90e18b30c4aef28fd6be6b
<_whitenotifier-3> [nmigen] Success. 100.00% of diff hit (target 82.69%) - https://codecov.io/gh/nmigen/nmigen/commit/1e744097ab6f7fb37c90e18b30c4aef28fd6be6b
<_whitenotifier-3> [nmigen] Failure. 82.46% (+-0.24%) compared to bb1bbcc - https://codecov.io/gh/nmigen/nmigen/commit/1e744097ab6f7fb37c90e18b30c4aef28fd6be6b
<_whitenotifier-3> [nmigen] Success. 82.74% (+0.04%) compared to bb1bbcc - https://codecov.io/gh/nmigen/nmigen/commit/1e744097ab6f7fb37c90e18b30c4aef28fd6be6b
<_whitenotifier-3> [nmigen] sjolsen commented on pull request #348: back.pysim performance improvements - https://git.io/JvAEh
chipmuenk1 has joined #nmigen
chipmuenk has quit [Ping timeout: 260 seconds]
chipmuenk1 is now known as chipmuenk
hhmmnn has quit [Remote host closed the connection]
<_whitenotifier-3> [nmigen] sjolsen commented on pull request #348: back.pysim performance improvements - https://git.io/JvAwA
<_whitenotifier-3> [nmigen] whitequark commented on pull request #348: back.pysim performance improvements - https://git.io/JvArU
<_whitenotifier-3> [nmigen] whitequark commented on pull request #348: back.pysim performance improvements - https://git.io/JvArq
<_whitenotifier-3> [nmigen] sjolsen synchronize pull request #348: back.pysim performance improvements - https://git.io/JvA42
<_whitenotifier-3> [nmigen] codecov[bot] edited a comment on pull request #348: back.pysim performance improvements - https://git.io/JvA4F
lkcl_ has quit [Ping timeout: 265 seconds]
<_whitenotifier-3> [nmigen] whitequark commented on pull request #348: back.pysim performance improvements - https://git.io/JvArj
<_whitenotifier-3> [nmigen] whitequark commented on pull request #348: back.pysim performance improvements - https://git.io/JvAoU
chipmuenk has quit [Quit: chipmuenk]
lkcl has joined #nmigen
lkcl has quit [Ping timeout: 265 seconds]
lkcl has joined #nmigen
Vinalon has joined #nmigen
<Vinalon> well, I was using a Decoder to multiplex access to RAM (inside the chip) and NVM (outside the chip). The NVM takes a lot longer to access and starts an access when its 'stb' signal is asserted, and the RAM's 'ack' signal causes the bus to assert 'ack' before it finishes fetching data.
<Vinalon> so I need to switch those signals as well. I guess I'll stick with if/else then, thanks
<whitequark> that seems like a logic error elsewhere in the design
<whitequark> absolutely nothing should be happening unless cyc is asserted
<whitequark> that's why the other signals are not multiplexed
<Vinalon> oh...yeah, I've just been setting 'cyc' equal to 'stb' and driving 'stb' to mediate transactions. Thanks, this is what happens when I only skim the timing diagrams
<whitequark> we should have formal tests for that kind of thing
<Vinalon> so it sounds like I should make the subordinate buses not assert anything and ignore inputs if their 'cyc' signal isn't active? That makes sense.
<whitequark> but don't for now
<whitequark> yes
<Vinalon> well, that still wouldn't keep people like me from using the bus signals incorrectly. Thanks for the information!
<ZirconiumX> wq: when I was talking about my chess code you suggested using a resetless domain instead of passing reset_less to Signal; how do I do that?
<ZirconiumX> Presumably it involves DomainRenamer, right?
<whitequark> nope
<whitequark> are you currently not using any domains?
<ZirconiumX> Just the default comb and sync
<whitequark> try m.domains.sync = sync = ClockDomain(reset_less=True)
<whitequark> m.d.comb += sync.clk.eq(platform.request(platform.default_clk).i)
<ZirconiumX> Does that propagate into submodules?
<whitequark> yep
<whitequark> domains are global unless specified otherwise
<ZirconiumX> AttributeError: 'NoneType' object has no attribute 'request'
<ZirconiumX> I don't think this works when you're just using nMigen to dump Verilog output
<whitequark> oh, yeah
<whitequark> then ditch the platform part
<whitequark> it'll do the right thing
<ZirconiumX> Apparently not, because when I replace the sync domain Yosys optimises out my code
<ZirconiumX> as in, it synthesises to zero cells
<whitequark> do you use ports=[...]?
<whitequark> if yes, you need to add sync.clk there
<ZirconiumX> Right, okay.
chipmuenk has joined #nmigen
proteus-guy has quit [Ping timeout: 250 seconds]
proteus-guy has joined #nmigen
Vinalon has quit [Remote host closed the connection]
Vinalon has joined #nmigen
<ZirconiumX> Do you still need to create a new simulator if you want to run multiple tests with an Elaboratable?
<whitequark> nope!
<whitequark> you can reset the existing one
<whitequark> this was one of the features I worked towards with the pysim rewrite
<ZirconiumX> Except reset() doesn't clear processes, so you need a new simulator to add a new process.
<ZirconiumX> Unless I pipeline my tests, anyway.
<whitequark> hm
<whitequark> that seems like a major issue with this API
<whitequark> well
<whitequark> you could easily work around that by adding a level of indirection in your tests
<whitequark> like `yield from self.current_testcase()`
<whitequark> but it does seem like something I did not account for
<ZirconiumX> So I'm guessing the problem is more involved than an API that clears the internal process list?
<whitequark> well, you might want to keep some of those processes, if they're replacing e.g. a PHY with a behavioral model
<Vinalon> I toggle the clock domain's reset signal between individual tests inside of one process function, and it seems to work pretty well.
<whitequark> yup, that also works if you have no reset_less signals
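[editor's note] A minimal sketch of the indirection whitequark describes, using plain Python generators only. It does not touch the nMigen simulator: TestBench, case_a, case_b and run() are hypothetical names, and run() is a stand-in that just steps a generator to completion. The point is that the process function is registered once, while the test body it delegates to is swapped between runs.

```python
class TestBench:
    """Holds one long-lived process that delegates to a swappable test case."""

    def __init__(self):
        self.current_testcase = None  # assigned before each run
        self.log = []

    def process(self):
        # Registered once; the body is looked up when the generator starts.
        yield from self.current_testcase()

    def case_a(self):
        self.log.append("a:start")
        yield  # one simulated step
        self.log.append("a:end")

    def case_b(self):
        self.log.append("b:start")
        yield
        yield  # two simulated steps
        self.log.append("b:end")


def run(process):
    """Stand-in for a simulator run: step the generator, count the steps."""
    steps = 0
    for _ in process():
        steps += 1
    return steps


bench = TestBench()
bench.current_testcase = bench.case_a
steps_a = run(bench.process)
bench.current_testcase = bench.case_b
steps_b = run(bench.process)
```

In a real testbench the swap would presumably happen between a simulator reset and the next run; that part is an assumption and is not shown here.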
<awygle> Ugh fine, I'll learn Rosette, are you happy now?
<whitequark> lol
<awygle> (you keep tweeting Cool Shit)
futarisIRCcloud has joined #nmigen
chipmuenk has quit [Quit: chipmuenk]
<ZirconiumX> Is it possible to stop the simulation on a particular signal changing? (i.e. a done bit)
<ZirconiumX> I'm asking mostly because I have no idea how many cycles something will take
<whitequark> while not (yield sig): yield
<ZirconiumX> That works
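[editor's note] A sketch of how the `while not (yield sig): yield` idiom behaves, written against a toy driver rather than the real simulator. FakeSignal, wait_for() and drive() are hypothetical; the only thing modelled is the convention that yielding a signal reads its value and a bare yield waits one clock cycle.

```python
class FakeSignal:
    """Toy stand-in for a signal; only carries a value."""

    def __init__(self, value=0):
        self.value = value


def wait_for(sig):
    """Wait one cycle at a time until `sig` reads nonzero; return cycles waited."""
    cycles = 0
    while not (yield sig):  # yielding a signal asks the driver for its value
        yield               # a bare yield waits for the next clock cycle
        cycles += 1
    return cycles


def drive(process, on_tick):
    """Toy driver: answer signal reads, call on_tick() for every bare yield."""
    try:
        command = next(process)
        while True:
            if command is None:
                on_tick()
                command = process.send(None)
            else:
                command = process.send(command.value)
    except StopIteration as stop:
        return stop.value


done = FakeSignal()
state = {"ticks": 0}


def on_tick():
    # Pretend the design raises `done` on the fifth clock cycle.
    state["ticks"] += 1
    if state["ticks"] == 5:
        done.value = 1


waited = drive(wait_for(done), on_tick)
```

Because the loop polls once per cycle, the process does not need to know in advance how many cycles the design takes.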
<cr1901_modern> awygle: Yea I'm thinking of joining the Cool Kids and reinstalling Racket as well
<awygle> I have been meaning to try out SMT based code generation on a particular problem
<awygle> Was gonna try this thing that expressed x86 semantics in z3 in python
<cr1901_modern> Python bindings are hit or miss for me
<cr1901_modern> when they work, they're great. But getting them installed (on _Linux_, mind you) can be a pain. I don't remember the details
<cr1901_modern> so for once this isn't a Windoze problem
<cr1901_modern> https://rise4fun.com/Z3 This works in a pinch
<ZirconiumX> Now I have the fun of writing a 1024-bit popcount.
<cr1901_modern> in mnigen?
<cr1901_modern> err, nmigen
<ZirconiumX> Yes
<whitequark> ZirconiumX: literally just `sum(value)`
<ZirconiumX> ...Yeah, but what on earth does that synthesise to?
<whitequark> try it?
<ZirconiumX> RecursionError
<whitequark> yeah, sec
<cr1901_modern> 1024-bit popcount: 512 1-bit adders, 256 2-bit adders, 128 3-bit adders, etc
<whitequark> sys.setrecursionlimit(10240)
____ has quit [Quit: Nettalk6 - www.ntalk.de]
<whitequark> the binary tree of adders might work better tho
<whitequark> not sure
<cr1901_modern> I don't even want to think about optimizing that damn thing tho
<ZirconiumX> I'm expecting the actual number of values to be < 256, though...
<ZirconiumX> 5 seconds just for this :P
<cr1901_modern> Then you write out the 256 values you care about into a table, mark the other 2^1024 - 256 as don't-cares, and do a 1024-bit K-map :)
<whitequark> cursed
<ZirconiumX> Not quite what I meant, but sure
<ZirconiumX> Answer: 2143 SB_LUT4s and 12 SB_CARRYs
<ZirconiumX> Rather I meant that "I'm expecting at most 256 populated bits within the 1024-bit input"
<ZirconiumX> # Ask not what your stack can do for you, ask what you can do for your stack
<daveshah> Does it need to be single-cycle?
<ZirconiumX> I suppose not, but it's going to be used pretty often
<ZirconiumX> At least for testing
<daveshah> I guess an iterative 1024 cycle approach would be no good then
<whitequark> lol
<whitequark> this is one of those things you would use retiming for, right?
<daveshah> Yeah, stick a few registers afterwards and let the tool put them in the best place
<daveshah> Some tools might even be able to infer cr1901_modern's tree structure
<daveshah> (although that is balancing rather than retiming)
<cr1901_modern> I can't fathom that the tree structure is timing friendly if you need single cycle output
<whitequark> well it sure as heck is better than my structure
<whitequark> which is a 1024 bit long chain of increasingly large adders
<whitequark> specifically the output is 1025 bits long because of nmigen integer promotion rules
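[editor's note] A plain-Python model of the two structures being compared, counting adder levels only. It says nothing about what Yosys actually emits; popcount_chain() and popcount_tree() are hypothetical names, and the chain models `sum(value)` starting from 0 so that every input bit passes through its own adder.

```python
def popcount_chain(bits):
    """Linear chain, like sum(): each adder feeds the next one.

    Returns (total, depth), where depth is the number of adders in series.
    """
    total, depth = 0, 0
    for b in bits:
        total += b
        depth += 1
    return total, depth


def popcount_tree(bits):
    """Balanced tree: add neighbours pairwise until one value remains.

    Returns (total, depth), where depth is the number of adder levels.
    """
    level = list(bits)
    if not level:
        return 0, 0
    depth = 0
    while len(level) > 1:
        if len(level) % 2:
            level.append(0)  # pad odd-length levels with a zero operand
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
        depth += 1
    return level[0], depth


# Deterministic 1024-bit test pattern.
bits = [1 if (i * i) % 5 == 1 else 0 for i in range(1024)]
chain_total, chain_depth = popcount_chain(bits)
tree_total, tree_depth = popcount_tree(bits)
```

Both give the same count; the difference is 1024 adders in series for the chain against 10 levels for the tree, with operand width growing by one bit per tree level.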
<ZirconiumX> Eddie's static timing analysis gives a *pure logic* delay of 8ns :P
<ZirconiumX> (ice40HX)
<ZirconiumX> Wonder if setting the intended output width to 8 bits persuades Yosys to chop off some bits
<ZirconiumX> Answer: yes
<daveshah> 8ns seems like there might be some kind of tree going on already
<ZirconiumX> Well, we're deep in the middle of "autoname has no idea what to do" land
<ZirconiumX> 1116 o_SB_DFF_Q_D_SB_LUT4_O_I1_SB_LUT4_O_I3_SB_LUT4_O_I2_SB_LUT4_O_I1_SB_LUT4_O_I3_SB_LUT4_O_I1_SB_LUT4_O_I0_SB_LUT4_O_I3_SB_LUT4_O_I3_SB_LUT4_O_I2_SB_LUT4_O (SB_LUT4.I3->O)
<daveshah> How long is the longest path according to ltp?
<ZirconiumX> 25
<daveshah> Hmm, sounds a lot like a tree structure
<ZirconiumX> Wonder if ABC9 does any better here
<daveshah> As it is mostly pure logic with only a few carries, wouldn't expect a big difference
<ZirconiumX> 6.4ns
<ZirconiumX> So it's notable
<daveshah> So, looks like Yosys packs the whole chain into a $macc cell and then maccmap as part of techmap converts that into a tree
<daveshah> rarely, Yosys is cleverer than expected
<ZirconiumX> ltp with ABC9 is 23
<ZirconiumX> So it packed it slightly better
<ZirconiumX> Let's see how flowmap does!
<ZirconiumX> Better than ABC1 (7.6ns) and same depth as ABC9 (23), and not *that* much worse area-wise
<whitequark> nice
<ZirconiumX> I'm reading through the chess-programming wiki and there's a bit trick to turn an array of 2^N - 1 things to popcount into an array of N things to popcount after some bit manipulation
<ZirconiumX> So if I have 16 64-bit things to popcount (= 1024 bits), that can be turned to 4 64-bit things to popcount (= 256 bits)
<tpw_rules> don't you mean 5?
<ZirconiumX> You can apply the 3->2 trick again
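[editor's note] A sketch of the 3->2 (carry-save adder) reduction, in plain Python on 64-bit words. The names csa(), compress() and popcount() are hypothetical, and the reduction order is one possible choice rather than the one from the wiki. Each CSA step replaces three words of the same weight with one word of that weight and one word of twice the weight, so 15 words collapse into ones/twos/fours/eights planes.

```python
def popcount(x):
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")


def csa(a, b, c):
    """3->2 compressor: per bit position, a + b + c == 2*carry + sum."""
    partial = a ^ b
    return (a & b) | (partial & c), partial ^ c  # (carry, sum)


def compress(words):
    """Reduce words to planes [ones, twos, fours, ...].

    Invariant: sum(popcount(w) for w in words)
            == sum(popcount(p) << i for i, p in enumerate(planes)).
    """
    planes = []
    current = list(words)
    while current:
        carries = []
        while len(current) > 1:
            a = current.pop()
            b = current.pop()
            c = current.pop() if current else 0  # pad a leftover pair with 0
            carry, s = csa(a, b, c)
            current.append(s)       # sum keeps the current weight
            carries.append(carry)   # carry moves to the next weight
        planes.append(current[0])
        current = carries
    return planes


def make_words(n, seed=1):
    """Deterministic 64-bit test words from a simple LCG (illustrative only)."""
    words, x = [], seed
    for _ in range(n):
        x = (x * 6364136223846793005 + 1442695040888963407) % (1 << 64)
        words.append(x)
    return words


words15 = make_words(15)
planes15 = compress(words15)
total15 = sum(popcount(p) << i for i, p in enumerate(planes15))
```

With 15 inputs this leaves 4 words to popcount. With 16 inputs this particular reduction order leaves 5, since the extra word produces one more carry plane.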
Asu has quit [Ping timeout: 256 seconds]
futarisIRCcloud has quit [Quit: Connection closed for inactivity]