#litex on 2020-05-06 — irc logs at freenode.irclog.whitequark.org

2020-02-07 11:13 _florent_ changed the topic of #litex to: LiteX FPGA SoC builder and Cores / Github : https://github.com/enjoy-digital, https://github.com/litex-hub / Logs: https://freenode.irclog.whitequark.org/litex

00:00 tpb has quit [Remote host closed the connection]

00:00 tpb has joined #litex

00:10 <benh> yeah, lfsr is much faster indeed

00:10 <benh> I'll send a patch

00:33 HoloIRCUser1 has joined #litex

00:34 HoloIRCUser2 has joined #litex

00:38 HoloIRCUser has quit [Ping timeout: 256 seconds]

00:38 HoloIRCUser1 has quit [Ping timeout: 272 seconds]

01:05 CarlFK has quit [Ping timeout: 260 seconds]

01:13 futarisIRCcloud has quit [Quit: Connection closed for inactivity]

01:38 CarlFK has joined #litex

02:03 <shuffle2> would it make sense to use dsp blocks for ip checksum on ecp5? clarity has an "adder_tree" which results in ALU54Bs+MULT18X18Ds..i dont see perf/operation really described anywhere tho

02:04 <shuffle2> alternatively i can just hack it up as most of header is static in my case, but Feels Bad :)

02:15 Skip has quit [Remote host closed the connection]

03:43 Degi has quit [Ping timeout: 272 seconds]

03:44 Degi has joined #litex

03:57 CarlFK has quit [Ping timeout: 260 seconds]

04:11 _whitelogger has joined #litex

04:30 FFY00 has quit [Read error: Connection reset by peer]

04:31 FFY00 has joined #litex

04:33 HoloIRCUser has joined #litex

04:34 HoloIRCUser2 has quit [Ping timeout: 265 seconds]

04:41 awordnot has quit [Ping timeout: 240 seconds]

04:41 awordnot has joined #litex

05:35 CarlFK has joined #litex

05:41 HoloIRCUser1 has joined #litex

05:45 HoloIRCUser has quit [Ping timeout: 256 seconds]

05:45 futarisIRCcloud has joined #litex

05:55 <_florent_> scanakci: nice! congrats

05:57 <_florent_> benh: not sure we tried to optimize the memtest speed since it was fast enough with the other CPUs, but it could now indeed make sense to have a closer look

06:35 HoloIRCUser has joined #litex

06:36 HoloIRCUser1 has quit [Ping timeout: 272 seconds]

07:45 <benh> _florent_: oh I just used the lfsr that Anton wrote for microwatt test suite and it's a lot faster now

07:46 <benh> _florent_: VexRiscV Mini takes a couple of seconds to init the Arty now. It was a lot slower before, maybe 5 to 10s ?

07:46 <benh> _florent_: I'll clean it up and send you the patch tonight (hopefully)

07:46 <benh> _florent_: it's as good a rng as the multiplication method when it comes to generating memory test patterns I reckon

08:23 <benh> _florent_: sent

08:24 <_florent_> benh: thanks!

08:29 HoloIRCUser1 has joined #litex

08:31 HoloIRCUser has quit [Ping timeout: 272 seconds]

08:41 <benh> _florent_: for 64-bit CPUs, would it make sense to have a topology with a 64-bit WB going out with 2 slave legs, a 64-bit one to memory and a converted-to-32-bit one for all the IOs ?

08:42 <benh> _florent_: also powerpc64 is really meant to be used on fully cache coherent systems (I could elaborate on the reasons if you want)

08:42 <benh> _florent_: so at some point, we might need to figure out how to implement cache coherent DMA in LiteX :)

08:42 <benh> _florent_: my initial plan was to do something like a snoop-fifo where all addresses go back to the core, and have the core's 'sync' instruction go out as a special signal that waits for that fifo to drain

08:43 <benh> (well or drain anything prior to that signal being asserted)

08:43 <_florent_> benh: we are currently working on similar things with VexRiscv SMP

08:43 <benh> but it's just ideas so far

08:43 <sorear> non-coherence on riscv sounds like a bad time

08:43 <benh> ah SMP would require a real cache coherency protocol

08:43 <benh> yeah non-coherence is a mess on CPUs with speculative loads

08:44 <sorear> especially when there are 0 data cache instructions in your ISA

08:44 <_florent_> VexRiscv SMP has two dedicated instruction/data ports directly connected to the DRAM with a larger data-width (128-bit for now to ease testing)

08:44 <benh> bcs you almost always end up with weird collisions between cachable and non-cachable loads

08:44 <benh> happens on ARM since v7, powerpc, ...

08:44 <benh> sorear: yeah that doesn't help :-)

08:44 <benh> _florent_: some kind of MERSI protocol ?

08:45 <_florent_> we are also going to work on cache coherent DMA, so most of it could probably be reused by Microwatt later

08:45 <benh> ok good

08:45 <benh> thx

08:46 <benh> one thing to avoid btw... I noticed it's somewhat doable currently

08:46 <benh> is have the control reg path to the DMA engine be a separate bus from the data path where DMAs occur

08:46 <_florent_> the implementation still has to be discussed with Charles, who is doing the VexRiscv SMP work

08:46 <benh> it's a recipe for interesting ordering issues

08:47 <benh> for example that happened on some IBM Cell that used a sideband bus to control the ethernet DMA engine

08:47 <benh> you would stop the engine via that (CSR style)

08:47 <benh> but the writes it did to memory might still be in various pipeline/bridge buffers and have not reached memory yet

08:48 <benh> however, SW has already freed the memory and given it up to something else, it then gets corrupted by DMA data

08:48 <benh> ie, unless the control registers are on the same data path as the DMA, or some other mechanism ensures that the full DMA path to memory has been flushed, that problem will potentially exist

08:49 <benh> on things like PCIe it's typically a non-issue despite fairly large bridge induced latencies because the control path is ordered vs the data path, so for example, reading a DMA status reg will have the read response behind all previous DMA writes to memory

08:49 <benh> by the time the CPU gets it, all the previous DMAs have hit coherency

08:49 <benh> when control is via some CSR bus that might not be on the same path as the DMA -> memory path, you lose that property
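
The ordering property benh describes can be illustrated with a toy model. This is only a sketch under loud assumptions: the function name `writes_visible_at_response` and the single FIFO standing in for all bridge and pipeline buffers are hypothetical, and nothing here reflects how LiteX or any real interconnect is implemented. It only shows why a status response queued behind the DMA writes implies those writes have landed, while a sideband response implies nothing.

```python
from collections import deque


def writes_visible_at_response(n_writes, status_in_band):
    """Count DMA writes that reached memory when the CPU gets the status response.

    Toy model: `path` stands for the bridge/pipeline buffers between the
    device and memory, drained strictly in FIFO order.
    """
    path = deque(("write", i) for i in range(n_writes))
    memory = []
    if not status_in_band:
        # Sideband (CSR-style) response: it bypasses `path`, so in the worst
        # case it arrives before any buffered write has drained.
        return len(memory)
    # In-band response: queued behind every earlier DMA write.
    path.append(("status", None))
    while path:
        kind, value = path.popleft()
        if kind == "status":
            break
        memory.append(value)
    return len(memory)


print(writes_visible_at_response(4, status_in_band=True))   # all 4 writes landed
print(writes_visible_at_response(4, status_in_band=False))  # worst case: none landed
```

In the sideband case the software would need some other flush mechanism before freeing the buffer, which is the failure mode described above.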

08:53 <_florent_> thanks interesting, we'll have indeed to be careful on these things

08:54 <benh> the best way is to ensure that the control path from the CPU to a device (MMIO) is ordered in some way with the DMA data path from that device to memory

08:54 <benh> yup, it's bitten folks in the past :)

08:54 <benh> when adapting esp. old school "simple" design to more recent CPUs, adding bridges etc...

08:55 <benh> for example powerpc 4xx embedded has a "DCR" bus (a bit like CSRs but special core instructions)

08:55 <benh> that's a complete sideband

08:55 <benh> it used that to talk to the DMA engine

08:55 <benh> that was ok when there were only simple buses and, generally, not much buffering

08:55 <benh> but that whole architecture was then ported to an IO chip for the Cell processor with 3 layers of bridging and pipelining

08:56 <benh> and hell broke loose

09:02 <benh> the most typical example is probably old school PCI devices (before MSIs)

09:02 <benh> a device writes a packet to memory, then sends an interrupt

09:02 <benh> the interrupt is a wire, so out of band, it often arrives before the DMA data has reached coherency (or a point where it's visible to the CPU)

09:03 <benh> it could be in pipeline buffers on the bus or anything

09:03 <benh> the CPU gets the interrupt and reads an MMIO register from the device, usually some kind of interrupt status

09:03 FFY00 has quit [Ping timeout: 244 seconds]

09:03 FFY00 has joined #litex

09:04 <benh> what a lot of folks didn't realize is that the key purpose of that read is not only to know what happened

09:04 <benh> but to have the response to that read be queued behind all the previous DMAs done by the device so that by the time the CPU gets it

09:04 <benh> it will also "see" all the DMA data

09:04 <benh> now, apologies if I haven't fully understood the LiteX design, but from what I've seen, it *seems* like your DMA engine has its own port to the memory controller

09:04 <benh> and thus is not ordered vs the CSRs of a device that can trigger DMA

09:05 <benh> as long as there isn't much buffering/pipelining it's probably fine

09:05 <benh> but in a world of delays introduced by cache coherency protocols etc... that can quickly fall apart

09:05 <benh> I hope I'm clear :) Otherwise let me know

09:11 FFY00 has quit [Ping timeout: 260 seconds]

09:12 FFY00 has joined #litex

09:19 <shuffle2> LiteEthPHYHWReset is just a delay counter. does this exist for some specific reason?

09:21 <shuffle2> oh, nvm

09:45 <scanakci> thanks _florent_ .

09:53 <scanakci> Instead of having a README under cpu/cores/blackparrot (https://github.com/enjoy-digital/litex/tree/master/litex/soc/cores/cpu/blackparrot), I am planning to have another repo similar to https://github.com/litex-hub/linux-on-litex-vexriscv

09:53 <tpb> Title: litex/litex/soc/cores/cpu/blackparrot at master · enjoy-digital/litex · GitHub (at github.com)

09:53 <scanakci> Does that work for you @_florent_?

09:54 <dkozel> xobs: The fixes to the wishbone-tool work, thanks

09:54 <_florent_> scanakci: yes sure, do you still plan to have a version integrated in LiteX?

09:55 <xobs> dkozel: Great! I'll tag wishbone-tool 0.6.17 then

09:55 <scanakci> I think it is better to remove it since none of the other cores have one.

09:56 <scanakci> However, I do not mind having a short README in LiteX repo as well.

10:00 <dkozel> xobs: I have a feature suggestion just before you do that

10:00 <dkozel> I'm almost done typing it up into the issue report

10:01 <xobs> dkozel: Alright. I added a patch for v0.6.17, but haven't pushed the tag yet.

10:01 <dkozel> https://github.com/litex-hub/wishbone-utils/issues/25#issuecomment-624555355

10:01 <tpb> Title: Add Litepcie bridge support to wishbone-tool · Issue #25 · litex-hub/wishbone-utils · GitHub (at github.com)

10:02 <dkozel> I almost implemented a PR for it, but came up a bit short on idiomatically filtering the HashMap

10:50 <_florent_> dkozel: i received the Acorn CLE 215+ this morning, it's working fine :) : https://twitter.com/enjoy_digital/status/1257985111469015040

10:50 <dkozel> Awesome!

10:51 <dkozel> Mine should arrive tomorrow but I don't have any way of using it until I find an interface that will work outside of the computer

10:51 <dkozel> the heatsink/fan is too large for my desktop's m.2 slot and I don't have room for another full size PCIe card.

10:52 <_florent_> you also need a specific cable for the JTAG (i soldered it on mine, but will try to find/order a cable)

10:53 <dkozel> Thus my interest in the USB 3 PIPE interface or an m.2 thunderbolt enclosure. I've been very confused about the seeming specificity of NVME support in the adapters

10:53 <dkozel> https://www.digikey.com/product-detail/en/molex/0369200601/WM26622-ND/10233018

10:53 <tpb> Title: 0369200601 Molex | Cable Assemblies | DigiKey (at www.digikey.com)

10:53 <dkozel> JTAG

10:53 <dkozel> Still needs some soldering, but has the connector at least

10:54 <_florent_> dkozel: yes that's better than doing the soldering directly on the connector as i did :)

10:54 <_florent_> dkozel: thanks for the reference/link

10:55 <dkozel> No problem :) Thanks for pointing me towards the board.

10:55 <_florent_> dkozel: the Aller does not have a heatsink on the FPGA?

10:59 <_florent_> dkozel: the FPGA on the Aller is exactly the same as the one on the Acorn, so a similar heatsink could be used

11:00 <daveshah> I guess the thermals are designed for mining rather than general use

11:03 <dkozel> https://twitter.com/derekkozel/status/1257815420024893441

11:03 <dkozel> I'm having issues with the Aller's heat and heatsinking

11:03 <dkozel> it's idling at 84.9 °C right now with the PCIe etc initialized but idle.

11:07 <_florent_> dkozel: indeed, it's a bit hot, it's probably better to keep the "mining" heatsink then

11:16 <shuffle2> it turns out you dont need mdio to see rgmii phy status: https://gist.github.com/shuffle2/e53d3b24e12bfd2fa418068cd975d8ce is this an OK way to add it (would this be accepted as PR)?

11:16 <tpb> Title: rgmii_phy_status.diff · GitHub (at gist.github.com)

11:18 <shuffle2> (i wanted to keep some other modules in reset if link is down)

11:44 <shuffle2> is there a way to chain reset of some clock domains from another? when i click reset button i'd like eth_tx/eth_rx to reset as well as sys (it's currently tied to sys via AsyncResetSynchronizer)

11:52 <shuffle2> eh, using ResetSignal('sys') is good enough i guess

12:12 gregdavill has quit [Ping timeout: 240 seconds]

13:07 FFY00 has quit [Remote host closed the connection]

13:09 FFY00 has joined #litex

13:15 <_florent_> shuffle2: yes sure it would be possible to integrate it, this could be an optional module

13:24 <dkozel> xobs: Works, perfect.

13:25 <xobs> dkozel: Released! (Or at least pushed the tags. Give it five minutes or so.)

13:25 <dkozel> That last commit was a real reshuffle. Thanks for all the time that must have taken.

13:26 <xobs> It seemed the best way to do it.

13:26 <xobs> As a result, offsets are computed for everything, including uart and gdb fields.

13:28 <dkozel> gdb I haven't tried yet, but terminal worked.

13:31 <zyp> xobs, is «load» in gdb supposed to work?

13:33 <zyp> it doesn't seem to be working properly for me, but I haven't looked into why yet, I guess it might not be fully implemented so I figured I'd ask before spending time to only discover that

13:33 <xobs> zyp: nope. it's unclear how that would work.

13:33 <zyp> how so?

13:34 <xobs> Well, if your program is XIP, then it would need to know how to program flash. And I gather most programs are XIP.

13:34 <xobs> I guess if it's entirely in RAM that would work.

13:34 <xobs> But no, it hasn't been implemented yet.

13:36 <zyp> ah, yes, flash would require knowledge of the specific flash, I was thinking ram

13:37 <zyp> I didn't know litex supported XIP flash, does it?

13:37 <xobs> It does!

13:37 <xobs> I haven't looked into the new litespi.

13:37 <xobs> But the original spi core is at https://github.com/enjoy-digital/litex/blob/master/litex/soc/cores/spi.py

13:37 <tpb> Title: litex/spi.py at master · enjoy-digital/litex · GitHub (at github.com)

13:38 <xobs> Fomu calls it `lxspi`, and the register set is at https://rm.fomu.im/lxspi.html but it's also memory-mapped

13:38 <tpb> Title: LXSPI Fomu Bootloader documentation (at rm.fomu.im)

13:38 <zyp> I'll have to try that at some point

13:40 <xobs> litespi is at https://github.com/litex-hub/litespi/ and others here may be better able to comment on it.

13:40 <tpb> Title: GitHub - litex-hub/litespi: Small footprint and configurable SPI core (at github.com)

13:42 <zyp> I've written flash programming code for a couple of various microcontrollers for black magic probe some years ago, so maybe I'll have a go at adding that at some point then

13:44 <zyp> shouldn't be too hard to find the flash from the csr.csv

13:44 <xobs> Yep, you probably can figure out what it is by comparing names.

13:45 <xobs> For example, if there are three addresses next to each other called `???_bitbang`, `???_miso`, and `???_bitbang_en`, then that's probably a litex SPI block.

13:46 <xobs> What to do when you encounter two blocks that match that is left as an exercise to the implementer :P

13:46 <zyp> map both, obviously

13:46 <xobs> Oh, good point. Obviously.
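
The name-matching heuristic xobs describes could be sketched roughly as follows. Everything here is an assumption made for illustration: the helper `find_spi_blocks`, the sample register names and addresses, and the `csr_register,name,address,size,mode` row layout used in the sample. This is not wishbone-tool's code. Following zyp's remark, it returns every block that matches rather than only the first.

```python
import csv
import io

# Register-name suffixes that, seen together, suggest a LiteX bitbang SPI block.
SUFFIXES = ("_bitbang", "_miso", "_bitbang_en")

# Hypothetical csr.csv excerpt; names and addresses are made up for the example.
SAMPLE = """\
csr_register,spiflash_bitbang,0x82002800,1,rw
csr_register,spiflash_miso,0x82002804,1,ro
csr_register,spiflash_bitbang_en,0x82002808,1,rw
csr_register,uart_rxtx,0x82001000,1,rw
csr_register,spi2_bitbang,0x82003000,1,rw
csr_register,spi2_miso,0x82003004,1,ro
csr_register,spi2_bitbang_en,0x82003008,1,rw
csr_register,lonely_miso,0x82004000,1,ro
"""


def find_spi_blocks(csv_text):
    """Return {prefix: {suffix: address}} for prefixes that have all three registers."""
    seen = {}
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 3 or row[0] != "csr_register":
            continue  # skip comments, blank lines and non-register rows
        name, addr = row[1], int(row[2], 16)
        for suffix in SUFFIXES:
            if name.endswith(suffix):
                seen.setdefault(name[: -len(suffix)], {})[suffix] = addr
    # Keep only prefixes where every expected register was found.
    return {p: regs for p, regs in seen.items() if len(regs) == len(SUFFIXES)}


blocks = find_spi_blocks(SAMPLE)
print(sorted(blocks))
```

A «type» field in csr.csv, as zyp suggests, would make this kind of guessing by name unnecessary.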

13:47 FFY00 has quit [Remote host closed the connection]

13:47 <zyp> but yeah, one thing I miss in the csr.csv is a «type» field

13:50 FFY00 has joined #litex

13:50 <rvense> been trying again to get my ice40 hx8k evb working with litex, with the stub firmware and an lm32 processor, i get nothing on the serial port, but i do see a slow binary count on the LEDs, does anyone know where that might come from?

13:51 <xobs> rvense: address lines somehow got wired to the LEDs, and it's falling along a nop sled?

13:52 <rvense> does it even expose its address bus by default?

13:54 <xobs> No, but apparently 0xffffffff is `cmpne ba, ba, ba`, which could effectively be a nop.

13:54 <xobs> You could take a look at `top.v` and trace backwards what you have mapped to the io pins.

13:54 <rvense> good point

13:55 <xobs> While consulting `top.pcf` to make sure they're assigned to what you think they are.

13:55 <rvense> yeah, i'll look at that later, thanks

14:00 CarlFK has quit [Ping timeout: 260 seconds]

14:01 Skip has joined #litex

14:15 CarlFK has joined #litex

15:47 <tmbinc> are there tools that parse VCD/FST and extract for example wishbone transfers, vexriscv PC traces and things like that? I'm running a (bit hacked up) lxsim SoC with an existing RISCV binary (for which I want to build an emulation environment)

15:48 <tmbinc> i can manually inspect the --trace and usually figure out why it's crashing (mostly unimplemented peripherals etc.), but is there a more automated way?

16:42 <disasm[m]> Tockilator probably

17:04 <dkozel> tmbinc: Sigrok and GTKWave

17:06 <dkozel> https://github.com/im-tomu/valentyusb/tree/master/sim

17:06 <tpb> Title: valentyusb/sim at master · im-tomu/valentyusb · GitHub (at github.com)

17:07 <dkozel> Looking in here there are at least pieces of the tooling needed

17:07 HoloIRCUser has joined #litex

17:10 HoloIRCUser1 has quit [Ping timeout: 246 seconds]

17:24 darren099 has joined #litex

19:00 robert2 has joined #litex

19:00 robert2 is now known as rw1nkler

20:03 Skip has quit [Remote host closed the connection]

21:55 <benh> _florent_: I think you broke sdram inits when extending the use of lfsr to other functions

21:56 <benh> -prv = 1664525*prv + 1013904223;

21:56 <benh> +return lfsr(32, seed);

21:56 <benh> s/seed/prv

21:57 <benh> also in general, it might be worthwhile to document that seed must be non-0, something maybe to add to the comments near lfsr definition
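
As a rough illustration of the two pattern generators being discussed, here is a small Python model. The LCG constants come from the removed line quoted above. The Galois LFSR step and the tap mask `0x80200003` are assumptions chosen for the example, not necessarily what the LiteX `lfsr()` function does, and the names `lcg_step`, `lfsr_step` and `pattern` are hypothetical.

```python
MASK32 = 0xFFFFFFFF
TAPS32 = 0x80200003  # assumed 32-bit tap mask, for illustration only


def lcg_step(prv):
    """Multiplication-based generator from the removed line, truncated to 32 bits."""
    return (1664525 * prv + 1013904223) & MASK32


def lfsr_step(prv, taps=TAPS32):
    """One Galois LFSR step: shift right, XOR in the taps if a 1 fell out."""
    lsb = prv & 1
    prv >>= 1
    if lsb:
        prv ^= taps
    return prv & MASK32


def pattern(seed, count):
    """Generate `count` words, feeding each output back in as the next state."""
    out, prv = [], seed
    for _ in range(count):
        prv = lfsr_step(prv)
        out.append(prv)
    return out


print([hex(w) for w in pattern(1, 4)])
print(lfsr_step(0))  # a zero state never leaves zero
```

Two things follow from the model: a zero state stays at zero, which is why the seed must be non-zero, and calling the step with a fixed seed instead of the previous value returns the same word every time, which is consistent with the `s/seed/prv` fix.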

22:09 <benh> fix sent

22:27 Skip has joined #litex

22:50 Thierryscv has joined #litex

22:51 Thierryscv has quit [Remote host closed the connection]

23:05 Skip has quit [Remote host closed the connection]

23:13 rw1nkler has quit [Remote host closed the connection]

23:34 gregdavill has joined #litex