#litex on 2019-09-13 — irc logs at freenode.irclog.whitequark.org

00:00 tpb has quit [Remote host closed the connection]

00:00 tpb has joined #litex

00:10 freemint has joined #litex

00:10 freemint has quit [Client Quit]

00:35 CarlFK has joined #litex

00:45 <xobs> _florent_: Another issue with doing it that way is reserved names -- if we do it that way then we can't have DCSRFields named `size`, `name`, `value`, etc. Which seems like an implementation detail leaking out.
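
The reserved-name concern can be sketched in a few lines of plain Python. This is only an illustration under assumptions: `FlatRegister`, `NamespacedRegister`, and `try_flat` are hypothetical names, not classes from dreg or LiteX. It shows why attaching fields directly to the register object makes names like `size` unusable, and how a separate namespace avoids that.

```python
from types import SimpleNamespace


class FlatRegister:
    """Hypothetical register that attaches each field directly to itself."""

    def __init__(self, name, size, fields):
        self.name = name
        self.size = size
        for field_name, width in fields.items():
            # A field called "name" or "size" would overwrite the register's
            # own attribute, so it has to be rejected as reserved.
            if hasattr(self, field_name):
                raise ValueError(f"field name {field_name!r} is reserved")
            setattr(self, field_name, width)


class NamespacedRegister:
    """Hypothetical variant: fields live in their own namespace."""

    def __init__(self, name, size, fields):
        self.name = name
        self.size = size
        self.fields = SimpleNamespace(**fields)


def try_flat(fields):
    """Return the error message if FlatRegister rejects the fields, else None."""
    try:
        FlatRegister("ctrl", 8, fields)
    except ValueError as err:
        return str(err)
    return None


collision = try_flat({"enable": 1, "size": 3})
reg = NamespacedRegister("ctrl", 8, {"enable": 1, "size": 3})
print(collision)                  # the flat layout refuses a field named "size"
print(reg.size, reg.fields.size)  # register size and field width coexist
```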

02:02 _whitelogger has joined #litex

06:12 CarlFK has quit [Quit: Leaving.]

06:49 rohitksingh has quit [Ping timeout: 245 seconds]

08:07 CarlFK has joined #litex

14:25 <_florent_> xobs: i've just experimented a bit with csr, what do you think about that: https://github.com/enjoy-digital/dreg/blob/master/ecsr.py

14:25 <tpb> Title: dreg/ecsr.py at master · enjoy-digital/dreg · GitHub (at github.com)

14:25 <_florent_> https://github.com/enjoy-digital/dreg/blob/master/test.py

14:25 <tpb> Title: dreg/test.py at master · enjoy-digital/dreg · GitHub (at github.com)

14:26 <_florent_> this adds field and description support to CSRs; we could also just modify CSRStorage/CSRStatus to support fields natively

14:27 <_florent_> some of your use cases are still not covered: pulse, hidden, and how to do write_from_dev with fields, but I just wanted to know what you think

14:31 <xobs> It does handle gaps, which is nice.

14:34 <xobs> It also places everything under "fields", instead of separating it into "w" and "r". That's nice to know you can use "setattr" like that in Python -- that seems to contradict a Stack Overflow answer that I found.

14:35 <xobs> One of the nice things that I added to Fields are `values`, which can be directly mapped to SVF files: https://github.com/xobs/dreg/blob/master/dcsr.py#L290

14:35 <tpb> Title: dreg/dcsr.py at master · xobs/dreg · GitHub (at github.com)

14:39 <xobs> Also, I had "readable" and "writeable" as Field properties so that they could be mapped to `reg` diagrams to indicate fields that were write-only. I suppose the SVD specification has field values like "w1c", "w1s", and things like that.

14:39 <xobs> http://www.keil.com/pack/doc/cmsis/svd/html/elem_registers.html#elem_enumeratedValues

15:27 <_florent_> for the field property, you can pass it as **kwargs, but we could also create a class that inherits from ECSRField with more parameters?

15:55 <xobs> That's what I was thinking, but for the actual field, at least in SVD, it's only (possible values, description) or (possible values, enum name, description).

16:15 <xobs> florent: I thought about making a subclass, but I think the tuple approach covers all cases.

16:16 <_florent_> for the values of fields?

16:16 <_florent_> of/or

17:36 rohitksingh has joined #litex

17:56 <somlo> _florent_: so I ran some benchmarks on litex+rocket (hardware FPU, 60MHz, on a nexys4ddr)

17:56 freemint has joined #litex

17:57 <somlo> _florent_: coremark came back with "71" (lowRISC at 50MHz gets 103, and a pentium 100MHz gets 212, per google)

17:57 <somlo> so I think it's time for me to try and sort out the rocket axi memory interface -- unlike the other supported CPUs in LiteX, Rocket has two dedicated AXI interfaces, one for mmio and the other for cached RAM accesses

17:58 <somlo> and I'd like to figure out a way to have LiteDRAM's CSRs stay behind on the CSR bus with all other peripherals, but expose a separate AXI interface that I'd connect to Rocket's dedicated cached-ram AXI, so that there's no contention

17:59 <somlo> and hopefully at 64-bit width, or more :)

17:59 <somlo> I guess I'll be staring at LiteX migen source code again, but any advice before I dive in would be much appreciated!

18:03 <somlo> funny side note, I actually scored an old original pentium-100 machine someone was about to recycle, and it still powers on. I'm waiting for a usb-to-ps2 keyboard dongle before I can run my own actual benchmark code on it, so I won't have to believe some historical numbers off the wild Internet :)

18:15 <daveshah> fwiw, VexRiscv claims 2.27 CoreMark/MHz

18:15 <daveshah> But I'm not sure what it gets in a LiteX SoC
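
The scores quoted earlier come from parts running at different clock rates, so dividing each score by its clock makes them comparable with a per-MHz claim. A minimal sketch of that arithmetic, using only the figures somlo quoted (the dictionary layout and the name `per_mhz` are just for illustration):

```python
# (CoreMark score, clock in MHz) as quoted in the channel. The lowRISC and
# Pentium figures are the ones somlo found online, not fresh measurements.
quoted = {
    "LiteX+Rocket": (71, 60),
    "lowRISC": (103, 50),
    "Pentium": (212, 100),
}

# Normalize each score by its clock frequency.
per_mhz = {name: round(score / mhz, 2) for name, (score, mhz) in quoted.items()}

for name, value in per_mhz.items():
    print(f"{name}: {value} CoreMark/MHz")
```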

18:18 <somlo> daveshah: I know right now rocket has to split each 64bit access in two, since the native LiteX bus is 32bit

18:19 <daveshah> Aaaa

18:19 <daveshah> that's no fun on anything with wideish DDR3

18:19 <somlo> that's why I included lowRISC, which uses the mig7series controller from xilinx, at native 64bit width

18:19 <daveshah> Even litedram is pretty wide

18:19 <somlo> not sure how much better *that* could get, but it's a good target for optimization once I fix the bottleneck

18:20 <daveshah> on the TrellisBoard the DDR3 side width is 32*4 (x4 gearing ratio)

18:20 <somlo> the first priority was to get it working *at all* :) Now that that's done, it's time to actually handle all the technical debt we accumulated along the way...

18:21 <somlo> mmio and cached-ram axi masters sharing the same bus, and down-shifting from 64bit to 32bit word size on top of that

18:22 <somlo> I can probably leave the mmio side downshifting to 32 bits for now, if I can only figure out a way to separate the cached-ram axi interface and have it talk to LiteDRAM on a dedicated link

18:22 <daveshah> Something else I'm vaguely curious about is running the RAM faster than the CPU (litedram should easily manage 100MHz), but the added clock domain crossing latency might not be worth it

18:23 <somlo> daveshah: that, and whether I could widen Rocket's cached-RAM axi port beyond 64 bits

18:24 <somlo> since rocket has an internal L1, it will read/write a whole cache line in an AXI burst
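
To get a feel for why the bus width matters for cache-line bursts, here is a rough sketch that counts the transfers needed per line at the widths mentioned in the discussion. The 64-byte line size is an assumption (check the actual Rocket configuration), and `beats_per_line` is a hypothetical helper, not anything from LiteX or Rocket.

```python
CACHE_LINE_BYTES = 64  # assumption: verify against the actual Rocket config


def beats_per_line(bus_width_bits, line_bytes=CACHE_LINE_BYTES):
    """Number of bus transfers needed to move one cache line."""
    bus_bytes = bus_width_bits // 8
    # Ceiling division, in case the line is not a multiple of the bus width.
    return -(-line_bytes // bus_bytes)


# 32-bit native LiteX bus, 64-bit Rocket AXI port, 128-bit DRAM-side width.
beats = {width: beats_per_line(width) for width in (32, 64, 128)}

for width, count in beats.items():
    print(f"{width}-bit bus: {count} transfers per {CACHE_LINE_BYTES}-byte line")
```

This counts data transfers only; it ignores address phases, arbitration, and DRAM latency, so it is a lower bound on the cost rather than a performance model.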

18:24 <daveshah> In the long term, one would probably want some spare memory bandwidth for video output framebuffer accesses

18:24 <daveshah> That seems like it could match the DDR3 nicely

18:25 <somlo> yeah, lots of room to make things better -- my goal is to approach parity with the 100MHz Pentium from the mid-90s, of which I have fond memories :)

18:31 <somlo> daveshah: btw, fpu-less rocket at 60MHz, nexys4ddr (vivado) coremark 27, versa (yosys/trellis/nextpnr) coremark 33 :)

18:31 <daveshah> Huh

18:32 <somlo> can't fit the fpu on the versa, but wanted to see if all else was equal, would there be a difference between artix and ecp5

18:32 <daveshah> I thought litedram was faster on the artix7

18:32 <somlo> nexys4ddr has ddr2 memory, versa is ddr3, not sure if that explains it

18:32 <daveshah> possibly

18:33 <somlo> also not sure how much variability there is between multiple coremark runs on the same machine

18:33 <somlo> haven't done e.g. average over 10 runs, or anything like that

18:33 <somlo> so not quite 100% sure why -- maybe because yosys/trellis/nextpnr are awesome like that :)

18:34 <daveshah> Yeah, our magic pipeline rewriting algorithm :p

18:35 freemint has quit [Remote host closed the connection]

18:35 freemint has joined #litex

18:36 <somlo> anyhow, I'll have to give in and run coremark (and my other benchmarks too) multiple times, and figure out the standard deviation, and all that scientific stuff before publishing "official" numbers

18:37 <somlo> which is also why I'm not happy to just google the Pentium-100 numbers, and will run them for myself
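
Averaging repeated runs and reporting the spread takes only the standard library. A minimal sketch, assuming the scores from each run have already been collected into a list; `summarize` is a hypothetical helper and the values below are placeholders, not real CoreMark results.

```python
import statistics


def summarize(scores):
    """Return (mean, sample standard deviation) for a list of benchmark scores."""
    if len(scores) < 2:
        raise ValueError("need at least two runs to estimate the spread")
    return statistics.mean(scores), statistics.stdev(scores)


# Placeholder values for illustration only, NOT measured CoreMark scores.
placeholder_runs = [10.0, 12.0, 14.0]
mean, spread = summarize(placeholder_runs)
print(f"mean={mean}, stdev={spread}")
```

`statistics.stdev` computes the sample standard deviation (dividing by n - 1), which is the usual choice when the runs are a sample of the machine's behaviour rather than the whole population.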

19:01 rohitksingh has quit [Ping timeout: 245 seconds]

19:36 freemint has quit [Ping timeout: 245 seconds]

19:40 rohitksingh has joined #litex

19:58 <_florent_> somlo: yes, now that it's working, we need to identify and remove the bottlenecks. The fact that we downconvert from 64 bit to 32 bit and then upconvert to the native DRAM width is clearly a bottleneck; i'll look at what can be done there

20:02 <John_K> getting the following error when trying to use LiteEthPHYGMIIMII, is this a common / known issue?

20:03 <John_K> "ERROR:Place:1108 - A clock IOB / BUFGMUX clock component pair have been found that are not placed at an optimal clock IOB / BUFGMUX site pair."

20:03 <John_K> "The clock IOB component <eth_clocks_rx> is placed at site <AB11>. The corresponding BUFG component <eth_rx_clk_BUFG> is placed at site <BUFGMUX_X3Y7>."

20:13 <_florent_> how many bufg are used in your design, how many available?

20:17 <John_K> Number of BUFG/BUFGCTRLs: 4 out of 16 25%

20:24 <_florent_> you can probably get rid of the error with platform.add_platform_command("""PIN "eth_clocks_rx" CLOCK_DEDICATED_ROUTE = FALSE;"""), but not sure it will be functional

20:35 <John_K> yeah I saw that in the error message and had the same thoughts

20:35 <John_K> is the Spartan 6 family really this bad or am I just running into regular-ish pains?

20:41 john_k[m] has quit [Remote host closed the connection]

20:42 nrossi has quit [Remote host closed the connection]

20:42 xobs has quit [Read error: Connection reset by peer]

20:42 freemint has joined #litex

20:49 xobs has joined #litex

21:10 nrossi has joined #litex

21:10 john_k[m] has joined #litex

21:43 rohitksingh has quit [Ping timeout: 265 seconds]

21:47 rohitksingh has joined #litex

21:51 <davidc__> John_K: the S6 should be plenty fast to speak MII/GMII/RGMII; I've done it with S3Es before

21:52 <davidc__> however you need to take care in terms of clock resource allocation; sounds like there might be some contention in your design

22:00 <daveshah> Maybe the Pano Logic designers messed up the pinout :p

22:00 <daveshah> We've all been there

23:17 <davidc__> there's also a few weird modes some phys support that do strange things with clocking

23:18 <davidc__> (ex, some phys support the master generating all the clocks and have an internal buffer to deal with the frequency delta)

23:19 <davidc__> Others support skewing clock->data for transmit, and the receive sample point

23:31 rohitksingh has quit [Ping timeout: 268 seconds]

23:42 rohitksingh has joined #litex