tpb has quit [Remote host closed the connection]
tpb has joined #litex
freemint has joined #litex
freemint has quit [Client Quit]
CarlFK has joined #litex
<xobs> _florent_: Another issue with doing it that way is reserved names -- if we do it that way then we can't have DCSRFields named `size`, `name`, `value`, etc. Which seems like an implementation detail leaking out.
_whitelogger has joined #litex
CarlFK has quit [Quit: Leaving.]
rohitksingh has quit [Ping timeout: 245 seconds]
CarlFK has joined #litex
<_florent_> xobs: i've just experimented a bit with csr, what do you think about that: https://github.com/enjoy-digital/dreg/blob/master/ecsr.py
<tpb> Title: dreg/ecsr.py at master · enjoy-digital/dreg · GitHub (at github.com)
<tpb> Title: dreg/test.py at master · enjoy-digital/dreg · GitHub (at github.com)
<_florent_> this adds field and description support to CSRs; we could also just modify CSRStorage/CSRStatus to support fields natively
<_florent_> some of your use cases are still not covered: pulse, hidden, and how to do write_from_dev with fields, but I just wanted to know what you think
<xobs> It does handle gaps, which is nice.
<xobs> It also places everything under "fields", instead of separating it into "w" and "r". It's nice to know you can use "setattr" like that in Python -- that seems to contradict a Stack Overflow answer I found.
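The setattr pattern xobs mentions can be shown with a minimal sketch -- class and field names here are illustrative, not taken from either repo:

```python
# Sketch: grouping CSR fields under a single "fields" attribute via setattr,
# so users write csr.fields.enable rather than indexing a dict.
class Fields:
    pass

class CSR:
    def __init__(self, field_names):
        self.fields = Fields()
        for i, name in enumerate(field_names):
            # Dynamically attach each field as an attribute of .fields.
            setattr(self.fields, name, i)

csr = CSR(["enable", "reset", "irq"])
print(csr.fields.enable)  # -> 0
print(csr.fields.irq)     # -> 2
```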
<xobs> One of the nice things that I added to Fields are `values`, which can be directly mapped to SVF files: https://github.com/xobs/dreg/blob/master/dcsr.py#L290
<tpb> Title: dreg/dcsr.py at master · xobs/dreg · GitHub (at github.com)
<xobs> Also, I had "readable" and "writeable" as Field properties so that they could be mapped to `reg` diagrams to indicate fields that were write-only. I suppose the SVD specification has field values like "w1c", "w1s", and things like that.
<_florent_> for the field property, you can pass it as **kwargs, but we could also create a class that inherits from ECSRField with more parameters?
<xobs> That's what I was thinking, but for the actual field, at least in SVD, it's only (possible values, description) or (possible values, enum name, description).
<xobs> florent: I thought about making a subclass, but I think the tuple approach covers all cases.
<_florent_> for the values of fields?
<_florent_> of/or
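The tuple shapes xobs describes -- (value, description) or (value, enum name, description), following the SVD enumeratedValue structure -- could be normalized with a small helper. This is a hypothetical sketch, not code from either repo:

```python
# Hypothetical normalization of SVD-style field values given as tuples:
# either (value, description) or (value, name, description).
def normalize_value(t):
    if len(t) == 2:
        value, description = t
        name = None
    elif len(t) == 3:
        value, name, description = t
    else:
        raise ValueError("expected (value, description) or (value, name, description)")
    return {"value": value, "name": name, "description": description}

print(normalize_value((0, "disabled")))
print(normalize_value((1, "EN", "enabled")))
```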
rohitksingh has joined #litex
<somlo> _florent_: so I ran some benchmarks on litex+rocket (hardware FPU, 60MHz, on a nexys4ddr)
freemint has joined #litex
<somlo> _florent_: coremark came back with "71" (lowRISC at 50MHz gets 103, and a pentium 100MHz gets 212, per google)
<somlo> so I think it's time for me to try and sort out the rocket axi memory interface -- unlike the other supported CPUs in LiteX, Rocket has two dedicated AXI interfaces, one for mmio and the other for cached RAM accesses
<somlo> and I'd like to figure out a way to have LiteDRAM's CSRs stay behind on the CSR bus with all other peripherals, but expose a separate AXI interface that I'd connect to Rocket's dedicated cached-ram AXI, so that there's no contention
<somlo> and hopefully at 64-bit width, or more :)
<somlo> I guess I'll be staring at LiteX migen source code again, but any advice before I dive in would be much appreciated!
<somlo> funny side note, I actually scored an old original pentium-100 machine someone was about to recycle, and it still powers on. I'm waiting for a usb-to-ps2 keyboard dongle before I can run my own actual benchmark code on it, so I won't have to believe some historical numbers off the wild Internet :)
<daveshah> fwiw, VexRiscv claims 2.27 Coremark/Mhz
<daveshah> But I'm not sure what it gets in a LiteX SoC
<somlo> daveshah: I know right now rocket has to split each 64bit access in two, since the native LiteX bus is 32bit
<daveshah> Aaaa
<daveshah> that's no fun on anything with wideish DDR3
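The 64-to-32-bit split somlo describes can be illustrated with plain address arithmetic -- a sketch of the general idea, not the actual LiteX bus converter (little-endian word order assumed):

```python
# Sketch: one 64-bit access at byte address `addr` becomes two 32-bit
# accesses on a 32-bit bus. Every 64-bit beat costs two transactions --
# the bottleneck under discussion.
def split_access_64_to_32(addr, data64):
    assert addr % 8 == 0, "64-bit access must be 8-byte aligned"
    lo = data64 & 0xFFFFFFFF
    hi = (data64 >> 32) & 0xFFFFFFFF
    return [(addr, lo), (addr + 4, hi)]

print(split_access_64_to_32(0x8000_0000, 0x1122_3344_5566_7788))
# -> [(2147483648, 1432778632), (2147483652, 287454020)]
```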
<somlo> that's why I included lowRISC, which uses the mig7series controller from xilinx, at native 64bit width
<daveshah> Even litedram is pretty wide
<somlo> not sure how much better *that* could get, but it's a good target for optimization once I fix the bottleneck
<daveshah> on the TrellisBoard the DDR3 side width is 32*4 (x4 gearing ratio)
<somlo> the first priority was to get it working *at all* :) Now that that's done, it's time to actually handle all the technical debt we accumulated along the way...
<somlo> mmio and cached-ram axi masters sharing the same bus, and down-shifting from 64bit to 32bit word size on top of that
<somlo> I can probably leave the mmio side downshifting to 32 bits for now, if I can only figure out a way to separate the cached-ram axi interface and have it talk to LiteDRAM on a dedicated link
<daveshah> Something else I'm vaguely curious about is running the RAM faster than the CPU (litedram should easily manage 100MHz), but the added clock domain crossing latency might not be worth it
<somlo> daveshah: that, and whether I could widen Rocket's cached-RAM axi port beyond 64 bits
<somlo> since rocket has an internal L1, it will read/write a whole cache line in an AXI burst
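As a worked example of the burst somlo describes, assuming a 64-byte L1 cache line and the 64-bit AXI data width discussed above (the line size is an assumption, not stated in the log):

```python
# Beats needed to refill one cache line over AXI.
line_bytes    = 64   # assumed L1 cache line size
axi_data_bits = 64   # Rocket's cached-RAM AXI width discussed above

beats = line_bytes // (axi_data_bits // 8)
print(beats)  # -> 8 beats per burst (AXI AxLEN field would be beats - 1 = 7)
```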
<daveshah> In the long term, one would probably want some spare memory bandwidth for video output framebuffer accesses
<daveshah> That seems like it could match the DDR3 nicely
<somlo> yeah, lots of room to make things better -- my goal is to approach parity with the 100MHz Pentium from the mid-90s, of which I have fond memories :)
<somlo> daveshah: btw, fpu-less rocket at 60MHz, nexys4ddr (vivado) coremark 27, versa (yosys/trellis/nextpnr) coremark 33 :)
<daveshah> Huh
<somlo> can't fit the FPU on the versa, but wanted to see whether, all else being equal, there would be a difference between Artix and ECP5
<daveshah> I thought litedram was faster on the artix7
<somlo> nexys4ddr has ddr2 memory, versa is ddr3, not sure if that explains it
<daveshah> possibly
<somlo> also not sure how much variability there is between multiple coremark runs on the same machine
<somlo> haven't done e.g. average over 10 runs, or anything like that
<somlo> so not quite 100% sure why -- maybe because yosys/trellis/nextpnr are awesome like that :)
<daveshah> Yeah, our magic pipeline rewriting algorithm :p
freemint has quit [Remote host closed the connection]
freemint has joined #litex
<somlo> anyhow, I'll have to give in and run coremark (and my other benchmarks too) multiple times, and figure out the standard deviation, and all that scientific stuff before publishing "official" numbers
<somlo> which is also why I'm not happy to just google the Pentium-100 numbers, and will run them for myself
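The multi-run averaging somlo mentions is a one-liner with Python's statistics module; the scores below are made-up placeholders, not real measurements:

```python
import statistics

# Placeholder CoreMark scores from repeated runs -- not real data.
runs = [27.1, 26.8, 27.4, 27.0, 26.9, 27.2, 27.3, 26.7, 27.1, 27.0]

mean  = statistics.mean(runs)
stdev = statistics.stdev(runs)  # sample standard deviation
print(f"CoreMark: {mean:.2f} +/- {stdev:.2f} over {len(runs)} runs")
```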
rohitksingh has quit [Ping timeout: 245 seconds]
freemint has quit [Ping timeout: 245 seconds]
rohitksingh has joined #litex
<_florent_> somlo: yes, now that it's working, we need to identify and remove the bottlenecks; the fact that we down-convert from 64-bit to 32-bit, then up-convert to the native DRAM width, seems clearly a bottleneck. I'll look at what can be done there
<John_K> getting the following error when trying to use LiteEthPHYGMIIMII, is this a common / known issue?
<John_K> "ERROR:Place:1108 - A clock IOB / BUFGMUX clock component pair have been found that are not placed at an optimal clock IOB / BUFGMUX site pair."
<John_K> "The clock IOB component <eth_clocks_rx> is placed at site <AB11>. The corresponding BUFG component <eth_rx_clk_BUFG> is placed at site <BUFGMUX_X3Y7>."
<_florent_> how many bufg are used in your design, how many available?
<John_K> Number of BUFG/BUFGCTRLs: 4 out of 16 25%
<_florent_> you can probably get rid of the error with platform.add_platform_command("""PIN "eth_clocks_rx" CLOCK_DEDICATED_ROUTE = FALSE;"""), but not sure it will be functional
<John_K> yeah I saw that in the error message and had the same thoughts
<John_K> is the Spartan 6 family really this bad or am I just running into regular-ish pains?
john_k[m] has quit [Remote host closed the connection]
nrossi has quit [Remote host closed the connection]
xobs has quit [Read error: Connection reset by peer]
freemint has joined #litex
xobs has joined #litex
nrossi has joined #litex
john_k[m] has joined #litex
rohitksingh has quit [Ping timeout: 265 seconds]
rohitksingh has joined #litex
<davidc__> John_K: the S6 should be plenty fast to speak MII/GMII/RGMII; I've done it with S3Es before
<davidc__> however you need to take care in terms of clock resource allocation; sounds like there might be some contention in your design
<daveshah> Maybe the Pano Logic designers messed up the pinout :p
<daveshah> We've all been there
<davidc__> there are also a few weird modes some PHYs support that do strange things with clocking
<davidc__> (e.g., some PHYs support the master generating all the clocks and have an internal buffer to deal with the frequency delta)
<davidc__> others support skewing clock->data for transmit, and the receive sample point
rohitksingh has quit [Ping timeout: 268 seconds]
rohitksingh has joined #litex