<scanakci>
I will try rocket as a first step with ethernet interface to make sure that the issue is not related to BlackParrot. I will also try another switch I guess.
<scanakci>
@daveshah: thanks, will keep that in mind
rohitksingh has quit [Ping timeout: 265 seconds]
rohitksingh has joined #litex
<somlo>
_florent_: as of right now, litex commit 4d761e1a and liteeth master (466223e) together both boot linux (incl. the liteeth driver) *and* can use the interface from within linux (ifconfig, wget, etc.)
<somlo>
that's litex before the new soc commits
<somlo>
I'll try the "leaky" bisect thing on litex again, to try and narrow it down further
<somlo>
but at least we know it's not liteeth (anymore) :)
<somlo>
scanakci: if you do try rocket to test your ethernet functionality right now, I recommend litex up to 4d761e1a, until the dust settles a bit more :)
rohitksingh has quit [Ping timeout: 268 seconds]
CarlFK has joined #litex
_whitelogger has joined #litex
rohitksingh has joined #litex
rohitksingh has quit [Ping timeout: 272 seconds]
tumbleweed_ has joined #litex
tumbleweed has quit [Ping timeout: 240 seconds]
rohitksingh has joined #litex
CarlFK has quit [Read error: No route to host]
CarlFK has joined #litex
rohitksingh has quit [Ping timeout: 240 seconds]
<levi>
Re: ethernet autonegotiation and auto-crossover support: This is often something configurable in the phy, and often both by pin-strapping and MDIO registers. The designers of the board I'm using for some reason just didn't bother with a useful pin-strap config so the phy comes out of reset in 10Base-T half-duplex mode. I had to add code to the drivers in the bios and firmware code to set the mdio registers back to reasonable values.
<levi>
Dumping the mdio registers is at least a good source of diagnostic info for what the phy thinks is going on.
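As a rough illustration of what such a dump can reveal, here is a minimal Python sketch that decodes the standard Basic Mode Control Register (MII register 0, IEEE 802.3 Clause 22). The function name `decode_bmcr` is hypothetical, and the raw value is assumed to have been read from the phy already by whatever MDIO access the bios or firmware provides.

```python
# Bit positions follow the IEEE 802.3 Clause 22 Basic Mode Control Register
# (MII register 0). This sketch only decodes a value that was already read;
# it performs no MDIO access itself. decode_bmcr is a hypothetical helper name.

BMCR_SPEED_LSB = 1 << 13       # speed selection, least significant bit
BMCR_AUTONEG_ENABLE = 1 << 12  # auto-negotiation enable
BMCR_FULL_DUPLEX = 1 << 8      # duplex mode (1 = full duplex)
BMCR_SPEED_MSB = 1 << 6        # speed selection, most significant bit

# (MSB, LSB) -> forced speed in Mb/s; the (1, 1) encoding is reserved.
_SPEEDS = {(False, False): 10, (False, True): 100, (True, False): 1000}


def decode_bmcr(value):
    """Return the autoneg/duplex/speed settings encoded in a BMCR value."""
    msb = bool(value & BMCR_SPEED_MSB)
    lsb = bool(value & BMCR_SPEED_LSB)
    return {
        "autoneg": bool(value & BMCR_AUTONEG_ENABLE),
        "full_duplex": bool(value & BMCR_FULL_DUPLEX),
        "speed_mbps": _SPEEDS.get((msb, lsb)),  # None for the reserved encoding
    }


if __name__ == "__main__":
    for raw in (0x0000, 0x1140):
        print(hex(raw), decode_bmcr(raw))
```

A value of 0x0000 decodes to forced 10 Mb/s half duplex with auto-negotiation off, which matches the reset state described above. Note that the speed and duplex bits only describe the forced mode; with auto-negotiation enabled, the negotiated result has to be read from other registers.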
tnt has left #litex [#litex]
bunnie[m] has joined #litex
nrossi has joined #litex
sajattack[m] has joined #litex
xobs has joined #litex
<_florent_>
somlo: i just built rocket on arty with ethernet and netboot is working fine here. I'm wondering if your issue could be related to the fact that CSR locations have moved and you have not updated the csr.h for Linux or are not using a .dts file? If you want to use fixed CSR locations, it's possible but you have to specify it in the SoC
<somlo>
_florent_ netboot is working perfectly for me as well. It is the linux driver that's failing: the last "working" commit is 3921b63, the first "bad" one is right after, 379d47a
<somlo>
I'd expect to see different values in csr.csv before I'd have to update my dts, but that's not the case
<somlo>
_florent_: is netboot in the bios using irq, or is it polling?
<somlo>
_florent_: for the record, during bisect I had to manually fix things like alignment, importing litedram native2axi, and cpu reset address :)
<somlo>
but aside from all of that, once the bios.bin blob is downloaded over tftp and Linux boots all the way into busybox, when I "ifconfig" my eth0 and attempt to use it from Linux, it either works, or fails with "liteeth 12003800.mac eth0: LITEETH_READER_READY timed out"
<somlo>
commit 379d47a8 looks pretty innocuous to me, but I'm much less familiar with the code than you, so I'll have to study it in more detail to see why it would impact Ethernet, of all things
<somlo>
unless it's another one of those "git bisect is really not your friend in this context" issues I've kept bumping into over the last week :D
<somlo>
_florent_: it *may* well turn out that we need to update the linux liteeth driver in some way to fit the new gateware behavior, or not... That is the question :)
<somlo>
so I'm building the latest master (18a9d4ff) with blindly reverted 379d47a on top, to see what happens
<somlo>
if that "fixes" the issue I'm having, only *then* will I attempt to engage higher brain functions and figure out *why* :)
<somlo>
(might as well make sure I'm on the right track before I potentially overheat my neurons...)
rohitksingh has joined #litex
<somlo>
_florent_: ok, reverting 379d47a (on top of latest master) *does* fix my issue
<somlo>
so I'll try zooming in on how it might subtly affect Ethernet (and what else, if anything), soon as i get back from lunch :)
<_florent_>
somlo: i'm going to look too
<sajattack[m]>
_florent_: can you look at my vga too?
<sajattack[m]>
I'll put up my modifications to litevideo as well
rohitksingh has joined #litex
<scanakci>
thanks @somlo. I just tried netboot on Rocket with an older commit than 4d761e1a. No luck. At least, the problem is not directly related to Blackparrot for now :). Looks like a network configuration issue.
<_florent_>
sajattack[m]: i can look yes, have you been able to test doing the dma initialization in the firmware instead of using the linux driver? doing it in the firmware should make things easier to understand/debug
<sajattack[m]>
I looked at it briefly but didn't test it, no
<tpb>
Title: GitHub - sajattack/linux-on-litex-vexriscv at vga (at github.com)
rohitksingh has quit [Ping timeout: 246 seconds]
rohitksingh has joined #litex
<sajattack[m]>
obviously very WIP code, I'll clean it up when it's working and ready for PR
<_florent_>
the best would indeed probably be to integrate the initialization code into the firmware, then try to see if the DMA is outputting data (looking at valid signal or hsync/sync for example with a scope if you have one)
<sajattack[m]>
so bring back emulator/framebuffer.c and associated code?
rohitksingh has quit [Ping timeout: 240 seconds]
<sajattack[m]>
I guess just `git revert 9a9f01baf9fa29cfa36d7e0ff0b8cfa2b60a3926`
<_florent_>
yes the commit just before the video initialization moved to the linux driver
<_florent_>
sajattack[m]: yes that's the 7-series specific part that you can remove
<_florent_>
it's here to configure the clocking
<sajattack[m]>
ok
<_florent_>
but in your case, the pixel clock is fixed and generated by the main PLL
<sajattack[m]>
got it
<scanakci>
tftp worked one time with Rocket :). I could copy a fake boot.bin to DRAM. For some reason, I can't make it work again. It is not failing, just stuck at "fetching from: UDP/69". Getting closer I guess.
<sajattack[m]>
linux gets stuck at the same spot, do I have to change something in the linux?
<sajattack[m]>
interesting, my tv is switching between "invalid format" and "no signal" now
<_florent_>
sajattack[m]: yes for now no need to boot linux
<_florent_>
just try to get a signal on your tv with the code from the firmware
<sajattack[m]>
ok
<_florent_>
once it's working, you should see some random data from the DRAM
<sajattack[m]>
maybe it should be hsync_n not hsync? on mister it's called hsync but maybe it's secretly inverted
<_florent_>
but this random data should be stable on your screen
<_florent_>
once you have that, you can start modifying the firmware to write some data to the DRAM and see if you see these changes on the screen
<sajattack[m]>
ok thanks
rohitksingh has joined #litex
<_florent_>
somlo: just looked at 379d47a, and that's the commit that modifies the CSR mapping since SDRAM is now dynamically allocated:
<somlo>
are you compiling the dtb blob into the bios as well ?
<_florent_>
not into the bios, the dtb is provided with the linux images
<somlo>
the unpatched BBL actually expects that, I've been adding the dtb in manually as a simple matter of convenience so far
<somlo>
oh, I remember, vexriscv linux is booted in several separate blobs
<sajattack[m]>
my hsync pulse is 16.57khz
<somlo>
I think the "canonical" "Right Way (tm)" to do it would be to indeed add the dtb to the bios, since that's where all the knowledge is available and authoritative all in one place :)
<_florent_>
somlo: yes that could be a next step :)
<sajattack[m]>
should be 45 :/
<somlo>
except for 1. it makes the bios larger and 2. it's inconvenient during rapid iteration development :)
<_florent_>
sajattack[m]: ah ok, so that could explain things
<sajattack[m]>
indeed
<sajattack[m]>
is it trying to do 240p somehow?
<somlo>
_florent_: IMHO the first next step would be to add some sort of json2dts.py like thing to LiteX, so that it can generate the dts similarly to how csr.csv is generated right now
<somlo>
then it'd be a super simple makefile hack to optionally work that into the bios
<sajattack[m]>
also it's only outputting an hsync after linux has started
<somlo>
_florent_: that would make half of my out-of-tree BBL hackery disappear altogether :)
<_florent_>
somlo: yes i agree we should add a json2dts.py equivalent directly in LiteX, this one was an experiment and there are other versions for Zephyr. We should have a single one integrated.
<sajattack[m]>
I checked the rate of the pll and I think it's correct
<somlo>
_florent_: all right, looks like I'm all the way back in business now :) Thanks for all your help!
* somlo is about to find out if hacking in litesdcard support will shift csr allocations again :)
rohitksingh has quit [Ping timeout: 268 seconds]
<Xiretza>
how exactly does liteeth's wishbone endianness work? it seems to only affect SRAM data, but even that the wrong way around - and accesses to CSRs are always big-endian.
<scanakci>
@somlo: is it possible to send me the necessary file(s) that I should put into my root tftp folder to boot up linux on rocket? After trying the third switch, I can finally copy a file into DRAM using tftp. I guess I only need a bbl based on your explanation here (https://www.contrib.andrew.cmu.edu/~somlo/BTCP/)
<tpb>
Title: A Trustworthy, Free (Libre), Linux Capable, Self-Hosting 64bit RISC-V Computer (at www.contrib.andrew.cmu.edu)
<scanakci>
For some reason, the copy process fails if the file size exceeds roughly 30MB.
<sajattack[m]>
my vsync is also about a factor of 3 off
<sajattack[m]>
22Hz
<sajattack[m]>
the proportion of vsync to hsync is correct however
<_florent_>
somlo: good that it's solved!
<_florent_>
Xiretza: it's possible we could avoid the endianness change in LiteEth, but it was probably needed with the way the buffers are accessed from the software in microudp.c
<_florent_>
sajattack[m]: so your pix clk is correct? the core is configured in rgb? Can you remind me of the data_width of your SDRAM?
<sajattack[m]>
I think the pixel clock is correct, triple checking it now. Not sure what you mean about rgb, sdram data is 16bits
<sajattack[m]>
yes I'm quite sure the pixel clock is correct, dividing it by 16 gave 4.5-4.8, and setting it to 75 gives roughly the same result as setting it to 74.25 in terms of hsync
<sajattack[m]>
there's no way it can't output a 75mhz clock, while there's a slight chance it couldn't output a 74.25, at least in my mind
<sajattack[m]>
it looks like it's doing rgba, if that's what you mean
<_florent_>
Xiretza: ah no it's different, it's the available bandwidth of the SDRAM controller
<somlo>
scanakci: note, this is assuming the latest-and-greatest litex, which was working all along
<_florent_>
that is not able to provide enough data for the 1280x720x32
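A back-of-the-envelope check of that bandwidth argument, as a minimal Python sketch. The 74.25MHz pixel clock, 32 bits per pixel and 16-bit SDRAM data width come from the discussion; the 100MHz SDRAM clock and both function names are assumptions made only for illustration, and the peak figure ignores refresh, row activation and any other traffic on the controller.

```python
def required_bandwidth(pix_clk_hz, bits_per_pixel):
    """Bytes per second the framebuffer DMA must sustain."""
    return pix_clk_hz * bits_per_pixel // 8


def peak_sdram_bandwidth(sdram_clk_hz, data_width_bits, transfers_per_clock=1):
    """Idealized peak bytes per second (no refresh or activation overhead)."""
    return sdram_clk_hz * data_width_bits * transfers_per_clock // 8


if __name__ == "__main__":
    need = required_bandwidth(74_250_000, 32)     # 720p pixel clock, 32-bit pixels
    have = peak_sdram_bandwidth(100_000_000, 16)  # ASSUMED 100MHz SDR clock, 16-bit bus
    print(f"needed: {need / 1e6:.0f} MB/s, peak available: {have / 1e6:.0f} MB/s")
    print("sufficient" if have >= need else "not sufficient")
```

Under those assumed numbers the DMA would need 297 MB/s while the SDRAM could deliver at most 200 MB/s, so the video core would starve even before the CPU competes for the same memory.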
<Xiretza>
_florent_: ooooh I see, that makes sense
<scanakci>
thank you! One last question: what is the lowest frequency that you used to test LiteETH with Rocket?
<scanakci>
I saw some prior discussion about problems on LiteETH caused by the CPU frequency. it was a year ago, probably there is still a lower bound. It may be higher, though.
<somlo>
scanakci: so, the blob copies fine over tftp, and linux starts booting at cpu clocks as low as 40MHz
<sajattack[m]>
I must have made a mistake, linux is still trying to do 720p
<tpb>
Title: Netboot issues with 1Gbps and low CPU frequency (<55MHz) · Issue #30 · enjoy-digital/liteeth · GitHub (at github.com)
<somlo>
but lately the linux ethernet driver initialization (upon kernel boot) started to hang at that frequency
<scanakci>
okay, 40 MHz is great.
<somlo>
60 is my standard goto number that kinda works most of the time :)
<scanakci>
@Xiretza: 30 MHz is even better :)
<scanakci>
I see. above 50 MHz, I am getting some timing violations which are hard to interpret for me due to my limited experience. I hope LiteEth works fine with 50MHz.
<_florent_>
scanakci: the theoretical minimum is 125MHz/4 = 31.25MHz
<somlo>
Xiretza: if you figure out more about CSR endianness in the absence of a CPU, please share :)
<sajattack[m]>
_florent_: anything I need to change other than emulator/framebuffer.c and pix_clk to get down to 480p?
<_florent_>
the data are read on 32-bit words in sys_clk and emitted on 8-bit words at 125MHz
<somlo>
for me, subregisters (basic CSRs) are always accessible in whatever native endianness the CPU is using (I've never built a cpu-less LiteX)
<_florent_>
before the changes, the theoretical minimum was 125MHz/2 = 62.5MHz, which explains the issues below 60MHz
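The arithmetic behind those two limits can be written down as a minimal Python sketch. The function name and the "effective bytes per sys_clk cycle" parameter are my own framing, not anything taken from the LiteEth source: 4 corresponds to the 32-bit words mentioned above, and 2 simply reproduces the older 125MHz/2 figure without claiming how the old gateware actually behaved.

```python
def min_sys_clk_hz(phy_byte_clk_hz, bytes_per_sys_cycle):
    """Lowest sys_clk at which the system side keeps up with the PHY side.

    The PHY side moves one byte per cycle at phy_byte_clk_hz, so the system
    side must move at least as many bytes per second.
    """
    return phy_byte_clk_hz / bytes_per_sys_cycle


if __name__ == "__main__":
    gmii_byte_clk = 125e6  # 1Gbps Ethernet: 8 bits per cycle at 125MHz
    print(min_sys_clk_hz(gmii_byte_clk, 4) / 1e6, "MHz")  # 32-bit words
    print(min_sys_clk_hz(gmii_byte_clk, 2) / 1e6, "MHz")  # older 125MHz/2 figure
```

These are lower bounds on throughput only; a given design may still need some margin above them.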
<somlo>
then, if compound multi-subregister CSRs are used, the subregisters themselves are always most-significant-chunk first (big-endian *like*)
<scanakci>
okay, for 60MHz I could make tftp work for Rocket. Testing lower one.
<scanakci>
This is the bbl that @somlo provided to me
<scanakci>
I will check BlackParrot next and see if tftp works fine.
<_florent_>
sajattack[m]: no, you only need to change pix clk and firmware
<somlo>
scanakci: you need a linux* rocket variant
<scanakci>
oh totally forgot :).
<sajattack[m]>
ok, trying again, this time undef'ing the 720p rather than commenting it out
<somlo>
the "standard" rocket doesn't have a MMU
<sajattack[m]>
linux still claims it's outputting 720p
<somlo>
Xiretza: here https://github.com/enjoy-digital/litex/issues/314 is what I've been able to figure out about CSRs so far (and I should probably update the references to the code, now that LiteX updated the SoC class hierarchy)
<scanakci>
okay.
<tpb>
Title: We need to document LiteX CSRs! · Issue #314 · enjoy-digital/litex · GitHub (at github.com)
<sajattack[m]>
do I need to rebuild the buildroot?
<Xiretza>
somlo: liteeth's endianness option doesn't affect CSRs, so they're always in litex-native format (big-endian). Don't think I have any multi-chunk registers to worry about.
<somlo>
Xiretza: "litex-native" CSR format if one adds a little-endian CPU is actually little-endian :) That's what I was wondering what happens if you don't configure a CPU at all, and deal with the exposed data wires directly from the outside, over wishbone
<Xiretza>
somlo: where does this endianness change occur then?
<somlo>
hmm, that's another interesting question: if the litex-internal CPU is L.E. and sees its CSRs as also L.E., would an external CPU looking at the same stuff over a WB or AXI port see them as B.E ?
rohitksingh has joined #litex
<somlo>
Xiretza: _florent_ and I talked about this a bit a while ago, and we *think* it's in the actual wiring of data lines when CPUs are connected in litex/soc/cores/cpu/*/core.py
<Xiretza>
my observations of liteeth: let's assume there's a 1-byte register containing 0xAB at address 0x100. If I access 0x100, I get back 0x0000_00AB - the byte of interest is in the rightmost position. Now if I receive a byte 0xCD over the network, and read from the SRAM base address 0x200, I get back 0xCD00_0000 with endianness=big. To me, this seems wrong - if I want to get the *byte* at 0x200, I suddenly have to look at the leftmost byte.
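For reference, this is how the same four bytes in memory map onto a 32-bit word under each byte order, as a minimal Python sketch. It only shows the generic convention and makes no claim about what the liteeth gateware does internally; the buffer contents are made up to match the example above.

```python
# Four consecutive bytes as they would sit in a buffer, lowest address first.
# 0xCD is the received byte at the base address; the rest are zero padding.
sram = bytes([0xCD, 0x00, 0x00, 0x00])

# Big-endian: the byte at the lowest address is the most significant byte.
as_big = int.from_bytes(sram, "big")

# Little-endian: the byte at the lowest address is the least significant byte.
as_little = int.from_bytes(sram, "little")

if __name__ == "__main__":
    print(f"big-endian word:    0x{as_big:08X}")
    print(f"little-endian word: 0x{as_little:08X}")
```

Seen this way, a 0xCD00_0000 readout is what a big-endian word view of a byte buffer looks like, whereas the 0x0000_00AB register readout is a numeric value rather than a byte buffer, which may be why the two appear inconsistent.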
<somlo>
because if you initialize a scratch register with 0x12345678, for instance, and then read it in over a CSR bus with csr_data_width == 32, it comes back as 0x12345678 on *both* vexriscv *and* mor1kx (le and be, respectively)!
<Xiretza>
right now I work around that by treating the liteeth bus as big-endian, but setting endianness=little in the YAML.
<somlo>
Xiretza: oh but SRAM endianness might be orthogonal to CSR endianness -- fun and games all around
<Xiretza>
somlo: what would be an example of a little-endian CPU in litex?
<somlo>
vexriscv
<somlo>
or Rocket, but if you simulate, vexriscv is much faster :)
<somlo>
Xiretza: also recommend you try "--csr-data-width=32" to compare-and-contrast to the default (which is 8)
<somlo>
then "mr 0x82000000 0x100" from the bios prompt, to see what things look like to the CPU, from an endianness perspective
<somlo>
the scratch register is 32 bits starting at 0x82000004
<Xiretza>
somlo: honestly not too concerned with a full litex build, that's another whole can of worms - I'm just looking for a working ethernet MAC.
<somlo>
Xiretza: that's ok, just figured I'd get you interested in endianness issues, as a sanity check to what I think I know (which isn't much, and some of it is probably wrong) :D
<Xiretza>
somlo: oh I'm interested, I just wish I didn't have to be ;)
<Xiretza>
also, WRITER_LENGTH and WRITER_SLOT always come back as 0, not sure what's up with that, though very likely my fault.
<Xiretza>
heh, in a way it was! thanks for the csr_data_width suggestion, somlo, I had no idea it was 8 by default - of course reading a whole 32-bit word will only give me a single byte of the multi-part CSR in that case.
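A minimal Python sketch of reassembling a multi-part CSR from its subregisters, following the "most-significant chunk first" ordering described earlier in the discussion. The `read_csr` helper, the `read_word` callback and the example addresses are hypothetical, and the 4-byte stride between subregisters is an assumption about the address layout that should be checked against the generated csr.csv or csr.h.

```python
def read_csr(read_word, base, size_bits, csr_data_width=8, stride=4):
    """Assemble a CSR from its subregisters, most-significant chunk first.

    read_word: callable taking a byte address and returning one bus word
               (hypothetical; supplied by whatever bus access is available).
    stride:    ASSUMED byte distance between consecutive subregisters.
    """
    count = (size_bits + csr_data_width - 1) // csr_data_width  # round up
    mask = (1 << csr_data_width) - 1
    value = 0
    for i in range(count):
        chunk = read_word(base + i * stride) & mask
        value = (value << csr_data_width) | chunk
    return value


if __name__ == "__main__":
    # Fake bus contents: a 32-bit CSR split into four 8-bit subregisters.
    bus8 = {0x100: 0x12, 0x104: 0x34, 0x108: 0x56, 0x10C: 0x78}
    print(hex(read_csr(bus8.__getitem__, 0x100, 32)))

    # The same CSR with csr_data_width=32 fits in a single bus word.
    bus32 = {0x100: 0x12345678}
    print(hex(read_csr(bus32.__getitem__, 0x100, 32, csr_data_width=32)))
```

With the 8-bit layout, a single 32-bit read at the base address only returns the first (most significant) chunk, which is the effect described above.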
<Xiretza>
is there any real downside to having CSRs be as wide as machine words? I guess it might result in a little more logic, but runtime efficiency is way higher.
<sajattack[m]>
_florent_: what do I need to change in linux? there's no hsync until linux starts, and linux is trying to do 720p no matter what
<sajattack[m]>
ah it's the bloody json2dts I think