_florent_ changed the topic of #litex to: LiteX FPGA SoC builder and Cores / Github : https://github.com/enjoy-digital, https://github.com/litex-hub / Logs: https://freenode.irclog.whitequark.org/litex
tpb has quit [Remote host closed the connection]
tpb has joined #litex
lf has quit [Ping timeout: 260 seconds]
lf has joined #litex
FFY00 has quit [Ping timeout: 260 seconds]
FFY00 has joined #litex
Degi_ has joined #litex
Degi has quit [Ping timeout: 272 seconds]
Degi_ is now known as Degi
CarlFK has joined #litex
key2 has quit [Read error: Connection reset by peer]
key2 has joined #litex
Bertl_oO is now known as Bertl_zZ
futarisIRCcloud has joined #litex
cr1901_modern has quit [Quit: Leaving.]
cr1901_modern has joined #litex
CarlFK has quit [Ping timeout: 256 seconds]
CarlFK has joined #litex
CarlFK has quit [Read error: Connection timed out]
FFY00 has quit [Ping timeout: 260 seconds]
futarisIRCcloud has quit [Quit: Connection closed for inactivity]
hansfbaier has joined #litex
lkcl has quit [Ping timeout: 272 seconds]
lkcl has joined #litex
hansfbaier has quit [Read error: Connection reset by peer]
hansfbaier has joined #litex
shorne has joined #litex
<shorne> Hello, I am looking at the linux kernel litex_mmc driver, https://github.com/litex-hub/linux/blob/litex-rocket-rebase/drivers/mmc/host/litex_mmc.c#L281
<shorne> I noticed the performance on mor1kx arty is slow ~1.6mb/sec
<shorne> I notice for the dma transfer we use sg_copy_to_buffer(), which will slow things down; the alternative is to use dma channels
<shorne> I haven't implemented this before but I will look into it to see how it will help, just want to mention it to get any feedback
<shorne> I am not sure if there is a "real" dma engine with multiple programmable channels that can handle scatter gather
<_florent_> Hi
<_florent_> nickoe: would you mind creating an issue in litex with your notes/questions? this will be easier to answer
<_florent_> shorne: we don't currently have scatter gather for the SDCard's DMA
<_florent_> but if you think this is useful, I could have a look. It seems that currently the SDCard is a lot slower in Linux than with the BIOS, we should probably try to understand that first since the limitation seems to come from the software.
Bertl_zZ is now known as Bertl
lkcl has quit [Ping timeout: 264 seconds]
lkcl has joined #litex
hansfbaier has quit [Read error: Connection reset by peer]
<somlo> _florent_, shorne: after having stared at the Linux litesdcard driver for a loooong time, I can't find anything it's doing significantly different from what the LiteX bios does in terms of accessing the card, other than maybe the timing of the requests (although even disabling interrupts on a single-core soc around litex_request() doesn't seem to make a difference)
<somlo> likely I'm missing something, but the litesdcard FSMs are timing out left and right when driven by the Linux driver, and seem happy and healthy when driven by the bios...
mikeK_de1soc has joined #litex
FFY00 has joined #litex
<somlo> _florent_, shorne: https://github.com/enjoy-digital/litex/pull/820 should ensure that 1. we never time out during sdcardboot (even on weird CPU configurations such as Rocket :) and 2. we don't run into command timeouts with the LiteSDCard FSMs in single-block (cmd17-only) mode on the Linux driver
<somlo> This should help stabilize the current linux driver for now at no extra penalty for any other sdcard use (larger timeout values don't actually slow down the sdcard FSMs when things work *well*, they just avoid unnecessary errors and retries in linux)
<somlo> I'm still trying to get to the bottom of why enabling cmd18 multi-block breaks horribly even with the larger timeouts
<somlo> and will open a separate issue once I have my ducks in a row (have to remember everything I tried and write down a coherent report)
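A minimal sketch of the point about timeouts, in Python rather than the driver's C. The `wait_done` helper, the `make_device` stand-in and the tick counts are hypothetical and are not taken from the LiteX BIOS or the Linux driver; the sketch only illustrates why a larger timeout budget costs nothing when the operation completes quickly, and only avoids a spurious error when it completes slowly.

```python
def wait_done(poll, timeout_ticks):
    """Poll until `poll()` returns True or the tick budget is exhausted.

    Returns (ok, ticks_used).
    """
    for tick in range(1, timeout_ticks + 1):
        if poll():
            return True, tick
    return False, timeout_ticks


def make_device(ready_after):
    """Hypothetical device that reports 'done' on its `ready_after`-th poll."""
    state = {"polls": 0}

    def poll():
        state["polls"] += 1
        return state["polls"] >= ready_after

    return poll


# Fast card: the cost is the same with a small or a huge budget.
fast_small = wait_done(make_device(ready_after=3), timeout_ticks=10)
fast_large = wait_done(make_device(ready_after=3), timeout_ticks=1_000_000)

# Slow card: the small budget reports a timeout, the large one succeeds.
slow_small = wait_done(make_device(ready_after=50), timeout_ticks=10)
slow_large = wait_done(make_device(ready_after=50), timeout_ticks=1_000_000)
```

With the fast device both calls return after the same number of polls; only the slow device behaves differently, which is the case where the small budget would have triggered an error and a retry.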
<_florent_> thanks @somlo, the PR looks fine, I'm going to merge it. For the issue with the Linux driver, I could do some capture of the SDCard signals with an external logic analyzer to try to understand the difference between the BIOS and Linux driver.
<somlo> _florent_: thanks, that could be helpful! I've been studying the migen sources (and doing remedial learning on streams, as you are aware)
<somlo> I'm slow almost by definition -- anything I need to touch I need to learn from scratch, first :)
<_florent_> That's a good approach that I also generally try to apply when time allows it :) (and your feedback while learning is very valuable).
<_florent_> somlo: when in Linux, can you tell me how you reproduce the timeouts easily? I could try on Arty with Linux-on-LiteX-VexRiscv
<somlo> change max_blk_count to 2
<somlo> that will enable cmd-18 for reads
<somlo> I tried to document my understanding of how the mmc subsystem figures out everything else around that setting in the surrounding comments
<somlo> scatter/gather is off by default (as it should be, for now)
futarisIRCcloud has joined #litex
<tmbinc> _florent_: for programming the sds1104xe.bit, is setting the scope's jumper to "JTAG", then using openocd's "zynqpl_program"/"pld load" the right approach? (I've always used Xilinx tools before on Zynq)
<nickoe> _florent_: Ok, I can try to do that.
<mithro> _florent_: Do the Ibex pythondata packages look okay? https://github.com/enjoy-digital/litex/issues/695 ?
shoragan has joined #litex
mikeK_de1soc has quit [Quit: Connection closed]
Bertl is now known as Bertl_oO
mikeK_de1soc has joined #litex
<mikeK_de1soc> _florent_: I was just wondering, is LiteVideo currently working? I would like to use it for the DE1-SoC board. Thanks!
<shorne> _florent_: I think the reason the sdcard is slower in linux is because of sg_copy_to_buffer(). I can't easily prove it now; I'll have to think about how to profile the code in the driver.
<shorne> In the bios: https://github.com/enjoy-digital/litex/blob/master/litex/soc/software/liblitesdcard/sdcard.c#L503-L510, the read/write can transfer the user buffer directly to the sdcard
<shorne> to/from the sdcard
<shorne> in linux the driver gets a bunch of blocks in the scatter gather list, it then has to, in software, copy those into a dma buffer (with sg_copy_to_buffer) before sending to the sdcard (similar for reading).
<shorne> that extra copy slows it down, is my assumption. With a dma engine the hardware could handle the sg lists directly via queuing multiple small dma transactions.
<shorne> That's what I gather from my investigation last night reading the kernel code and some dma controller specs
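The extra copy shorne describes can be modelled in a few lines of Python. This is only an illustration: `copy_to_bounce_buffer` stands in for what sg_copy_to_buffer() does in the kernel, and `build_descriptors` is a hypothetical picture of a DMA engine that accepts one descriptor per scatter-gather segment. The addresses and segment sizes are made up, and neither function reflects actual LiteSDCard or driver code.

```python
# A scatter-gather list modelled as (address, data) pairs; the addresses
# are deliberately non-contiguous, as pages handed to a driver may be.
sg_list = [
    (0x1000, b"A" * 512),
    (0x8000, b"B" * 1024),
    (0x3000, b"C" * 512),
]


def copy_to_bounce_buffer(sg):
    """Gather all segments into one contiguous buffer using the CPU.

    Returns (payload, bytes_copied_by_cpu).
    """
    bounce = bytearray()
    for _addr, data in sg:
        bounce += data  # every byte is touched by software
    return bytes(bounce), len(bounce)


def build_descriptors(sg):
    """Describe each segment as (address, length); no data is copied."""
    return [(addr, len(data)) for addr, data in sg]


payload, cpu_bytes = copy_to_bounce_buffer(sg_list)
descriptors = build_descriptors(sg_list)
```

In the first path the CPU moves all 2048 bytes before a single DMA transfer can start; in the second it only writes three small descriptors and the hardware would fetch the data itself.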
<_florent_> tmbinc: if you generated the bitstream with the target from litex-boards, you can also use --load (it uses VivadoProgrammer)
<_florent_> tmbinc: I could provide you some test bitstreams if you want to check your hardware (tomorrow)
<_florent_> mithro: thanks for the pythondata repos, I'll have a closer look tomorrow and do the integration
<_florent_> shorne: thanks interesting, do you know what's the general queue depth of usual scatter gather for similar cases? Implementing it should not be too complicated if you think this can be useful (just adding a FIFO for the queue)
<shorne> _florent_: I am trying to figure that out too at the moment, I know they talk about having 32/64 channels, but I think that is for each device connected, the queue depth is separate
<mikeK_de1soc> _florent_: Do you have an example of how to implement the liteVideo submodule?
<tmbinc> _florent_: thanks - I've (now) noticed --load, and I'll check what Vivado does differently than my openocd setup. In neither case I could get UDP traffic working, maybe I should route the serial port to some physical pins first
<tmbinc> (I didn't use --load but used hardware manager manually since I need a Xilinx Virtual Cable setup)
<shorne> _florent_: it looks like for some dma engines the command queue depth is even 1, and the queue is maintained per channel on the kernel side (reading the r-car dma controller driver).
<shorne> I can't find docs for that, I found docs for this https://www.nxp.com/docs/en/application-note/AN4522.pdf
<somlo> _florent_, shorne: I think it's a bit worse than just lack of SG support -- the command timeout happens here in the litesdcard core: https://github.com/enjoy-digital/litesdcard/blob/master/litesdcard/core.py#L142
<somlo> which is before any data transfer comes into play
<shorne> I will read more, maybe I can try programming a 1-channel dma engine for sdcard
<shorne> I see
<shorne> I didn't notice the timeouts
<shorne> I will see if that is happening for me too
<shorne> Got to go
<somlo> so while I agree SG will probably make it faster in the end, I'd like to understand why simply increasing the command, data, (and sure, DMA) timeouts in the linux driver to as huge as we can *still* won't make the errors go away :)
<somlo> when I enable >1 max_blk_count in `probe`
<somlo> shorne: (for when you get back :) -- I think there's two problems: performance in linux, and an actual bona-fide *bug*
<somlo> my first instinct to make it go faster was to allow multi-block transfers (e.g., cmd-18 for reads) by increasing max_blk_count
<somlo> which *should* work, but triggers the weird timeouts I've been running into (i.e., the bug) :)
<somlo> I think SG should help, but IMHO we should get the linux driver to work with multi-block transfers (18 for reads, 25 for writes)
<somlo> * first :)
<somlo> and also IRQ support :)
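For reference, the relationship between max_blk_count and the command that ends up on the bus can be sketched as below. The command numbers are the standard SD ones (CMD17/CMD18 for single/multi-block read, CMD24/CMD25 for single/multi-block write). The `pick_command` and `split_request` helpers are hypothetical and only mimic, very roughly, how a request gets capped by max_blk_count; they are not the mmc subsystem's actual logic.

```python
CMD17_READ_SINGLE_BLOCK = 17
CMD18_READ_MULTIPLE_BLOCK = 18
CMD24_WRITE_BLOCK = 24
CMD25_WRITE_MULTIPLE_BLOCK = 25


def pick_command(is_write, blocks):
    """Return the SD command index for a transfer of `blocks` blocks."""
    if blocks < 1:
        raise ValueError("a transfer needs at least one block")
    if is_write:
        return CMD24_WRITE_BLOCK if blocks == 1 else CMD25_WRITE_MULTIPLE_BLOCK
    return CMD17_READ_SINGLE_BLOCK if blocks == 1 else CMD18_READ_MULTIPLE_BLOCK


def split_request(total_blocks, max_blk_count):
    """Split a request into chunks of at most `max_blk_count` blocks."""
    if max_blk_count < 1:
        raise ValueError("max_blk_count must be at least 1")
    chunks = []
    remaining = total_blocks
    while remaining > 0:
        n = min(remaining, max_blk_count)
        chunks.append(n)
        remaining -= n
    return chunks


# With max_blk_count = 1 a 5-block read becomes five CMD17 transfers;
# with max_blk_count = 2 it becomes two CMD18 transfers plus one CMD17.
single = [pick_command(False, n) for n in split_request(5, 1)]
multi = [pick_command(False, n) for n in split_request(5, 2)]
```

This is why raising max_blk_count above 1 is enough to start exercising cmd18, and with it the multi-block path where the timeouts show up.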
<_florent_> mikeK_de1soc: here is an example on Xilinx FPGAs: https://github.com/litex-hub/linux-on-litex-vexriscv/blob/master/soc_linux.py#L148-L180
<mikeK_de1soc> _florent_: Many Thanks!
<_florent_> you can find other examples in litex-buildenv, in the video targets: https://github.com/timvideos/litex-buildenv/blob/master/targets/atlys/video.py
<mikeK_de1soc> OK great!  BTW got the DE1-SoC Led chaser to work! https://youtu.be/bgpP9cWTnJQ
<_florent_> tmbinc: were you pinging the board at 192.168.1.50? I could generate/test a bitstream and then send it to you.
<_florent_> somlo: yes sure, we first need to get rid of the timeout issue before looking for performance
<mikeK_de1soc> _florent_: tim's buildenv, does not seem to have the Altera DE1 or DE10 boards in his repo, is it worth starting?
<tmbinc> _florent_: Got it working - Ethernet only started working after I unplugged it once, both Vivado->xvcd as well as openocd work for programming
<_florent_> tmbinc: do you know/remember if there was something specific to be able to use the LCD of the sds1104xe? I remember doing a very quick test last year but didn't get it working (also haven't spent too much time understanding). Is it just VGA? With the pins from https://github.com/360nosc0pe/siglent_hardware/blob/master/sds1104xe/fpga_pinout.txt#L13-L48
<tmbinc> _florent_: I'll port over the stuff I had (will probably look rough :) regarding the DAC analog MUX and the other components to configure the frontend. I don't remember exactly but I remember we've been using PS SPI for PLL setup, I need to check if this can be done from PL side as well
<tmbinc> _florent_: LCD was done by G33KatWork or q3k, I need to check
<tmbinc> (it already worked when I started on it)
<_florent_> tmbinc: ok good you got it working. So now basically you'll be able to control the SoC over UDP with litex_server
<tmbinc> yep - very nice! (Never worked on an ethernet-enabled litex device, so this is more friendly than I imagined :)
<_florent_> just need to start the server with litex_server --udp and then execute your scripts to control your DAC registers
<tmbinc> Thanks
<_florent_> if you need to look at signals, you can also use Litescope over the bridge: https://github.com/enjoy-digital/litex/wiki/Use-LiteScope-To-Debug-A-SoC
<mikeK_de1soc> Man, that is insane!
<mikeK_de1soc> Drool..... I really want to get my DE1-SoC Video working...  still learning...
<_florent_> tmbinc: looking at your code, I think you should be able to integrate it without modification (or maybe only minors)
<_florent_> in your SoC:
<_florent_> from offsetdac import OffsetDac
<_florent_> self.submodules.offset_dac = OffsetDac()
<tmbinc> Is AutoCSR still a thing? My migen knowledge is... from 2014
<_florent_> self.add_csr("offset_dac")
<_florent_> should do it
<_florent_> yes
<_florent_> I have to go, I could help more tomorrow if you have trouble integrating your old code
<tmbinc> My duty cycle working on this will be small unfortunately but I'll see how far I get
<tmbinc> Thanks for your help already!