<kgugala>
CarlFK I pasted a link where you can find prebuilt toolchains
CarlFK has joined #litex
<CarlFK>
kgugala: 1. Connect to NeTV2 board using JTAG (ARM-USB-TINY-H JTAG adapter was used)
<CarlFK>
I don't have a jtag anything, but I do have a pi and some jumpers
<kgugala>
that should also work
<kgugala>
this step is for programming the bitstream
<kgugala>
if you use rpi, skip the make gateware/reload step
<kgugala>
this target is for programming FPGA from host PC via jtag
<CarlFK>
do I need to hook up to the jtag headers on the netv2, or does the 20 pin connector do that too?
<kgugala>
I never used rpi with netv2
<kgugala>
I suppose the 20 pin connector is rpi format
<kgugala>
if you have Xilinx platform cable you can connect it via jtag header
<CarlFK>
I don't have that either
awordnot has quit [Ping timeout: 264 seconds]
awordnot has joined #litex
rohitksingh has joined #litex
kgugala__ has joined #litex
rohitksingh has quit [Ping timeout: 244 seconds]
kgugala97 has joined #litex
kgugala has quit [Ping timeout: 256 seconds]
kgugala__ has quit [Ping timeout: 265 seconds]
rohitksingh has joined #litex
kgugala97 is now known as kgugala
<bunnie>
The 20-pin connector is designed to plug into an RPi so you can use OpenOCD running on the Pi to talk directly to the FPGA. It also maps the UART to the RPi's UART
<bunnie>
Just make sure you plug in the RPi aligned to the board correctly. A few people have been off by one row and well, it didn't end well for their hardware.
kgugala has quit [Read error: Connection reset by peer]
kgugala has joined #litex
<benh>
Folks that have a long FPGA experience around here... one thing that's been bugging me with microwatt on Arty for a while...
<benh>
when building with litedram (it *looks* like it's only happening in that case so far, ie, clock comes from litedram's PLL), when starting up
<benh>
the messages out of the UART are garbled for a few dozen/hundred characters then are ok
<benh>
if I hold the core stopped for a second or so at reset then start it, the problem goes away
<benh>
everything is fine in sim
<benh>
if I make the core do a loop of a few dozen thousands of dummy reads from the UART status before printing anything out it's ok
<benh>
Paul scoped the UART output and the bit duration looks fine
<benh>
this has been eluding us for ages
<benh>
(it's not a LiteX UART, some simple "potato" uart we picked up ages ago, we'll replace it eventually, but it seems ok, I fixed a bug or two in there but nothing so far that had any impact on that phenomenon)
<kgugala>
benh looks like core's reset is reasserted before the pll is locked
<kgugala>
*deasserted
<benh>
kgugala: the core reset comes from the soc reset which comes from the reset controller which doesn't even start counting until the pll_locked signal is 1
<benh>
kgugala: but maybe we have an obscure bug in there
<kgugala>
could be
<benh>
kgugala: but yeah, that was my first reaction too...
<benh>
I've never managed to use chipscope successfully but I can try routing those signals to pins and use an actual scope
<benh>
I can probably borrow one and find a crappy USB one somewhere
<kgugala>
the other option is that the pll is reset incorrectly. AFAIK pll has some strict reset routine
<kgugala>
I mean Xilinx 7 series pll
<benh>
Ok. I wonder...
<benh>
_florent_: I notice arty.py creates one S7PLL
<benh>
_florent_: with all the clocks out of it, including iodelay via an S7IDELAYCTRL
<benh>
_florent_: however, litedram_gen creates 2 PLLs, one sys_pll and one just for iodelay
<benh>
(without specifying a speed grade for the second one)
<benh>
Now .. I don't know that much about Artix PLLs, but I wouldn't mind knowing if there's a rationale for this ;-)
<_florent_>
benh: i'll fix the missing speedgrade on the second PLL
<benh>
_florent_: so why two ?
<benh>
_florent_: the LiteX standard arty.py seems to create only one ...
<benh>
also, am I getting lost in migen python or is LiteX not actually using pll.locked ?
<_florent_>
benh: a second PLL is used in LiteDRAM to allow more frequency steps (since it's difficult to generate both sys_clk/iodelay_clk from a single PLL)
<benh>
(note that I didn't see a problem with the LiteX generated microwatt, only with the standalone one + litedram, so I'm looking at differences)
<benh>
_florent_: ok but on the limited "scope" of an Arty, a single is enough then ?
<_florent_>
benh: yes, and we could eventually try to use only one if we are able to generate a working configuration with only 1 PLL
<benh>
_florent_: ok, not a big deal for me... unless you think that could be behind some of my weird issues above ...
<benh>
_florent_: iodelay is only used internally to litedram right ?
<benh>
so why is it ok for LiteX to not wait for pll.locked before lifting reset ?
<_florent_>
benh: if you are using IODELAY primitives, you need to have at least one IDELAYCTRL in the design
<benh>
_florent_: ok, I think litedram is the only one that does in my current design
<benh>
_florent_: so that's probably not related to that weird issue
<_florent_>
benh: if you share a project i can build easily, i could investigate a bit
<benh>
_florent_: I would love that but beware, it's microwatt fusesoc project in vhdl :-)
<benh>
_florent_: let me try to investigate a bit more first, esp. since you probably won't be able to regenerate litedram on it with its current scripts until we finalize our current work and I update microwatt to match it upstream
<benh>
_florent_: but I'll definitely take your offer if I draw a blank in the next few days :)
<_florent_>
benh: ok, it's easier for me to investigate if i can just have an archive with the sources and small script to build the design.
<benh>
_florent_: yeah ... "small script" means install/run fusesoc sadly
<benh>
_florent_: at least for now ... though that's not hard to pip install it
<benh>
and fusesoc generates a xilinx prj
<benh>
anyway, I'll dig a bit more.
<benh>
_florent_: the user_reset that comes out of litedram standalone is what I feed as reset to the rest of my logic
<benh>
_florent_: does it wait for pll_locked already ?
<benh>
hrm ... actually I don't .. I must have hacked that a while ago... I use pll_locked and feed that into the reset controller
<benh>
but I use it as a sync signal... I don't have synchronizers there, I sample pll_locked on a sys_clk edge
<benh>
maybe that's wrong...
<benh>
should I treat pll_locked as asynchronous ?
<_florent_>
pll_locked is currently asynchronous yes, but i could make it synchronous to the sys_clk
<_florent_>
benh: you could also use user_rst that is synchronous to user_clk
<benh>
_florent_: is user_rst guaranteed to be deasserted only after the pll is locked ?
<benh>
I'm building a test with synchronizers on pll_locked see if that makes a difference
<_florent_>
user_rst is only deasserted when the pll is locked yes
<benh>
we have this crappy reset controller someone wrote (I forgot whom, maybe anton) which uses a counter to delay reset release
<benh>
but it doesn't have synchronizers on the main reset and pll_locked inputs
<benh>
despite being a synchronous circuit
<benh>
maybe it's going a bit nuts
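The fix being discussed here (treat pll_locked as asynchronous, pass it through synchronizer flops, and only then let the counter run) can be sketched as a cycle-level Python model. This is a minimal sketch under stated assumptions, not the microwatt or LiteDRAM reset logic: the class name `ResetController`, the two-flop depth, and the `delay_cycles` parameter are all hypothetical, and a software model cannot show metastability itself, only the extra latency and the rule that the counter never reads the raw input.

```python
class ResetController:
    """Hypothetical model: synchronize pll_locked, then count before releasing reset."""

    def __init__(self, delay_cycles=4):
        self.delay_cycles = delay_cycles
        self.sync0 = 0   # first synchronizer flop (the one that may go metastable)
        self.sync1 = 0   # second flop: the only copy the counter logic reads
        self.count = 0
        self.rst = 1     # reset is asserted at power-up

    def tick(self, pll_locked):
        """Advance one sys_clk edge; pll_locked is the raw asynchronous input."""
        locked = self.sync1                      # registered value seen this cycle
        self.sync1, self.sync0 = self.sync0, pll_locked
        if not locked:
            self.count = 0                       # losing lock restarts the delay
            self.rst = 1
        elif self.count < self.delay_cycles:
            self.count += 1                      # still holding reset
        else:
            self.rst = 0                         # delay elapsed, release reset
        return self.rst


# Lock arrives on the third edge; reset is released some edges later.
ctrl = ResetController(delay_cycles=4)
trace = [ctrl.tick(locked) for locked in [0, 0] + [1] * 10]
print(trace)
```

Dropping the lock input back to 0 reasserts reset once the change has passed through both flops, which is the behaviour a counter sampling the raw signal cannot guarantee.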
<benh>
_florent_: so I could just use user_rst as a clean synchronous source of reset then, great
<benh>
Hrm...
<benh>
took out our custom reset controller and just used user_rst out of litedram as the SoC reset (core reset delayed 64K clocks), and the problem still occurs
<benh>
fun .. :-)
<benh>
I'm even wondering whether there's a voltage drop when everything comes up... I'm powering the Arty off USB
<benh>
I should try an external psu at some point
<benh>
anyway, dinner time :)
<_florent_>
benh: is the behavior different at the first startup after loading the bitstream and with a manual reset?
scanakci has quit [Quit: Connection closed for inactivity]
<benh>
_florent_: we don't have a manual reset but I'll add one and test
<benh>
_florent_: we do have a manual soc/core reset but that doesn't reset litedram (somewhat on purpose)
<benh>
I'll try something later, let me first test what you committed and the system.h include on csr.h, I want to get that stuff done and dusted :)
<benh>
_florent_: is it safe to assume you'll eventually merge my rework-csr-accessors ?
<benh>
or rather csr-access-rework :)
<benh>
_florent_: so the whole inclusion of system.h gets a bit messy in the generated csr.h
<benh>
_florent_: my thinking is that in gen_csr_header, if with_access_functions, I'll just unconditionally include hw/common.h, which itself will include system.h
<benh>
_florent_: with my rework, hw/common.h will do the right thing vs. CSR_ACCESSORS_DEFINED
<benh>
_florent_: which is to define all the "new" fancy high level ones based on the simple ones and leave the simple ones to the platform
<_florent_>
benh: i'm not the best qualified to evaluate these changes, so if somlo, xobs are happy with it, i'm fine merging it
<xobs>
Sure, okay by me.
<benh>
_florent_: wait
<benh>
_florent_: let me merge the system.h addition into that patch
<benh>
and fix some leftover commented out code that was in there while at it
<benh>
xobs: somlo: Please re-check that csr-access-rework
<benh>
I've folded in the suggestion of including system.h to give the platform/cpu a chance to override CSR_BASE and CSR_ACCESSORS_DEFINED and provide inline simple accessors
<benh>
I'm hoping it won't break anything but I would appreciate your eyes (and possibly testing)
<benh>
_florent_: if that passes muster I think that's all I need for standalone litedram on microwatt, we're good
<benh>
_florent_: on another note...
<benh>
paul noticed that loading a cache line from litedram seems to take
<benh>
about 15 cycles for the first read and then about 12 cycles per read (64-bit)
<benh>
at the moment, my wb<->litedram bridge is a bit dumb. Each 64-bit read is a complete litedram cycle where I use either the top or bottom half of the data
<benh>
now I haven't looked too closely at how I could pipeline/stream the user port there ... would it be possible to send a single read command and then pump data out of the read port multiple times ?
<benh>
or it's just the speed I should expect and the best I can get might be to cache the other 64-bit of data coming in for the next access ?
<benh>
I've tried but I've found myself so far unable to understand the HW design from reading the python migen stuff :) It's ... very hard to parse for someone not experienced
<benh>
ie. do you need a CMD phase for each transfer ?
<benh>
is 12 cycles per transfer of 128 bytes something expected ?
<benh>
or can I pipeline N commands and separately do N data cycles to get the data ?
<benh>
(if yes, how much is N practically speaking ?)
<benh>
xobs: somlo: doing some more changes to that branch... I don't like that csr.h includes hw/common.h... it might make things harder for your etherbone cases...
<benh>
xobs: somlo: done. Pls check (and see my response on github)
<benh>
_florent_: I think I get the gist of it ... with pipelined wb I should be able to turn each read or write on the wb into a "command" to the native port
<benh>
_florent_: and separately handle the data... let's assume I keep write simple for now, ie, I send a write command and write data together when both ports are ready
<benh>
_florent_: that means for reads, I can send commands on each 'stb' from pipelined wb
<benh>
_florent_: and return data+ack on each valid i get on the read port completely independently
<benh>
_florent_: right ?
<benh>
_florent_: now, my wb is 64bit wide, the native port is 128... is there a gain in caching the last read data/address in my wrapper
<benh>
_florent_: to return the "other" half since that's typically the next thing happening on a cache line refill
<benh>
_florent_: or I may as well send another read command down to litedram ?
<benh>
if you don't have time to respond, I'll try this week-end to build some kind of test setup in verilog with a self-initializing litedram and a micron DDR model or something like that
<benh>
to experiment with
Dolu has joined #litex
<_florent_>
benh: the 12 cycles per transfer is because of the latency, when pipelining and in the ideal case (sequential accesses), you should be able to nearly write/read 128 bytes/cycle (at least with a DMA).
<_florent_>
benh: we are currently working on an adapter for the LiteDRAM native interface that will be able to do the data width adaptation for both reads and writes
<_florent_>
so you will be able to request a 64-bit port directly from LiteDRAM and data width conversion will be handled internally
<benh>
_florent_: ok. In the meantime, is my understanding of how the port works correct ?
<benh>
ie, setting aside the width issue
<benh>
_florent_: and ignoring writes for which I'll, for now, just wait for both cmd and write port to be ready as today
<benh>
_florent_: for reads, I can send all the commands as I get the stb's from the pipelined wb
<benh>
_florent_: and separately retrieve the data & send data & acks to the wb as I get the valids from the read port ?
<benh>
_florent_: ie, is my understanding correct that I always need a command per access, but I can pipeline a bunch of commands and do the data transfers from the read/write ports separately ?
<benh>
(in the order the commands were done of course, which gets messy if I mix up reads and writes but is nicely suitable for a string of reads such as a cache refill)
<_florent_>
yes it's correct, but there is no data buffering with the native interface, so it's possible you'll have to add Write/Read FIFOs
<_florent_>
for writes, once the command is accepted, write the data to the FIFO and LiteDRAM will request it when it is able to transfer it on the physical interface
<_florent_>
for reads, LiteDRAM is not handling the ready of the rdata stream (which would require buffering), so the data should be accepted when valid is set to 1
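The read flow agreed on above (one command per access, commands issued back to back, data taken in command order whenever valid is 1, with no back-pressure on the data) can be sketched in plain Python. This is a toy model, not LiteDRAM: `NativeReadPort`, `refill`, and the fixed `latency` of 12 cycles are assumptions made for illustration, and a real controller's latency varies with row hits, refresh, and arbitration.

```python
from collections import deque


class NativeReadPort:
    """Toy read port: fixed latency, in-order data, no ready on the data side."""

    def __init__(self, memory, latency=12):
        self.memory = memory
        self.latency = latency
        self.in_flight = deque()   # (cycle at which data turns valid, address)
        self.cycle = 0

    def tick(self, cmd_addr=None):
        """Advance one cycle, optionally accepting one read command.

        Returns the data if valid would be 1 this cycle, otherwise None.
        """
        if cmd_addr is not None:
            self.in_flight.append((self.cycle + self.latency, cmd_addr))
        data = None
        if self.in_flight and self.in_flight[0][0] == self.cycle:
            _, addr = self.in_flight.popleft()
            data = self.memory[addr]
        self.cycle += 1
        return data


def refill(port, addrs):
    """Issue one command per cycle and collect data as it turns valid."""
    pending = deque(addrs)
    out = []
    cycles = 0
    while len(out) < len(addrs):
        cmd = pending.popleft() if pending else None
        data = port.tick(cmd)
        cycles += 1
        if data is not None:
            out.append(data)   # a bridge would return data+ack to the master here
    return out, cycles


memory = {addr: addr * 10 for addr in range(4)}
data, cycles = refill(NativeReadPort(memory, latency=12), [0, 1, 2, 3])
print(data, cycles)
```

In this model four pipelined reads complete in 16 cycles, against 52 if each command waited for the previous data, which is the latency-versus-throughput point made above. The consumer always takes the data on the cycle it appears, matching the constraint that the read data must be accepted when valid is 1.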
<benh>
_florent_: ok so as a first step, writes are simple as I have the command and data as one 'stb' on the wb, I can just wait for both ready to be 1 before I send it
<benh>
_florent_: (for now)
<benh>
_florent_: for reads, I can arrange to always be ready.. the way pipeline wb works, I am in a cyc=1 cycle, I can always send ack+data back if I had commands
<benh>
_florent_: the master won't send commands if it can't handle the data
<benh>
_florent_: the only issue is the 64-bit vs. 128-bit but I can trivially handle that with a small latch until you have that sorted on your side
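The "small latch" idea for bridging a 64-bit wishbone to the 128-bit native port can be modelled as a one-entry cache of the last 128-bit word. A minimal sketch follows, with several assumptions labelled: `HalfLineLatch` and `read128` are hypothetical names, the lower 64 bits are assumed to sit at the lower address, and writes are handled only by invalidating the latch.

```python
class HalfLineLatch:
    """Serve 64-bit reads from a 128-bit port, keeping the last 128-bit word."""

    MASK64 = (1 << 64) - 1

    def __init__(self, read128):
        self.read128 = read128     # callable: 128-bit word index -> int
        self.tag = None            # index of the latched word, None if empty
        self.word = 0
        self.port_reads = 0        # number of accesses sent to the 128-bit port

    def read64(self, byte_addr):
        index = byte_addr >> 4               # 16 bytes per 128-bit word
        if index != self.tag:
            self.word = self.read128(index)  # miss: one full port access
            self.tag = index
            self.port_reads += 1
        upper = (byte_addr >> 3) & 1         # address bit 3 selects the half
        return (self.word >> (64 * upper)) & self.MASK64

    def invalidate(self):
        """Drop the latched word, e.g. on any write, to avoid stale data."""
        self.tag = None


low, high = 0x1111111111111111, 0x2222222222222222
backing = {0: (high << 64) | low}
latch = HalfLineLatch(lambda index: backing[index])
print(hex(latch.read64(0)), hex(latch.read64(8)), latch.port_reads)
```

Two sequential 64-bit reads of the same 128-bit word cost a single port access here, which is the gain being asked about for a cache line refill.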
<benh>
_florent_: improving writes might require a fifo indeed
<_florent_>
benh: it will probably not work for writes, since the command will be acked before the write, so you need to be sure you won't send additional commands while waiting for the write to be acked.
<benh>
_florent_: right, that's what I'm doing now already, it's not fast but works
<benh>
_florent_: my plan is to eventually turn our BRAM into an L2 (with an option to keep it linearly mapped so we can still use it as boot "firmware")
<benh>
_florent_: at which point I'll probably pipeline both reads and writes as whole cachelines with a direct 128-byte bus between litedram and L2
<benh>
but right now it's more than my spare time can cope with, maybe in the next few weeks...
<benh>
first, once you've merged all that csr gunk and I've pushed the microwatt updates to anton, I want to go back to baby sitting your LiteX/microwatt integration
<benh>
see if I can get the external interrupt going
<benh>
etc...
<_florent_>
that would be nice, i should spend time finishing the ghdl-synth/verilator support to ease this work (this would also help for litedram integration)
<benh>
definitely
<Finde>
benh: has anyone tried running the microwatt RTL through commercial tools? I tried to use vhdlan from vcs yesterday and it was really complaining a lot
<somlo>
benh, _florent_: fwiw, commit 1e35b0e7 (still) works OK for me :)
scanakci has joined #litex
shuffle2 has quit [Quit: WeeChat 2.6]
acathla has quit [Quit: segfault]
lambda has quit [Quit: WeeChat 2.8]
lambda has joined #litex
Skip has joined #litex
acathla has joined #litex
Skip has quit [Remote host closed the connection]
Dolu has quit [Quit: Leaving]
kgugala__ has joined #litex
kgugala has quit [Ping timeout: 256 seconds]
Skip has joined #litex
_franck_8 has joined #litex
_franck_8 has quit [Client Quit]
_franck_ has quit [Ping timeout: 265 seconds]
fjullien has joined #litex
fjullien is now known as _franck_
acathla has joined #litex
acathla has quit [Changing host]
<CarlFK>
bunnie: thanks.
kgugala has joined #litex
kgugala__ has quit [Ping timeout: 265 seconds]
darren099 has quit [Quit: Leaving]
Skip has quit [Remote host closed the connection]
<benh>
Finde: Vivado for sure ;-) I think Mikey tried at some point a commercial tool for verilog conversion and it was .. painful
<benh>
Finde: it looks like VHDL 2008 support in tools is somewhat flaky
<benh>
somlo: great thanks !
<CarlFK>
kgugala: http://paste.ubuntu.com/p/bV2nB8mysr/ application-specific initialization failed: couldn't load file "librdi_commontasks.so": libtinfo.so.5: cannot open shared object file: No such file or directory