<_florent_>
the Kintex7 DDR Phy + ASMICON is working ;) (at least Memtest is Ok)
<_florent_>
I'm using quarter rate commands : 200MHz mem clk and 50MHz command clk
<_florent_>
Read & Write leveling are not implemented but I will probably try to add it to run at higher frequencies.
<_florent_>
I'm now trying to run the ASMICON's ports at a multiple of the command frequency (to be able to run the CPU at more than 50 MHz...)
<_florent_>
I've added some clock enable logic in the ports, but I have a question about clock domains definition with migen:
<_florent_>
- Each Asmicon port will have its own clock domain (a multiple or not of the command frequency), let's say it uses "sys_clk" for now
<_florent_>
- Asmicon is using "asmicon_clk"
<_florent_>
My first attempt was to pass the clock_domain in the port parameter and use self.add_submodule(new_port, {"sys": clock_domain}) in the get_port function
<_florent_>
and use self.add_submodule(self.asmicon, {"sys": "asmicon"}) in the top
<_florent_>
But it seems the clock_domain renaming in the top is also renaming the port (since I'm using sys_clk for the port)
<_florent_>
lekernel: do you have an idea on how to do it?
<lekernel>
great!
<lekernel>
so you have 1:4 serialization for the commands, right?
<lekernel>
and 1:8 for data
<_florent_>
yes, like you were saying the other day
<_florent_>
it was easier in fact
<lekernel>
excellent
<_florent_>
and I'm using 4 phases
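The 1:4 and 1:8 figures above follow from the clock numbers quoted earlier. A minimal sketch, assuming data moves on both edges of the memory clock (double data rate) and that the memory clock is an integer multiple of the command clock; the function name is hypothetical:

```python
def serialization_ratios(mem_clk_hz, cmd_clk_hz):
    """Return (command ratio, data ratio) for a DDR PHY.

    Assumes the memory clock is an integer multiple of the command clock
    and that data is transferred on both memory clock edges.
    """
    if mem_clk_hz % cmd_clk_hz != 0:
        raise ValueError("memory clock must be a multiple of the command clock")
    cmd_ratio = mem_clk_hz // cmd_clk_hz   # command slots (phases) per command clock
    data_ratio = 2 * cmd_ratio             # two data transfers per memory clock
    return cmd_ratio, data_ratio


# 200MHz memory clock, 50MHz command clock, as in the discussion above
print(serialization_ratios(200_000_000, 50_000_000))
```

With these numbers the command ratio also matches the 4 phases mentioned above.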
<lekernel>
with 8 bank machines?
<_florent_>
hmm, I have to check
<_florent_>
I just made a simple change in the multiplexer to support 4 phases
<lekernel>
DDR3 has 8 banks (as opposed to 4 in DDR), so I would assume you ran into that
<lekernel>
are all 8 banks working, or are you just using 4? ;)
<_florent_>
that's what I have to check :)
<_florent_>
at least I've changed ba width from 2 to 3
<lekernel>
clock domain remapping in add_submodule does the remapping for the module and all its submodules, yes
<_florent_>
ok thanks
<lekernel>
and btw you can use the shorter form: {"sys": foobar} => foobar
<lekernel>
sys is implied by default
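The recursive remapping described above can be illustrated with a toy model. This is a sketch only: ToyModule and its methods are hypothetical and are not Migen's API. It assumes a rename given at add_submodule time is applied to the module and everything already below it, and that a bare string is shorthand for {"sys": name}:

```python
class ToyModule:
    """Hypothetical stand-in for a module with one clock domain (not Migen)."""

    def __init__(self, name, domain="sys"):
        self.name = name
        self.domain = domain
        self.submodules = []

    def add_submodule(self, module, rename=None):
        # A bare string is treated as shorthand for {"sys": rename}.
        if isinstance(rename, str):
            rename = {"sys": rename}
        if rename:
            module.rename_domains(rename)
        self.submodules.append(module)

    def rename_domains(self, rename):
        # The rename applies to this module and to all of its submodules.
        self.domain = rename.get(self.domain, self.domain)
        for sub in self.submodules:
            sub.rename_domains(rename)

    def domains(self):
        result = {self.name: self.domain}
        for sub in self.submodules:
            result.update(sub.domains())
        return result


asmicon = ToyModule("asmicon")
asmicon.add_submodule(ToyModule("port_sys"))            # left in "sys"
asmicon.add_submodule(ToyModule("port_cpu"), "cpu")     # moved to a distinct name first
top = ToyModule("top")
top.add_submodule(asmicon, "asmicon")                   # remaps "sys" everywhere below
result = top.domains()
print(result)
```

In this toy model the port left in "sys" is dragged into "asmicon" by the top-level rename, while the port that was already given a distinct domain name is untouched. Whether the same workaround applies to the real add_submodule is not confirmed by the discussion.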
<lekernel>
I wonder if ASMI can really meet timing when you have a lot of ports... I'm bumping into issues on the slowtan6 video mixer atm
<_florent_>
for now my port won't run asmi at more than 50MHz, so it should be ok
<lekernel>
maybe I need to simplify the architecture a bit, eg just have a crossbar switch to the parallel bank machines
<lekernel>
this can also make it easier to have multiple memory controllers
<_florent_>
what is the critical path on the asmi?
<lekernel>
the hub management
<lekernel>
so I want to replace that hub with a crossbar switch, and not have split transactions anymore
<_florent_>
it can also be interesting to have ports with integrated async fifo and / or bus width adaptation
<lekernel>
there can't be page hit optimization reordering anymore, too - only read/write turnaround minimization reordering, and parallel bank commands
<lekernel>
hmm, the problem is - how do you make that async fifo generic enough?
<lekernel>
how would you apply that to e.g. a framebuffer?
<lekernel>
run all the logic on the pixel clock?
<lekernel>
with just two async fifos into the system clock domain to send memory read commands and get the results?
<_florent_>
the idea was just to be able to have ports with different frequency than the asmicon
<_florent_>
instead of having the fifo in the framebuffer as it is now, have it running @ pixel_clk
<lekernel>
different frequencies = more latency, more chances for non deterministic bugs that maximize time wastage, simulation difficulties that maximize time wastage even more
<lekernel>
and how does the framebuffer communicate with the cpu?
<lekernel>
to set scan address, video timing parameters, etc.
<_florent_>
yes on this point you have to resynchronize all signals
<lekernel>
have some clock domain transfer support in csrgen?
<_florent_>
why not ;)
<lekernel>
yeah, could work...
<lekernel>
but
<lekernel>
let's say we want 1080p
<lekernel>
then we have to run relatively large amounts of logic at 148MHz, which is, as I know so well, a royal pain in slowtan6
<_florent_>
yes but the framebuffer is maybe not a good example for that
<_florent_>
if you have 1 Asmicon + N totally independent cores that need memory accesses
<lekernel>
I'd try to run everything on one single clock. minimizes memory latency and headaches.
<_florent_>
my idea was that it's easier to have a generic port that can run at the core frequency instead of doing all clock domain crossing directly in each core
<lekernel>
why do all those cores need different clock domains?
<larsc>
because they can ;)
<_florent_>
;)
<_florent_>
Imagine you have a video multiplexer, 2 SD inputs, 2 HD inputs, 2 SD outputs, 2 HD outputs, you want to be able to redirect each SD input to each SD output, same for HD, I find it easier to have async ports than to have to handle clk domain transfer in each port
proppy has quit [Remote host closed the connection]
<_florent_>
but anyway, for now I only want to be able to run the CPU at more than 50 MHz
<lekernel>
I'd rather implement read/write leveling than waste time on hacking asynchronous ASMI ports
<lekernel>
WL should be easy if Xilinx got the calibrated IODELAYS right in the 7 series
<lekernel>
for RL you just need to use DQS for reading
<lekernel>
I recommend you do a small soft FIFO that can store two bursts (ie 16 bits deep)
<lekernel>
then for data recapture just read the FIFO with the worst-case delay
<lekernel>
(read in the system clock domain)
<lekernel>
there's only one annoying detail, you won't do a FIFO with DDR registers
<lekernel>
so you need an IDDR
<lekernel>
and the last data pair will get stuck in it
<lekernel>
to solve this I propose the controller issues one dummy read whenever there is a "bubble" in the read flow, to make the DDR toggle DQS and clock the data out of the IDDR and into the FIFO
<lekernel>
the dummy read is easy, just repeat the last read command immediately - it's guaranteed to be a page hit that will produce a continued burst
<_florent_>
ok thanks, I remember we discussed that before, now that I have something working on board, it will be easier to work on it
lekernel has quit [Ping timeout: 256 seconds]
bhamilton has left #milkymist [#milkymist]
proppy has joined #milkymist
aeris has quit [Read error: Connection reset by peer]
aeris has joined #milkymist
lekernel has joined #milkymist
<lekernel>
)=(//(! thunderstorm
Alarm_ has joined #milkymist
<lekernel>
I have Pearson correlation coefficients of 0.56, 0.53 and 0.78 between wer0/wer1, wer1/wer2 and wer2/wer0
<lekernel>
7K samples
<Alarm_>
I do not see Xiang fu on IRC and he does not respond by email?
<larsc>
lekernel: what are werX?
<lekernel>
number of noncontrol words with too many transitions received during the last 2**24 words
<lekernel>
X = channel number
<larsc>
ah
<lekernel>
could be clock skew/glitches/failure?
<larsc>
Alarm_: <@qi-bot> larsc, xiangfu (~xiangfu@123.113.243.136) was last seen quitting #qi-hardware 12 hours 54 minutes ago
<lekernel>
s/skew/jitter
<Alarm_>
larsc: OK thanks
<larsc>
lekernel: so this means there is quite a bit of correlation, right?
<lekernel>
I'm not a statistics expert, but I'd think so
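For reference, the Pearson coefficient quoted above can be computed as in the following sketch. The function name and the sample values are made up for illustration; they are not the logged wer0/wer1/wer2 data:

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up error counts for two channels, for illustration only
wer_a = [0, 3, 1, 7, 2, 9]
wer_b = [1, 2, 0, 8, 1, 6]
print(pearson(wer_a, wer_b))
```

A value near 1 means the two channels tend to see errors at the same time, which is what points at a shared cause such as the clock; values of 0.5 to 0.8 indicate a partial but clear common component.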
bhamilton has joined #milkymist
bhamilton has quit [Quit: Leaving.]
<lekernel>
lol: removing the DCM_CLKGEN, which supposedly provides better clock jitter tolerance than the PLL, results in WER=0 and no more picture noise
<lekernel>
guess I just have to sort out the memory speed issues now, and the video mixer will be perfect :)
<wpwrak>
just don't use anything xilinx recommends :)
<wpwrak>
regarding the correlation, we already know there's a strong long-term correlation (the temperature dependency)
<lekernel>
that was correlation between the error rates on each channel, which suggested a clock problem
<lekernel>
or some other source of noise that would affect them all at the same time
<wpwrak>
yes. the temperature issue showed that too (without explaining the underlying problem, though)
<lekernel>
oh, I can already do 720p at WER < 5 :)
<lekernel>
1280x720
<lekernel>
not bad for that add on board
<wpwrak>
add a negative DCM_CLKGEN and it'll be perfect ;-)
<wpwrak>
indeed. you're way above any frequency such a contraption can reasonably be expected to handle
bhamilton has joined #milkymist
Alarm_ has quit [Quit: ChatZilla 0.9.90 [Firefox 21.0/20130511120803]]
<lekernel>
hmm, perhaps I can even have 1080p24 on the inputs - which is a HDMI standard - and 1080p60 at the output
<lekernel>
that's a bit more than 8Gbps memory bandwidth, challenging but maybe doable
<lekernel>
so I guess next step is to fix ASMI
<lekernel>
ah, no it's 12Gbps bandwidth. won't work :(
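A rough way to budget memory bandwidth per video stream is sketched below. It assumes 32 bits per pixel in memory and counts active pixels only; both are assumptions, since the log does not say how the 8Gbps and 12Gbps figures were derived or how many streams and passes they include. The function name is hypothetical:

```python
def stream_gbps(width, height, fps, bits_per_pixel=32):
    """Bandwidth of one video stream in Gbit/s, active pixels only."""
    return width * height * fps * bits_per_pixel / 1e9


# One 1080p60 stream and one 1080p24 stream at the assumed 32 bpp
out_1080p60 = stream_gbps(1920, 1080, 60)
in_1080p24 = stream_gbps(1920, 1080, 24)
print(out_1080p60, in_1080p24)
```

Each frame buffer written by an input and each one read for the output adds its own stream to the total, so the sum grows quickly once several layers are read at the output frame rate.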
<lekernel>
maybe if I output 1080p30 or 24 - don't know if monitors accept it from VGA ...