<promach3>
For DDR memory, what are the differences between WRAP, WRAPS4, WRAPS8 ?
<mwk>
burst length
<mwk>
this looks like DDR3
<mwk>
DDR3 is an 8n prefetch architecture, i.e. you basically have to read/write stuff in units of 8 transfers (4 clocks ×2 because of DDR)
<mwk>
DDR is 2n (2 transfers), DDR2 is 4n (4 transfers)
<mwk>
but 8 transfers is a pretty unwieldy size, and often you don't want this much data
<mwk>
you cannot do a smaller transfer internally because of how the device is built, so there always have to be at least 4 cycles between subsequent transfers for a single device
<mwk>
but, the long burst is annoying not only because it limits a single device, it also limits the whole bus
<promach3>
mwk: 8 transfers are of how many bits ?
<mwk>
so DDR3 adds an option of "burst chop to 4" where only 4 transfers actually happen, and the remainder are implicit
<mwk>
the advantage of this is that it doesn't take up the bus so you can turn it around to another memory device quicker, or do a write-to-read / read-to-write turn quicker
<promach3>
what do you mean by implicit ??
<mwk>
promach3: of however many bits of width your memory bus has
<mwk>
the exact width isn't prescribed by DDR spec
<mwk>
if you're using a single module, it's going to be 4, 8, or 16 bits per transfer
<mwk>
and you can tell by looking at number of DQ pins in the pinout
<promach3>
huh ? the webpage link said Width: X16
<mwk>
ah, good
<mwk>
then it's 16 bits per transfer
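Aside, not from the original log: a minimal sketch of the arithmetic behind the "how many bits" question, assuming the x16 part mentioned above. The function name is made up for illustration.

```python
def burst_size_bits(dq_width_bits: int, transfers: int = 8) -> int:
    """Bits moved on the bus by one burst: DQ width times number of transfers."""
    return dq_width_bits * transfers

# x16 DDR3 device, full burst of 8 transfers
full = burst_size_bits(16)
# same device with burst chop: only 4 transfers appear on the bus
chopped = burst_size_bits(16, 4)

print(full, "bits =", full // 8, "bytes")        # 128 bits = 16 bytes
print(chopped, "bits =", chopped // 8, "bytes")  # 64 bits = 8 bytes
```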
<promach3>
ok
<mwk>
as for implicit
<mwk>
this means that the chopped 4 transfers don't actually happen on the bus, but are just handled internally to the device and so still take up time (and you cannot do other transfers using the same device until it is done)
<promach3>
Why would some user waste such 4 transfers when it does not free the bus ?
<mwk>
it does free the bus for other devices, or for a bus turnaround
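Aside, not from the original log: a rough model of the point being made, that a chopped burst frees the shared data bus sooner but does not let the same device accept its next column command any earlier. The 4-clock column-to-column spacing is the DDR3 tCCD value. The function and field names are hypothetical, and real turnaround timing involves more parameters than this.

```python
T_CCD = 4  # DDR3 minimum spacing between column commands to one device, in clocks

def burst_cost(transfers_on_bus: int) -> dict:
    """Clocks the shared DQ bus is busy vs. clocks the device itself is tied up."""
    return {
        "bus_clocks": transfers_on_bus // 2,  # two transfers per clock (DDR)
        "device_clocks": T_CCD,               # unchanged by the chop
    }

bl8 = burst_cost(8)  # full burst of 8
bc4 = burst_cost(4)  # burst chop to 4
print("BL8:", bl8)
print("BC4:", bc4)
```

In this model the chop saves two clocks of bus time, which is what another device or a read/write turnaround can use.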
<promach3>
ok, What do I need to pay attention to for the DDR3 memory controller state diagram transition ?
<mwk>
... you could start by saying what exactly are you doing
<mwk>
writing a DDR3 controller?
<promach3>
yes, I am writing a DDR3 memory controller
<mwk>
well that's going to take some time
<promach3>
I have time
<mwk>
and the answer is: all of them, of course
<mwk>
I mean you can skip the power-down states if you don't need them
<mwk>
maybe even write leveling and other calibration if you intend to use your module way below its specced max clock
<promach3>
What is the purpose of write levelling ?
<mwk>
but in general, making a DDRx memory controller is a long ordeal
<mwk>
the bigger the number after "DDR", the longer, exponentially
<mwk>
you know what signal skew is, right?
<promach3>
clock skew ?
<mwk>
that too
<mwk>
except it's a different thing
<mwk>
okay, seriously, writing DDR controllers is *hard* and you'll get into all kinds of annoying physics laws that come up when signals get fast enough, plus you get to track a crapload of timings relating to the memory so that you don't accidentally send an invalid command
<mwk>
so... yeah, signal skew
<promach3>
it seems difficult
<mwk>
the problem is that your signals on your PCB, even if you transmit them at the same time, don't actually get to the destination device at the exact same moment
<mwk>
this effect doesn't matter for slow signals all that much so it tends to be ignored, but with DDR3 data rates it's very real and very annoying
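Aside, not from the original log: a back-of-the-envelope sketch of why skew matters at these speeds. The 6.7 ps/mm trace delay is an assumed ballpark for ordinary PCB material, not a figure from the conversation, and DDR3-1600 is picked only as an example speed grade.

```python
def bit_time_ps(data_rate_mtps: float) -> float:
    """Duration of one transfer in picoseconds, given the data rate in MT/s."""
    return 1e6 / data_rate_mtps

def skew_ps(length_mismatch_mm: float, delay_ps_per_mm: float = 6.7) -> float:
    """Arrival-time difference caused by a trace length mismatch.
    delay_ps_per_mm is an assumed ballpark, not a measured value."""
    return length_mismatch_mm * delay_ps_per_mm

ui = bit_time_ps(1600)   # DDR3-1600: 625 ps per transfer
skew = skew_ps(10)       # 10 mm of mismatch between two traces
print(f"bit time {ui:.0f} ps, skew {skew:.0f} ps, {100 * skew / ui:.1f}% of a bit")
```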
<promach3>
mwk: I guess orangecrab does not have this issue
<mwk>
do you know what DQS is?
<promach3>
strobe
<mwk>
right
<mwk>
basically DDR has given up on using just one common clock for all signals
<mwk>
instead there are many
<mwk>
nominally in sync, but they differ in delays
<mwk>
everything in DDR is source-synchronous; the command+address bus is synchronous to the actual main CLK, but the data pins are synchronous to their associated DQS
<mwk>
and DQS is driven by whatever chip is currently driving data
<mwk>
write leveling is the process of calibrating the per-pin delays in the controller so that everything that comes out of the controller (CLK, DQS, DQ) is aligned to each other properly when it reaches memory
<mwk>
it's called "write" leveling because it affects the write path: DQ from controller to memory
<mwk>
there is likewise read leveling, which is calibration of *input* delays in the controller so that reads work right
<mwk>
write leveling is specifically CLK/DQS alignment; you enter a special mode in which the module mirrors incoming DQS at you so that you can find the delay value that works best
<mwk>
there is also write training, which in turn involves aligning DQ to DQS
<mwk>
it's done by experimentally writing to a scratch register on module and seeing if it works right
<promach3>
mirrors incoming DQS ???
<mwk>
and you have to do it separately for every pin of every module in the system; while it is technically a DDR3 requirement, you can get away with skipping it if your clock frequency is slow enough
<mwk>
esp. in a single-module system that doesn't have the fly-by routing problems
<mwk>
yea, the module enters a special mode where it just sends received DQS back at you using the data lines
<mwk>
... or something like this, I don't remember the details
<promach3>
wait, I do not quite understand the "mirror" thing
<promach3>
why do we need "mirrored" DQS for write levelling ?
<zyp>
it's a test mode
<promach3>
huh ?
<zyp>
to test the delay you tell the module to send back the incoming signal as is, so you can measure how long it takes from when you send it until you see it return
<jjeanthom>
promach3, for which FPGA family are you writing a DDR3 controller?
<promach3>
jjeanthom: orangecrab
<promach3>
zyp: a loopback ?
<zyp>
yes
<promach3>
but what is the purpose of such specific loopback delay test ?
<zyp>
to measure the delay
<zyp>
you can't correct for the delay unless you know what it is, and you won't know it without measuring
<mwk>
it's more complicated than just sending it back; what really happens in write leveling mode is that the module captures the value of CK on rising DQS edge and sends back *this* latched value
<mwk>
so you can measure the delay in only one direction
<mwk>
but yeah, the general idea is to measure delay
<mwk>
you start from aligned DQS/CK, you skew it in both directions and measure when the captured value changes from 0 to 1
<mwk>
and then you have an idea of what the skew is and can compensate for it
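Aside, not from the original log: a small simulation of the sweep described above, under simplifying assumptions. The memory samples CK on each rising DQS edge and reports the latched bit. The controller steps a DQS output delay and looks for the tap where the reported bit flips from 0 to 1. All delays, the tap size and the function names are hypothetical. This sketch sweeps in one direction only, and real hardware reads the bit back on DQ and moves the edge with device-specific delay primitives.

```python
def ck_level_at(t_ps, ck_arrival_ps, period_ps):
    """Level of CK at the memory at time t: high for the first half of each period."""
    phase = (t_ps - ck_arrival_ps) % period_ps
    return 1 if phase < period_ps / 2 else 0

def write_level(sample, num_taps):
    """Sweep the DQS delay tap; return the first tap where the sample goes 0 -> 1."""
    prev = sample(0)
    for tap in range(1, num_taps):
        cur = sample(tap)
        if prev == 0 and cur == 1:
            return tap
        prev = cur
    return None  # no transition found within the delay range

# Hypothetical board: numbers chosen only for illustration
PERIOD_PS = 2500      # 400 MHz CK
TAP_PS = 25           # delay added per tap
CK_FLIGHT_PS = 900    # CK arrives at the memory this late
DQS_FLIGHT_PS = 400   # DQS arrives earlier than CK

def sample(tap):
    """What the memory would report for a given DQS delay tap."""
    dqs_edge = DQS_FLIGHT_PS + tap * TAP_PS
    return ck_level_at(dqs_edge, CK_FLIGHT_PS, PERIOD_PS)

tap = write_level(sample, num_taps=128)
print("aligned at tap", tap, "=", tap * TAP_PS, "ps of added DQS delay")
```

With these made-up numbers the flip lands where the added delay equals the 500 ps gap between the CK and DQS flight times.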
<promach3>
mwk: I think I got your idea
<promach3>
how to measure ? using oscilloscope ?
<promach3>
is it possible to have edge detection using FPGA ?
<mwk>
promach3: please just read about how it's supposed to work; I don't have time to stream the whole DDR3 standard to you using IRC
<promach3>
ok
<mwk>
but the general idea is that detection is easy, just use normal FFs to capture the value and see if it's 0 or 1
<mwk>
skewing the output is the hard part, and it involves FPGA-specific delay elements
<mwk>
don't know much about ecp5; on xilinx you'd use one of the many IODELAY primitives
<promach3>
why need to skew the signals from FPGA side ?