finsternis has quit [Excess Flood]
finsternis has joined ##openfpga
rohitksingh has joined ##openfpga
<hackerfoo> ZirconiumX: Currently non-existent source code is also undocumented and without examples :)
<hackerfoo> It might be easier to add documentation and examples to LiteDRAM.
<ZirconiumX> First I need to understand LiteDRAM. Writing the controller gives me an excellent understanding of it that I wouldn't have as an outsider.
<hackerfoo> We (Symbiflow) are using it to build an SoC.
<hackerfoo> ZirconiumX: Maybe this could help:
<ZirconiumX> No, it's not helpful, because it's 11,000 lines of Migen-generated Verilog instead of the original Migen code, hackerfoo
<hackerfoo> Ah. Maybe I can find the Migen code.
rohitksingh has quit [Ping timeout: 245 seconds]
Morn_ has quit [Ping timeout: 276 seconds]
Morn_ has joined ##openfpga
genii has quit [Quit: Welcome home, Mitch]
juri__ has joined ##openfpga
juri_ has quit [Ping timeout: 268 seconds]
Morn_ has quit [Ping timeout: 276 seconds]
Morn_ has joined ##openfpga
snappy has quit [Quit: WeeChat 2.5]
lutsabound has quit [Quit: Connection closed for inactivity]
lopsided98 has quit [Quit: Disconnected]
lopsided98 has joined ##openfpga
dh73 has quit [Quit: Leaving.]
dh73 has joined ##openfpga
dh73 has quit [Client Quit]
OmniMancer has joined ##openfpga
nrossi has joined ##openfpga
Jybz has joined ##openfpga
Jybz has quit [Quit: Konversation terminated!]
rohitksingh has joined ##openfpga
<_florent_> ZirconiumX: that's true that LiteDRAM is not heavily documented and probably not that easy to start with, that's something i'm aware of and that i want to improve
<_florent_> Most of the designs are using it directly integrated with LiteX (so using the Migen code, as the example provided by hackerfoo)
<_florent_> but that's also possible to use the generator to configure the core and create a standalone verilog core that you can just reuse as any other verilog core in your design
<_florent_> for example, in, LiteDRAM is used as a standalone core:
<_florent_> i'm happy to help you getting started if you want to use it and explain how it works.
<_florent_> But i can also understand you want to try to write your own :) (but that's not something that can be done in a few days: just getting the 7-Series PHY working, doing the DDR3 initialization, read/write leveling already took a few weeks of effort and that's not even the "controller" itself)
<hackerfoo> It's also easier to write your own when you have a working example to compare against.
<kc8apf> _florent_: I'm trying to comprehend LitePCIe and the lack of docs and comments is painful
<_florent_> kc8apf: yes i know, sorry for that, that's really something i want to improve on the different cores (any help is also welcome :) or even just feedback on the difficulties understanding/using them)
<kc8apf> I _think_ I understand LitePCIe's DMA but there is a lot going on
<kc8apf> any notes on that would be very helpful when I get back to this project next week
<_florent_> kc8apf: ok, i'll add documentation to it this week
<kc8apf> thank you
<kc8apf> I'll try to write things up as I figure them out and send PRs as well
<kc8apf> i'm off for the night
<_florent_> ok thanks, good night
<ZirconiumX> _florent_: my goal here is to use nMigen and pysim, so serialising a LiteDRAM core to Verilog is unhelpful
<_florent_> ZirconiumX: so you want to be able to simulate your DDR3 controller in nMigen?
<ZirconiumX> Or alternatively to use yours
<ZirconiumX> But yeah
<_florent_> the difficulties with simulations are that: 1) you need a model of your DDR3 (i generally use Micron's ones) 2) you need models or simulation libraries of the primitives you are using in the FPGA
<keesj> kc8apf may I suggest you start documenting the thing yourself and ask questions when things are not clear? I am also not a pro but in some cases I might be able to help
<_florent_> so pure nMigen simulation is difficult for that (you could create models in nMigen, but then you are checking things against your own understanding of how it should work)
<ZirconiumX> Perhaps, but my target here is the ECP5, which as I understand it already has a good simulation library
<_florent_> is a simplified Migen model of a SDRAM, but it's not currently supporting DDR
<mwk> ZirconiumX: no, you need a model of the *RAM*
<ZirconiumX> I can read, mek
<ZirconiumX> *mwk
<mwk> ugh
<mwk> it seems I can't :p
<mwk> sorry
<ZirconiumX> Go get some coffee :P
<mwk> that's a good idea
<_florent_> ZirconiumX: you could probably use Verilator for your simulation, but i don't know if nMigen already supports it natively
<ZirconiumX> _florent_: it does not.
<mwk> on the bright side, I finally managed to get up at a reasonable-ish combination of (time, time spent sleeping) today, for the first time in several weeks
<ZirconiumX> I could use Verilator, but then that involves dumping the nMigen/oMigen code to Verilog and then turning it to C++, and it gets increasingly indirect.
<mwk> let's hope this lasts
<ZirconiumX> Congrats, mwk
<_florent_> ZirconiumX: in your case, if you want to use LiteDRAM, i would just do the simulation of your system with an nMigen memory model that has a Wishbone/AXI interface, and only use the LiteDRAM core when you are targeting hardware, but the behavior will not be exactly what you'll have on hardware (bandwidth, latency, ...)
<ZirconiumX> That's probably going to be the mother of all simulation/synthesis mismatches.
<mwk> *thinking* speaking of simulation models, I want a verilog sim model of human sleep, so that I can plug in the sleep deprivation numbers for the last week and yesterday's sleep binge and see if I'm actually close to getting it right
<mwk> ah well, probably going to screw up soon anyway
<mwk> [end of mwk's pre-coffee morning thoughts]
<_florent_> ZirconiumX: it seems you are on your own then :) (since i don't have anything better to suggest)
<daveshah> FYI, the FOSS simulation library for ECP5 is not very good
<daveshah> I need to work on it, right now none of the DDR3 primitives have models
<daveshah> The vendor models aren't really Verilator compatible afaik, but this might be unavoidable for some of the DDR3 stuff due to delays etc
<OmniMancer> Are the yosys equivalence checking commands for showing that one design is equivalent to another?
juri__ has quit [Ping timeout: 240 seconds]
<daveshah> OmniMancer: yes, although don't expect them to work miracles
<daveshah> You can do it either with miter and sat or the equiv_ passes
juri_ has joined ##openfpga
<daveshah> If you want to test a Yosys pass then there is equiv_induct
<daveshah> *equiv_opt
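[editor's note] For readers unfamiliar with the miter approach daveshah mentions: a miter feeds the same inputs to both designs and checks that their outputs never differ. The Python sketch below brute-forces that check for small combinational functions. It only illustrates the idea; it is not how Yosys implements `miter`/`sat` or the `equiv_` passes, and every function name in it is made up for the example.

```python
from itertools import product

def miter_counterexample(f, g, n_inputs):
    """Return an input tuple where f and g differ, or None if they always agree."""
    for bits in product((0, 1), repeat=n_inputs):
        # The miter output is the XOR of the two design outputs.
        if f(*bits) ^ g(*bits):
            return bits
    return None

# Hypothetical "gold" and "gate" implementations of a 2:1 mux.
def mux_gold(sel, a, b):
    return b if sel else a

def mux_gate(sel, a, b):
    return ((1 - sel) & a) | (sel & b)

# A deliberately wrong version that forgets to mask `a` with the select.
def mux_broken(sel, a, b):
    return a | (sel & b)

print(miter_counterexample(mux_gold, mux_gate, 3))
print(miter_counterexample(mux_gold, mux_broken, 3))
```

Exhaustive enumeration only scales to a handful of inputs, which is why real tools hand the miter to a SAT solver instead.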
<OmniMancer> In that context what does structural equivalence mean?
scream has quit [Write error: Broken pipe]
jfng has quit [Read error: Connection reset by peer]
swedishhat[m] has quit [Remote host closed the connection]
henriknj has quit [Remote host closed the connection]
xobs has quit [Read error: Connection reset by peer]
<OmniMancer> daveshah: how does nextpnr deal with global clock meshes?
pepijndevos[m] has quit [Write error: Connection reset by peer]
synaption[m] has quit [Remote host closed the connection]
<daveshah> OmniMancer: just as wires and pips like anything else
<daveshah> You might need a custom pass to promote them, and possibly route them correctly in some cases
<daveshah> But it's usually not much code
<OmniMancer> so does custom code try to put clock like nets on them?
<daveshah> Yes
<daveshah> Although because of how Trellis works it's a bit more complex than it needs to be
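[editor's note] A rough idea of what a "promote clock-like nets" pass could look like, written as standalone Python rather than against the nextpnr API. The netlist representation, the `CLK` port name, and the limit of 8 globals are all assumptions for illustration.

```python
def pick_global_nets(nets, clock_ports=("CLK",), max_globals=8):
    """Choose which nets to promote onto global clock routing.

    nets maps a net name to a list of (cell_name, port_name) sinks.
    A net counts as clock-like if at least one sink is a clock port.
    """
    candidates = []
    for name, sinks in nets.items():
        clock_sinks = sum(1 for _cell, port in sinks if port in clock_ports)
        if clock_sinks > 0:
            candidates.append((clock_sinks, name))
    # Highest clock fanout first; the name breaks ties deterministically.
    candidates.sort(key=lambda item: (-item[0], item[1]))
    return [name for _count, name in candidates[:max_globals]]

example_nets = {
    "clk": [("ff0", "CLK"), ("ff1", "CLK")],
    "clk2": [("ff2", "CLK")],
    "d": [("ff0", "D")],
}
print(pick_global_nets(example_nets))
```

A real pass would also have to rewire the chosen nets through whatever global buffer cell the architecture provides, which this sketch leaves out.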
m4ssi has joined ##openfpga
<OmniMancer> does ECP5 have the global reset net?
<daveshah> Yes, that is dealt with by Yosys, but only if a user uses the GSR primitive at the moment
<OmniMancer> ah okay
<OmniMancer> I think in the Eagle you can drive the CE and SR signals from some of the 16 global clock signals in the per-quadrant trees
<daveshah> ECP5 has something like that as well as a special GSR, but I was having some weird issues actually making it work
<OmniMancer> If I read the datasheet correctly as well there are some blocks for gating clocks in a glitch free manner too
<OmniMancer> Roughly how many LUTs are needed for a riscv that can run linux?
<daveshah> Dolu was hoping that SaxonSoc could fit on an ice40
<daveshah> Not sure what it actually needs but 12k ish LUTs should definitely be enough
<OmniMancer> Cool, should fit in an EG4S20 then
<OmniMancer> Can Linux boot in 8MB of RAM these days?
<daveshah> I saw a Linux demo on the K210, so yes
<daveshah> That was nommu. Not sure how using the mmu affects memory footprint
<OmniMancer> Oh interesting
<OmniMancer> do the FPGA SoCs that run linux typically have an MMU?
<daveshah> VexRiscv does, yes
<daveshah> In fact you may have to use the mmu for VexRiscv because I don't know if 32 bit nommu is working yet
<daveshah> I have always used the mmu
<OmniMancer> I think the only reason the k210 doesn't use an mmu is that the core is using an earlier version of the RISC-V spec or something so it doesn't match current requirements
<_florent_> kc8apf: re LitePCIe: i hope it will be less painful with
henriknj has joined ##openfpga
jfng has joined ##openfpga
xobs has joined ##openfpga
pepijndevos[m] has joined ##openfpga
swedishhat[m] has joined ##openfpga
synaption[m] has joined ##openfpga
scream has joined ##openfpga
<OmniMancer> daveshah: is there a way to get the generic backend to consider a wire driven by a constant?
<daveshah> No, there isn't
<daveshah> It will insert LUTs as constant drivers
<daveshah> This is a possible improvement in the future
<OmniMancer> so there is no point telling it about the gnd driver then
<OmniMancer> is there any generic tool for deciphering the weird SoP formulas into LUT bits?
AndrevS has joined ##openfpga
<mwk> I had this idea of making two dummy global cells of types GND and VCC driving const wires some dummy tile
<mwk> then represent const wires everywhere else as pips from these two wires
<daveshah> That's exactly what I do in my Xilinx nextpnr support
<daveshah> It's a bit more icky, because it turns out that having millions of pips from one wire makes routing slow
<mwk> right, that's exactly what I wanted to do as well
<daveshah> So it is actually a two-layer tree of pips
<mwk> oh
<mwk> ugh.
<daveshah> effectively, one from the global wire to row-wide wires
<mwk> but oh well, whatever works
<daveshah> then another from the row wires to actual const drivers
<daveshah> I know it's horrible, deduplication adds complexity here too
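[editor's note] The two-layer tree described above can be sketched as follows. This is a toy model with one constant driver per tile and invented wire names; a real device has many constant sinks per tile, which is where the "millions of pips" come from. It is not taken from the Xilinx nextpnr code.

```python
def flat_fanout(width, height):
    """Pips leaving the single global constant wire if every tile connects directly."""
    return width * height

def two_layer_fanout(width, height):
    """Largest per-wire pip count with a global -> row wire -> tile tree."""
    global_to_rows = height   # one pip per row-wide wire
    row_to_tiles = width      # each row wire fans out to the tiles in its row
    return max(global_to_rows, row_to_tiles)

def build_two_layer_pips(width, height, const="GND"):
    """Return (src_wire, dst_wire) pairs for the two-layer tree."""
    pips = []
    for y in range(height):
        row_wire = f"{const}_ROW_Y{y}"
        pips.append((f"{const}_GLOBAL", row_wire))
        for x in range(width):
            pips.append((row_wire, f"X{x}Y{y}/{const}_DRIVER"))
    return pips

print(flat_fanout(100, 200), two_layer_fanout(100, 200))
```

The total number of pips barely changes; what shrinks is the number the router has to expand from any single wire.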
<pie_> wat <azonenberg> so there's actually two layers of silicon with nanoscale interconnects, not pcb traces, connecting the adjacent dies
<azonenberg> pie_: think of it as a tiny PCB made out of silicon
<azonenberg> where each "IC" is a flip chip bumped die
<azonenberg> no active components on the carrier die
<azonenberg> just wiring
<OmniMancer> daveshah: does it make routing slow because the routing has to consider the pips?
<daveshah> yeah, it has to search all of them
<daveshah> the solution would be to route these signals backwards
<daveshah> but that has its own problems too
<pie_> wow how do i keep ending up like a week back in scroll
<pie_> azonenberg: i guess that makes sense
<pie_> still crazy :p
<OmniMancer> daveshah: could you provide that signal as a prerouted macro? :P
<daveshah> No, that wouldn't help
<daveshah> because you wouldn't know where it is actually used
<OmniMancer> oh is it trying to route forward from gnd to where it needs to be connected?
<OmniMancer> I can see how that won't really do well yes
<daveshah> Yes, this is normally the best way to route stuff
<OmniMancer> and backwards is fine for this since the "routing" is entirely local actually
<daveshah> Yeah
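[editor's note] To make the forward-versus-backward point concrete, here is a toy breadth-first search over a made-up graph with one GND wire fanning out to 1000 tiles. nextpnr's routers are not plain BFS, so treat the numbers as an illustration of the asymmetry only; all wire names are hypothetical.

```python
from collections import deque

def bfs_visits(start, goal, neighbours):
    """Breadth-first search; returns how many wires were visited up to and including goal."""
    seen = {start}
    queue = deque([start])
    visited = 0
    while queue:
        wire = queue.popleft()
        visited += 1
        if wire == goal:
            return visited
        for nxt in neighbours.get(wire, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return None

# Toy graph: one GND wire with a pip to a local constant wire in each of 1000 tiles.
downhill = {"GND": [f"T{i}/CONST" for i in range(1000)]}
for i in range(1000):
    downhill[f"T{i}/CONST"] = [f"T{i}/LUT_IN"]

# Reverse every edge so the search can start at the sink.
uphill = {}
for src, dsts in downhill.items():
    for dst in dsts:
        uphill.setdefault(dst, []).append(src)

forward = bfs_visits("GND", "T999/LUT_IN", downhill)
backward = bfs_visits("T999/LUT_IN", "GND", uphill)
print(forward, backward)
```

Starting from the sink works well here because each sink has exactly one way back to the constant source, which matches OmniMancer's remark that the routing is effectively local.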
<OmniMancer> alternately being able to make N gnd wires that are considered to have a constant 0 for each time such a pip is needed would work?
<daveshah> Well the router would need to be modified to understand that
<daveshah> And it would result in a massive database
<OmniMancer> indeed
<OmniMancer> is conditional backwards routing feasible to implement?
<daveshah> I have a partial implementation in router2 already
<daveshah> However, I still feel like a wire with millions of pips is a problem waiting to happen somewhere
rohitksingh has quit [Ping timeout: 245 seconds]
<kc8apf> _florent_: yes, that helps immensely
<zignig> daveshah: good work with nextpnr , how many platforms have partial coverage now ? 4+ ?
<daveshah> 5 if you count pepijndevos's work too
<daveshah> iCE40, ECP5, xc7, xcup and Gowin all have at least picorv32 working
<zignig> yeah gowin is going to be interesting , if you can get sub-$10 boards that have enough luts..
* zignig rubs hands together and Mwhahaaha. ;)
<zignig> daveshah: does most of your work boil down to whacky rectangle packing with wires ?
<daveshah> Not really
<zignig> I suppose my question is; how do you visualize the things that you build?
<daveshah> I don't really visualise stuff much, tbh
rohitksingh has joined ##openfpga
<OmniMancer> zignig: the anlogic board isn't much above 10 dollars and has probably enough luts?
<daveshah> The upduino is also at a similar price point, but doesn't include a programmer
<daveshah> The gowin price of $5 with onboard programmer is pretty good
<OmniMancer> the anlogic board has a programmer but it could do with someone writing a better open source firmware for it
<daveshah> I think the gowin situation is similar
<OmniMancer> Hmmm I am seeing some routes that the anlogic tool has produced that have me worried
<OmniMancer> it seems as if it is connecting an output to the middle of an interconnect wire :/
* pepijndevos is struggling massively to get stuff working on this $5 board you're all talking about
<pepijndevos> daveshah, xc7=Xilinx 7? xcup=???
<daveshah> xcup = ultrascale+
<pepijndevos> If you were a reasonable FPGA designer, imagine you're building different size versions of the thing...
freemint has quit [Read error: Connection reset by peer]
<pepijndevos> Okay, you just make a smaller thing, like so, right?
<pepijndevos> And then for small packages you're not going to use all the IOB, fine.
<pepijndevos> But wait... what if you want to cram a really tiny FPGA in a package with waaay too many pins?
<pepijndevos> Oh, I know, I'll make an IOB that has TEN pins....
<daveshah> lol
<whitequark> um, what
<mwk> whitequark: there's a special kind of IO tile in small gowin devices that has 10 pins crammed in it
<daveshah> I guess without any IOLOGIC?
<mwk> it has no usual serdes/flops
<mwk> only bare input/output/oe
<whitequark> gowin has serdes in io tiles?
<mwk> yes
<mwk> well, it has the thing that xilinx calls serdes
<mwk> ie. parallel/serial converters
<pepijndevos> If you look at the picture I posted, the red IOB are not used at all. So like... we put all these IOB with sweet differential pairs and what not, but we'll disable half of them and use MEGA IOB
<mwk> not the gigabit stuff
<mwk> *sigh* can't we all agree at least on terminology
<pepijndevos> hah no
<whitequark> right, i figured
<whitequark> i call those blocks "XDR"
<whitequark> and actual high speed stuff with CDR "SERDES"
<whitequark> but that's just me
<mwk> XDR?
<whitequark> you know, like DDR, but generalized for gearbox ratio more than 2
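[editor's note] A behavioural sketch of what such a gearbox does: `ratio` parallel bits per fabric clock become `ratio` serial bits on the pin, and back again. The LSB-first bit order is an assumption; actual primitives differ per vendor.

```python
def serialize(words, ratio):
    """Turn parallel words of `ratio` bits into a serial bit list, LSB first."""
    bits = []
    for word in words:
        for i in range(ratio):
            bits.append((word >> i) & 1)
    return bits

def deserialize(bits, ratio):
    """Group a serial bit list back into `ratio`-bit words, LSB first."""
    words = []
    for start in range(0, len(bits) - ratio + 1, ratio):
        word = 0
        for i, bit in enumerate(bits[start:start + ratio]):
            word |= bit << i
        words.append(word)
    return words

print(serialize([0b1010], 4))
```

With `ratio=2` this is ordinary DDR; higher ratios are what whitequark calls XDR.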
<mwk> huh
<mwk> and CDR?
<pepijndevos> aka eXtra data rate
<whitequark> mwk: clock and data recovery
<mwk> oh, that's a nice term
<whitequark> the part of a SERDES block that extracts the clock embedded in data
<OmniMancer> so things that do stuff like 8b10b?
<mwk> pepijndevos: any clue on how many of these red io tiles are actually attachment points for one-off blocks?
<whitequark> yes
<OmniMancer> or tmds
<daveshah> Just to confuse things, I think the (non-existent) ECP4 was going to have hard CDR blocks to combine with regular IO pins for SGMII
<pepijndevos> whoa, there are FPGAs with embedded hardware clock recovery?
<mwk> I don't think I've seen any FPGA with tmds support
<whitequark> pepijndevos: sure, ECP5
<OmniMancer> pepijndevos: for serial channels yes
<mwk> pepijndevos: uhhh, like all non-low-end ones
<whitequark> yeah
* pepijndevos goes off building radio receivers
<mwk> xilinx calls these "gigabit transceivers"
<mwk> they're meant for serial protocols like PCIe and SATA, and you cannot really use them for anything else
<daveshah> Usually with a significant minimum frequency, btw
<daveshah> ECP5 is something like 200Mbit/s min
<mwk> notably, TMDS is a shitshow on FPGAs
<OmniMancer> alas
<pepijndevos> mwk, I think on the bigger Gowins, the special attachment points are colored blue, but I might be wrong.
<mwk> pepijndevos: but the -1 part has some singleton hw as well, doesn't it?
<pepijndevos> eh, does it? Maybe the S or Z variants have some funky stuff onboard.
* mwk consults notes
<mwk> pepijndevos: well there's this Ufb thing, whatever it is
<mwk> it's stuffed in the corners (upper left and upper right) and in the PLL tile though
<mwk> so doesn't take up any IO tiles
<pepijndevos> ... user feedback?
<mwk> so uh... I got nothing
<mwk> NFI
<OmniMancer> daveshah: how do I work out why nextpnr-generic doesn't like my pip?
<daveshah> What do you mean, doesn't like?
ZipCPU|Laptop has quit [Ping timeout: 276 seconds]
<pepijndevos> OmniMancer, what I did is add code.interact(local=locals()) and poke at the ctx
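[editor's note] `code.interact` is part of the Python standard library and drops into a REPL with the given namespace. A small wrapper like the one below (the name `debug_shell` is made up) keeps the call in one place; the `interact` parameter exists only so the helper can be exercised without a terminal.

```python
import code

def debug_shell(namespace, banner="inspect ctx, Ctrl-D to continue", interact=code.interact):
    """Open an interactive console with the given names in scope."""
    # Copy the namespace so experiments in the shell do not leak back by accident.
    interact(banner=banner, local=dict(namespace))

# Typical use inside an arch-generation script, just before the failing call:
#     debug_shell(locals())
```

From the prompt you can then poke at `ctx` directly, as pepijndevos describes.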
<OmniMancer> daveshah: this kind of doesn't like: "IndexError: _Map_base::at"
<daveshah> Ah, you need to create the wires first
<daveshah> This could probably be a better error
<OmniMancer> AFAIK the wires exist
<pepijndevos> From my experience: no, they don't, quite, really, exist in the way I thought haha
<OmniMancer> Oh no I know why
<OmniMancer> I haven't prefixed the names with the tile location
<pepijndevos> hehehe oops
pie_ has quit [Ping timeout: 276 seconds]
<OmniMancer> so it's trying to connect ce0 to e1beg2 but that isn't what the wires are named
<daveshah> Aha
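[editor's note] The bug above boils down to per-tile names being used where globally unique names were expected. The sketch below uses a stand-in graph class, not the nextpnr-generic API, and the `X{x}Y{y}/name` scheme is an assumed convention. It shows one way to build prefixed names and to fail with a readable message when a pip refers to a wire that was never created.

```python
def wire_name(x, y, local_name):
    """Build a globally unique wire name from a tile location and a per-tile name."""
    return f"X{x}Y{y}/{local_name}"

class ToyGraph:
    """Stand-in for a routing graph; not the nextpnr API."""

    def __init__(self):
        self.wires = set()
        self.pips = []

    def add_wire(self, name):
        self.wires.add(name)

    def add_pip(self, src, dst):
        # Report which end is missing instead of a bare lookup error.
        for end in (src, dst):
            if end not in self.wires:
                raise KeyError(f"pip {src} -> {dst}: wire {end!r} does not exist")
        self.pips.append((src, dst))

graph = ToyGraph()
graph.add_wire(wire_name(1, 2, "ce0"))
graph.add_wire(wire_name(1, 2, "e1beg2"))
graph.add_pip(wire_name(1, 2, "ce0"), wire_name(1, 2, "e1beg2"))
```

Calling `graph.add_pip("ce0", "e1beg2")` without the prefix raises a `KeyError` naming the missing wire.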
<OmniMancer> Is it going to yell at me if it cannot place any IOBs btw?
<OmniMancer> So far it seems to have mostly just consumed all CPU time
rohitksingh has quit [Ping timeout: 246 seconds]
<OmniMancer> Ah I see it does
<OmniMancer> daveshah: are the IOBs inserted by nextpnr?
<daveshah> Yes, they should be
<OmniMancer> Can I get it to not do that?
<daveshah> OmniMancer: adds --no-iobs as an option
<OmniMancer> ah
<sorear> nextpnr does 7 and us+ but not us?
<daveshah> Well, the rapidwright flow would support it with minimal changes
<daveshah> It's just I have no hardware and no interest in making those changes
<daveshah> The whole flow is a proof of concept anyway
<sorear> oh, there’s no self-contained us+ or 7 yet?
<daveshah> There is self contained for xc7 using xray
<daveshah> But there is no public bitstream documentation for xcup yet
<daveshah> Some people at Manchester also did something with nextpnr and Ultrascale
<daveshah> +
<daveshah> But they didn't publish the bitstream docs in a useful format
<OmniMancer> daveshah: hmmm I am now getting another pip addition failure but I need to sleep
<pepijndevos> Oh hey, imagine you're an EDA software dev that needs to design a file format for FPGA bitgen. Cool, so you need a way to map some abstract features to bit locations, right?
<pepijndevos> Hmmm, I know, so for every tile I'll make tables that map from features to bits :))
<OmniMancer> daveshah: is there any reason for the map error if both wires do exist?
<OmniMancer> oh no I see it
<pepijndevos> But instead of storing the bit location in the table, I'll make another table, and pack the XY position in a single integer in decimal notation.
<OmniMancer> there is a ,
<pepijndevos> And haha, you thought those features would be consistent between FPGAs? Nope, I'll just make them different just because so there are 3 layers of meaningless mapping before reaching a bit location
<OmniMancer> pepijndevos: how wonderful :/
<pepijndevos> Note how the tile locations are exactly the same in GW1N-1 and GW1NR-9, but for some obscure reason most of them are off by one between them.
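[editor's note] The log does not say how the decimal packing is laid out, so the following is purely a hypothetical illustration of the general trick: one coordinate in the low decimal digits, the other in the digits above. The base of 100 and the (row, col) order are assumptions, not the vendor format.

```python
def unpack_xy(packed, base=100):
    """Split a decimal-packed position into (row, col). The base is an assumption."""
    return divmod(packed, base)

def pack_xy(row, col, base=100):
    """Inverse of unpack_xy under the same assumed layout."""
    if not 0 <= col < base:
        raise ValueError("column does not fit in the low digits")
    return row * base + col

print(unpack_xy(1203))
```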
genii has joined ##openfpga
Asu has joined ##openfpga
<OmniMancer> nextpnr sure likes eating memory
<daveshah> That is unfortunately due to how the generic arch works atm
<daveshah> Each pip needing a name doesn't scale very well
<OmniMancer> nearly 2 GB
<OmniMancer> and it isn't done with the interconnect
<daveshah> Yeah, all I can suggest is shorter pip names
<daveshah> (for comparison, a proper arch like iCE40 or ECP5 needs about 128MB minimum RAM)
<daveshah> Probably the long term solution is to make generic pip/wire identifiers a (location, name idstring) tuple
<daveshah> this way where a name is used in multiple tiles the name string is only stored once; then the identifier is a total of 12 bytes
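[editor's note] A sketch of the (location, name idstring) idea. The 12-byte figure is reproduced here by assuming three 32-bit integers (x, y, index into a name table); that field layout and all the names below are guesses for illustration, not nextpnr's actual design.

```python
import struct

class IdStringPool:
    """Store each distinct name once and hand out small integer indices."""

    def __init__(self):
        self._index = {}
        self._names = []

    def intern(self, name):
        if name not in self._index:
            self._index[name] = len(self._names)
            self._names.append(name)
        return self._index[name]

    def lookup(self, index):
        return self._names[index]

    def __len__(self):
        return len(self._names)

# Assumed layout: three signed 32-bit integers (x, y, name index) = 12 bytes.
WIRE_ID = struct.Struct("<iii")

def make_wire_id(pool, x, y, name):
    return WIRE_ID.pack(x, y, pool.intern(name))

def describe(pool, wire_id):
    x, y, idx = WIRE_ID.unpack(wire_id)
    return f"X{x}Y{y}/{pool.lookup(idx)}"

pool = IdStringPool()
a = make_wire_id(pool, 1, 2, "e1beg2")
b = make_wire_id(pool, 3, 4, "e1beg2")
print(len(a), len(pool), describe(pool, b))
```

Two wires in different tiles share one stored copy of the string `e1beg2`, which is the saving daveshah is after.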
<OmniMancer> yes having the location be part of the id would help with that immensely
<OmniMancer> I got a place and route successful, but hadn't changed one of the later scripts so it errored then
<OmniMancer> Behold, blinky
<OmniMancer> now to sleep
<daveshah> Nice!
OmniMancer has quit [Quit: Leaving.]
<pepijndevos> Wohooo
<pepijndevos> congrats
<pepijndevos> oh omni left already
m4ssi has quit [Remote host closed the connection]
emeb has joined ##openfpga
pie_ has joined ##openfpga
pie_ has quit [Remote host closed the connection]
pie_ has joined ##openfpga
kernlbob has quit [Ping timeout: 250 seconds]
craigjb has quit [Ping timeout: 245 seconds]
craigjb has joined ##openfpga
brizzo has quit [Ping timeout: 250 seconds]
grantsmith has quit [Ping timeout: 250 seconds]
brizzo has joined ##openfpga
grantsmith has joined ##openfpga
pie_ has quit [Ping timeout: 246 seconds]
pie_ has joined ##openfpga
<azonenberg> daveshah: that isnt the case already?
<azonenberg> it would not surprise me to know that one or more of the reasons the xilinx chip databases are so massive is that they have some equally inefficient duplication of resources :p
<mwk> you'd be correct
<mwk> I don't know how Vivado stores its database, but ISE is... not great at deduplicating
<daveshah> It's a shame given how repetitive Xilinx arches are really
pie_ has quit [Ping timeout: 250 seconds]
Dolu has quit [Ping timeout: 240 seconds]
<azonenberg> daveshah: i know
<azonenberg> like, thinking about how i'd build the architecture files for a typical part
<azonenberg> i honestly think i could get the arch spec for at least the core fabric of a smaller xilinx part to fit 100% in L3 cache
<daveshah> Yeah, it's definitely possible
<azonenberg> And adding new devices would be a few kB each
<azonenberg> basically just describing the size of a clock region, number of clock regions and their arrangement
<daveshah> I haven't personally tried something like this yet
<azonenberg> and the set of columns
<daveshah> What makes it slightly tricky is the connections between different kinds of tiles
<azonenberg> still, couldnt you just create one copy of the config for "clb adjacent to dsp" or similar
<azonenberg> and just have pointers to that?
<azonenberg> i'd assume based on my experience with coolrunner that you have a few basic tile types and then just flip/mirror variants of them
<daveshah> Oh yeah, definitely
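[editor's note] azonenberg's "one copy plus pointers" suggestion is essentially the flyweight pattern. In the sketch below each tile type stores its pips once with tile-local wire names, and the device grid only holds references. Tile names, wire names and grid size are invented for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TileType:
    name: str
    pips: tuple  # (src_local_wire, dst_local_wire) pairs shared by every instance

class Device:
    def __init__(self, width, height):
        self.width = width
        self.height = height
        self.grid = [[None] * width for _ in range(height)]

    def place(self, x, y, tile_type):
        # Store a reference to the shared type, not a copy of its pips.
        self.grid[y][x] = tile_type

    def pips_at(self, x, y):
        """Expand the shared pip list into absolute names only when asked."""
        tile = self.grid[y][x]
        return [(f"X{x}Y{y}/{s}", f"X{x}Y{y}/{d}") for s, d in tile.pips]

clb = TileType("CLB", (("A", "B"),))
clb_next_to_dsp = TileType("CLB_NEXT_TO_DSP", (("A", "B"), ("B", "DSP_IN")))

dev = Device(4, 4)
for y in range(4):
    for x in range(4):
        dev.place(x, y, clb_next_to_dsp if x == 3 else clb)

print(dev.pips_at(2, 3))
```

The connections between different kinds of tiles that daveshah calls tricky are not modelled here; they would need some notion of relative wire offsets.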
<daveshah> It's mostly a case of making sure that a doubly linked routing graph isn't too slow to traverse
<azonenberg> if it fits in cache, i'd suspect that the benefits from lower latency more than outweigh the extra fetches
<daveshah> Yes, almost certainly
<daveshah> It's mostly avoiding any actual n^2 type situations where data isn't available without a full search
pie_ has joined ##openfpga
rohitksingh has joined ##openfpga
Bob_Dole has quit [Ping timeout: 250 seconds]
Dolu has joined ##openfpga
<tpw_rules> so i've been thinking about a boneless register allocator too but i don't really understand graph coloring allocation
<tpw_rules> like assigning registers without control flow is trivial. but idk how to really do it with it. i was reading briggs' thesis and it dismisses some ideas which seem to work, maybe for performance reasons?
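[editor's note] For context, a compact sketch of graph-coloring allocation in the Chaitin/Briggs style: control flow only enters through the liveness analysis that produces the interference graph, after which allocation is a simplify-then-select pass over that graph. This is a simplified illustration with optimistic pushing but without spill code, coalescing or spill-cost heuristics. It is not copied from Briggs' thesis and not tied to Boneless.

```python
def color_registers(interference, k):
    """Assign one of k registers to each variable, or None if it must spill.

    interference maps each variable to the set of variables live at the same
    time; the relation is expected to be symmetric.
    """
    graph = {v: set(ns) for v, ns in interference.items()}
    stack = []
    while graph:
        # Simplify: a node with fewer than k neighbours can always be colored later.
        easy = [v for v in graph if len(graph[v]) < k]
        if easy:
            node = min(easy)
        else:
            # No easy node: push the highest-degree one optimistically.
            node = max(graph, key=lambda v: (len(graph[v]), v))
        stack.append(node)
        for n in graph.pop(node):
            graph[n].discard(node)

    assignment = {}
    while stack:
        # Select: pop in reverse order and take the lowest free register.
        node = stack.pop()
        used = {assignment[n] for n in interference[node]
                if assignment.get(n) is not None}
        free = [r for r in range(k) if r not in used]
        assignment[node] = free[0] if free else None
    return assignment

# a, b and c are all live together; d only overlaps with a.
example = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
print(color_registers(example, 3))
print(color_registers(example, 2))
```

With three registers everything fits; with two, one member of the a/b/c triangle comes back as `None`, meaning it would have to be spilled.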
nrossi has quit [Quit: Connection closed for inactivity]
AndrevS has quit [Quit: umount /dev/irc]
Bob_Dole has joined ##openfpga
mumptai has joined ##openfpga
ZipCPU|Laptop has joined ##openfpga
ZipCPU|Laptop has quit [Remote host closed the connection]
Zorix has quit [Quit: Leaving]
Zorix has joined ##openfpga
Asu has quit [Remote host closed the connection]
genii has quit [Quit: Welcome home, Mitch]