Bob_Dole has joined ##openfpga
_whitelogger has joined ##openfpga
<gruetzkopf> i kinda want to build "The DMA way"
freemint has quit [Remote host closed the connection]
freemint has joined ##openfpga
_whitelogger has joined ##openfpga
Thorn has joined ##openfpga
<tpw_rules> ow i'm hurt by that rust pun
<tpw_rules> oh i thought it was on there already
<tpw_rules> anyway, it's a perfect application for my fpga HPS board
freemint has quit [Remote host closed the connection]
freemint has joined ##openfpga
<bluezinc> azonenberg: I believe you are mistaken. I'd consider it much more likely to be either an XC7V585 or an XC7VX485 part in an 1157 package (20 GTX, 1 for USBTX, 1 for USBRX, 18 LA channels).
<bluezinc> I don't see lecroy using a 901 package, because the ffg901 has no HP banks.
freemint has quit [Ping timeout: 260 seconds]
genii has quit [Quit: Morning comes early.... GO LEAFS GO!]
ZombieChicken has quit [Remote host closed the connection]
_whitelogger has joined ##openfpga
dh73 has quit [Quit: Leaving.]
X-Scale` has joined ##openfpga
X-Scale has quit [Ping timeout: 272 seconds]
X-Scale` is now known as X-Scale
Bike has quit [Quit: leaving]
X-Scale` has joined ##openfpga
X-Scale has quit [Ping timeout: 258 seconds]
X-Scale` is now known as X-Scale
<azonenberg> bluezinc: Hmm
<azonenberg> bluezinc: but then how do they phase the channels to each other?
<azonenberg> you can have all of the GTXes at a common multiple of the refclk, but the phase of the sampling clock vs the refclk could vary by up to an entire refclk cycle in discrete steps
<azonenberg> also usb tx/rx are one gtx, each gtx contains a tx and rx
azonenberg_work has joined ##openfpga
____ has joined ##openfpga
____ has quit [Quit: Nettalk6 -]
____ has joined ##openfpga
<sensille> anyone know of an extended sd card that you put in one device, and has an interface so you have read-only access to it on another?
<sensille> would be a nice small fpga project
<whitequark> the sd card protocol is neither small nor nice
<sensille> probably 10 different protocols due to the long history
<sensille> at least the hardware would have to be small
<whitequark> oh, you can use an sd card extender
<azonenberg> you'd also be trying to read a filesystem that is halfway through being manipulated by another OS
<azonenberg> and might be inconsistent wrt cache flushing etc
<azonenberg> have fun
<azonenberg> this is why android devices are moving to MTP
<whitequark> um
<whitequark> android devices haven't shared fat32 volumes for like five major releases
<azonenberg> because usb mass storage has the same problems, you had to unmount the sd card from the phone OS in order to mount it on the host
<azonenberg> ok, moved. Point remains
<whitequark> strengthens it, really
<sensille> sure, i'm aware of that
<sensille> the question is, would it be useful to anyone but me?
<whitequark> there's been people discussing this before, even in this channel i think
<whitequark> and there are even prototype tools that do this for embedded dev
<whitequark> might have seen commercial ones?
<whitequark> bottom line: yes
<whitequark> although i think you can go pretty far with just a jumper and a bunch of muxes
<whitequark> which conveniently avoids having to read the cursed shit that counts for sd card protocol
<sensille> hm. that might indeed be enough
<azonenberg_work> whitequark: so speaking of cursed things
<azonenberg_work> did i mention what i discovered about vivado the other day?
<azonenberg_work> It *aggressively* caches synthesized IP netlists. And doesn't seem to have great cache invalidation
<azonenberg_work> So it's possible to create a tree that compiles fine, and produces a working bitstream even when you reset the build and resynth/par your rtl
<azonenberg_work> you add that state to git, go find some regressions, go back to a clean tree after a hard reset to an older revision
<azonenberg_work> and that commit hash no longer compiles
<azonenberg_work> actually no wait
<azonenberg_work> it compiled fine
<azonenberg_work> then i flushed the cache and it stopped compiling
<azonenberg_work> because one of the IP source files wasn't in git, but vivado didn't actually check that the inputs were present and untouched before blindly using the cached netlist for P&R
<whitequark> ouch
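For illustration, the check that was apparently missing (inputs present and untouched before a cached result is reused) can be sketched as below. This is a generic content-hash cache check, not Vivado's actual mechanism; `input_digest` and `cache_is_valid` are hypothetical names.

```python
import hashlib
from pathlib import Path

def input_digest(paths):
    """Hash the names and contents of all inputs; fail if any is missing."""
    h = hashlib.sha256()
    for p in sorted(Path(p) for p in paths):
        if not p.is_file():
            raise FileNotFoundError(f"cache input missing: {p}")
        h.update(str(p).encode())
        h.update(b"\0")          # separator between file name and contents
        h.update(p.read_bytes())
    return h.hexdigest()

def cache_is_valid(paths, recorded_digest):
    """A cached netlist is reusable only if every input still exists and matches."""
    try:
        return input_digest(paths) == recorded_digest
    except FileNotFoundError:
        return False
```

With a check like this, an IP source file that never made it into git would invalidate the cache instead of being silently skipped.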
<azonenberg> one of many reasons i avoid xilinx IP as much as i can
<azonenberg> all i use is the ILA for my personal projects, and only for debugging
<azonenberg> but this is a $sidegigclient project on a Zynq and there is no sane way to use a zynq without the IP integrator
<azonenberg> why they can't just make the axi interconnect core a parameterized module that you instantiate and set a few synth parameters on is beyond me
<azonenberg> why do all this code generation when a generate loop will do fine?
<azonenberg> (if you've ever considered using a zynq in a project, don't :P)
<edbordin> oh hey, it finished fuzzing :D
s_frit has quit [Remote host closed the connection]
s_frit has joined ##openfpga
<edbordin> We use a National Instruments product with a Zynq in it and they only expose the fpga via some LabView pay-to-play proprietary dumpster fire. This makes me feel better about that fact.
s_frit has quit [Remote host closed the connection]
s_frit has joined ##openfpga
<swetland> azonenberg: what I ended up doing was writing a script to generate a wrapper exposing the various axi interfaces I needed, etc
<swetland> which works if you're okay with writing all your own axi glue, etc
<TD-Linux> sensille, another option is the toshiba flashair or transcend wifi cards
<sensille> and a micro SD adapter
OmniMancer has joined ##openfpga
Asu has joined ##openfpga
rohitksingh has joined ##openfpga
Xiretza has joined ##openfpga
<OmniMancer> daveshah: 16 global clocks is too many?
<daveshah> This was only 8, but the Xilinx clock rules are very complicated and nextpnr doesn't handle them all. And I guessed 8 clocks for 300 logic cells probably means something is wrong
<daveshah> I think the problem wasn't actually clocking but because some of these fanned out to logic too
<daveshah> I think actually Yosys shouldn't have been promoting things that drove one FF clock and one LUT input to a global at all, but that's another issue
<whitequark> i feel like clock promotion should be nextpnr's job
<whitequark> but i'm ignorant of the problems involved in that, so i might be just wrong
<daveshah> I think Yosys does it because ISE needs it
<whitequark> what about ice40?
<whitequark> or <insert any other arch>
<daveshah> that's done by nextpnr
<daveshah> ditto ecp5
<daveshah> it's just synth_xilinx that does clock promotion
<daveshah> which hitherto has been more focused on synthesis for vendor pnr than open pnr
<whitequark> oh, sorry, brain not fully on yet. i was thinking of using DFFCEs on ice40
<omnitechnomancer> How many global clocks does ECP5 have?
<daveshah> ah
<mwk> whitequark: yeah, it was done for ISE
<daveshah> omnitechnomancer: 16 per quadrant
<whitequark> which... actually should be done in nextpnr too ideally, but that's even harder
<omnitechnomancer> ah cool, similar to Anlogic in another way :P
<daveshah> I have been wondering about doing more physical optimisations in nextpnr at some point
<daveshah> The start point would be retiming, but timing-based LUT repacking and DFF control set optimisation could be interesting to look at too
<whitequark> dffmince is such a crude hack, and it hits pretty badly because of how slow ice40 routing is (I think because of that?)
azonenberg_work has quit [Ping timeout: 260 seconds]
<omnitechnomancer> daveshah: what can the global nets drive besides clocks?
<daveshah> 8 of them can drive resets, 8 CE and 8 logic IIRC
<omnitechnomancer> ah yep, sounds familiar
<omnitechnomancer> I think the eagle might be the same in what can be driven from them...
<daveshah> Hah
<omnitechnomancer> I am not sure if they can directly drive locals
<omnitechnomancer> but you can connect the two clocks into the routing fabric
<omnitechnomancer> which is a bit weird
azonenberg_work has joined ##openfpga
<omnitechnomancer> Actually I wasn't fuzzing any gclk to local connections since I had not thought they could occur
<omnitechnomancer> so let's see
<omnitechnomancer> daveshah: how many actual clock inputs per tile does ecp5 have? can each slice have an independent clock?
<daveshah> Nope, two clock inputs per tile
<ZirconiumX> I was actually curious about the number of global clock connections for CV since I didn't know: 16
<ZirconiumX> Though it seems to be split up into a clock hierarchy
<omnitechnomancer> daveshah: and the clock inputs drive slices A,B and C,D?
<daveshah> No, each slice can be driven from either clock
<omnitechnomancer> ah okay interesting
<daveshah> CrossLink-NX works how you describe though
<omnitechnomancer> as far as I can see Eagle is just one clock for 0 and 1 (mslices) and one clock for 2 and 3 (lslices) no other choice in the matter
<omnitechnomancer> they have independent invert bits though
<daveshah> So CrossLink NX is interesting - it doesn't have invert bits, but rather one bit for rising edge clock and another for falling edge clock
<daveshah> and there is indeed an undocumented DDR flop mode that sets both bits
rohitksingh has quit [Ping timeout: 260 seconds]
Stary has quit [Ping timeout: 260 seconds]
<omnitechnomancer> interesting
freemint has joined ##openfpga
azonenberg_work has quit [Ping timeout: 260 seconds]
Asu has quit [Remote host closed the connection]
edbordin has quit [Remote host closed the connection]
genii has joined ##openfpga
OmniMancer has quit [Quit: Leaving.]
Asu has joined ##openfpga
____ has quit [Quit: Nettalk6 -]
<Marex> daveshah: mithro: So I started looking at 5M40Z again, did the project chibi ever get anywhere ?
<Marex> seems like that is the most suitable altera part to start with in the end
bgamari_ has joined ##openfpga
bgamari_ has quit [Ping timeout: 260 seconds]
freemint has quit [Ping timeout: 245 seconds]
bgamari_ has joined ##openfpga
dh73 has joined ##openfpga
mumptai has joined ##openfpga
<Marex> hmmm, so the POF for 5M40Z is 7859 bytes , wow
<mithro> Marex: you would have to ask rqou -- I think he got some stuff done but never got around to making it usable
<Marex> mithro: I saw the chunks of python
<mithro> Marex: generally the actual work to get a proof of concept is only a very small part towards making a usable toolchain others can use -- documentation, building community, etc is like 90% of the work
<Marex> mithro: point being, 8 kiB bitstream is easier to analyze
<Marex> and it's literally 24 copies of the same
<Marex> some of which is not used, so even better
<mithro> Marex: well that is good potential first target then I guess?
<Marex> mithro: I was being dumb for attempting C-IV back then
<mithro> I haven't seen rqou around at all lately, but it can't hurt to email him
<Marex> mithro: so this 5M40Z is I believe the same die as 5M80Z, 5M160Z and 5M240Z , each having different amount of LEs (you can guess from the number between 5M and Z how many)
<Marex> since one LAB has 10 LEs, the smallest part has 4 active LABs , the next one 8 active LABs etc
<Marex> so I would guess, that 4 out of 24 LABs are active on the smallest die and the rest is just ... nothing
<Marex> I would expect quartus sets most of the bitstream to 1 or some idle interconnect
<Marex> just thinking out loud
<Marex> it almost seems like the 5M40Z is software-limited to 4 LABs, the quartus PnR places LEs in four LABs, but in random locations in the chip
<Marex> uh
_whitelogger has joined ##openfpga
<ZirconiumX> Marex: still around?
<Marex> ZirconiumX: yes ?
<ZirconiumX> I'm working on the Cyclone V, so we can probably share a bunch of tips
<Marex> ZirconiumX: we discussed in the past, I was looking into C/IV before rqou was even around :)
<Marex> ZirconiumX: I didn't have time to finish that, but I have some notes still
<ZirconiumX> Ah, fair
<Marex> ZirconiumX: and yes, I know you work on CV
<ZirconiumX> I got the ALM bits in a LAB figured out at least
<ZirconiumX> ~~that's my one achievement~~
<Marex> ZirconiumX: also note that I have purely software background and FPGA is a hobby, so I might be completely wrong in the terminology department
<ZirconiumX> Likewise, I'm no electrical engineer
<Marex> well, wow :)
<ZirconiumX> I don't have the fuzzing process fully automated though, even though I should
<Marex> ZirconiumX: I think the C-IV and older are much simpler, no ? They only have LUT4 per LE
<ZirconiumX> Indeed, but I have no EP4C chips to test with :P
<Marex> ZirconiumX: I don't have any 5M40Z either
<ZirconiumX> What I *do* have is a semi-functioning Yosys synthesis frontend
<Marex> but I have to wonder how I can fuzz this one automatically, back then I was doing some arcane diffing of the bitstream dumps, but I'm not sure that's the best approach today
<Marex> also, is there a tool to map the bits, like what the x-ray had ?
<ZirconiumX> My friend wrote a bitstream diffing tool called horrortable
<ZirconiumX> It's how I got the LUT bits
<ZirconiumX> It could probably be hacked to fit a LUT4 though
<Marex> considering that the 5M40Z bitstream is 8 kiB, I could probably write my own
<ZirconiumX> True, I suppose.
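A minimal sketch of what a home-grown diffing tool for a file this small could look like. It is not horrortable; `diff_bits` is a hypothetical helper, and numbering bits LSB-first within each byte is an assumption.

```python
def diff_bits(a: bytes, b: bytes):
    """Return (byte_offset, bit_index) for every bit that differs between two dumps."""
    if len(a) != len(b):
        raise ValueError("dumps must be the same length")
    changed = []
    for offset, (x, y) in enumerate(zip(a, b)):
        delta = x ^ y                    # 1 bits mark positions that changed
        for bit in range(8):
            if (delta >> bit) & 1:
                changed.append((offset, bit))
    return changed

# Synthetic example data, not taken from a real bitstream.
base = bytes([0xFF, 0xFF, 0xFF, 0xFF])
variant = bytes([0xFF, 0xFE, 0xFF, 0x7F])
print(diff_bits(base, variant))  # [(1, 0), (3, 7)]
```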
<Marex> but there should be some way to coerce quartus into tweaking LUT content from command line, right ? quartus_cdb can I think dump such information ...
<Marex> but there was something, I think one could've frobbed with the .qsf file to achieve that
<Marex> that's what I did with the C/IV
<Marex> hm, maybe I can generate the qsf altogether for this small part
<ZirconiumX> Marex: Try quartus_cdb --vqm
<ZirconiumX> That gives you a Verilog netlist for it
<Marex> oh
<ZirconiumX> That being said, you would want as little extra noise for it as possible
<Marex> ZirconiumX: which you can do, by frobbing with the chip planner , which tells you how to "fixate" LUTs to specific cells , and then generate a QSF (or QPF?) with those statements , just slightly adjusted
<Marex> I think there's even a statement which allows you to set specific LUT to specific mask
<ZirconiumX> Marex: If you use VQM_FILE, you can instantiate cells directly
<Marex> ZirconiumX: well that I didn't know :-)
<ZirconiumX> I don't know the name of the Max V cell, but I'd guess it's something like maxv_lcell_comb
<Marex> ZirconiumX: maxv_lcell
<ZirconiumX> That works too
<Marex> ZirconiumX: I will take a look ; so that allows me to basically plant a cell at the specific position in the bitstream and then synthesise the result ?
<Marex> or rather, specific location in the floorplan...
<ZirconiumX> Marex: to plant it at a specific point you'll need to use set_location_assignment in the .qsf
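Generating the .qsf for a fuzzing run could be sketched roughly as follows. `make_qsf`, the node names and the file names are hypothetical, and the `LC_X<x>_Y<y>_N<n>` location format is an assumption that should be checked against what the chip planner reports for the part.

```python
def make_qsf(top, vqm_path, placements):
    """Build .qsf text that loads a VQM netlist and pins cells to locations.

    placements maps a netlist node name to an (x, y, n) tuple.
    """
    lines = [
        f"set_global_assignment -name TOP_LEVEL_ENTITY {top}",
        f"set_global_assignment -name VQM_FILE {vqm_path}",
    ]
    for node, (x, y, n) in placements.items():
        # Assumed location naming; adjust to whatever Quartus uses for the device.
        lines.append(f"set_location_assignment LC_X{x}_Y{y}_N{n} -to {node}")
    return "\n".join(lines) + "\n"

qsf = make_qsf("fuzz_top", "fuzz_top.vqm", {"lut0": (1, 1, 0), "lut1": (1, 1, 1)})
print(qsf)
```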
<ZirconiumX> By the way, a tip I learned from mwk was that there's a more efficient way of extracting LUT bits
<ZirconiumX> Compared to e.g. one-hot
<mwk> meow?
* ZirconiumX pets mwk
<mwk> oh, we're talking fuzzers
* mwk purrs
<Marex> ZirconiumX: do tell ?
<Marex> ZirconiumX: btw is the cell on C/V also 35bits x 8bits large ?
<Marex> ZirconiumX: that's what it was on C/IV
<tnt> daveshah: btw, I can confirm that the special IO works fine with new treillis and nextpnr (tested on actual hw).
<ZirconiumX> So, a LUT4 has 2^4 = 16 combinations.
* mwk should really write that algorithm up in some document some time
<ZirconiumX> Let's take an all-zero LUT as a baseline control
<mwk> maybe even with some of the hacks that I layered on top of it
<ZirconiumX> Then, imagine if you arranged the bits for (0+1) to (16+1) in columns, so it looks like this
<ZirconiumX> *(0+1) to (15+1)
<ZirconiumX> If you now produce 5 bitstreams, each with a LUT mask of one of the digit columns, you can then permute it to read the LUT bits off horizontally
<ZirconiumX> And you've done it in 6 bitstreams instead of 17 for one-hot
<ZirconiumX> This scales even better the more LUT bits you try to do at once
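The trick described above can be sketched as a toy simulation. `make_fake_toolchain` stands in for the vendor tool and the 256-bit "bitstream" is made up. The +1 keeps every LUT bit's code non-zero, so a LUT bit is never confused with a bitstream bit that does not change at all.

```python
import random

LUT_BITS = 16          # a LUT4 has 2**4 = 16 init bits
CODE_WIDTH = 5         # enough binary digits to encode the values 1..16
BITSTREAM_LEN = 256    # hypothetical toy bitstream size

def make_fake_toolchain(seed=0):
    """Stand-in for the vendor tool: maps a 16-bit LUT mask to a toy bitstream."""
    rng = random.Random(seed)
    positions = rng.sample(range(BITSTREAM_LEN), LUT_BITS)  # hidden mapping
    def compile_lut(mask):
        bits = [0] * BITSTREAM_LEN
        for lut_bit, pos in enumerate(positions):
            bits[pos] = (mask >> lut_bit) & 1
        return bits
    return compile_lut, positions

def column_masks():
    """Mask k sets LUT bit i iff binary digit k of (i + 1) is 1."""
    return [sum(1 << i for i in range(LUT_BITS) if ((i + 1) >> k) & 1)
            for k in range(CODE_WIDTH)]

def recover_positions(compile_lut):
    """Find each LUT bit's bitstream position from a baseline plus 5 runs."""
    baseline = compile_lut(0)
    runs = [compile_lut(m) for m in column_masks()]
    found = {}
    for pos in range(BITSTREAM_LEN):
        code = 0
        for k, run in enumerate(runs):
            if run[pos] != baseline[pos]:
                code |= 1 << k       # read the code "horizontally" across runs
        if code:
            found[code - 1] = pos    # undo the +1 to get the LUT bit index
    return [found[i] for i in range(LUT_BITS)]

compile_lut, hidden = make_fake_toolchain()
recovered = recover_positions(compile_lut)
print(recovered == hidden)
```

That is 6 compiles in total (baseline plus one per digit column) against 17 for one-hot.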
<Marex> ZirconiumX: makes sense, although I figured out how to calculate the locations of the LUTs in the C/IV bitstream and then I did the whole thing in a couple of runs
<Marex> ZirconiumX: the MAX V is even easier, there's 40 LEs in total here :-)
<Marex> and the compilation run, well ... it's seconds
<ZirconiumX> Again, I had a tool to do it for me :P
<Marex> ZirconiumX: this must be automated
<mwk> also, with a good enough batching tool, you can literally RE half the chip in one batch
<ZirconiumX> <Marex> and the compilation run, well ... it's seconds <--- Quartus has never been that fast for me
<Marex> ZirconiumX: on C/V, it was slow for me too
<ZirconiumX> It's like two minutes a run for even a single LUT at a time
<mwk> because if you think about it, you can apply this idea not just to lut bits, but to any sort of a binary feature
<Marex> jupp, for CV, it's painfully, aggravatingly slow
<Marex> although I only ever used CV with SoC
* ZirconiumX should hire mwk to RE the Cyclone V at this rate
<ZirconiumX> I am terrible at this >.>
<Marex> ZirconiumX: how big is the LE cell on C/V ?
<Marex> ZirconiumX: on C/IV it looks like 35bits x 8bits , from my notes
<ZirconiumX> I don't have definite dimensions as such, but I know the approximate range of the LUT bits
<Marex> ZirconiumX: there's interconnect somewhere in there too
<Marex> more like , the output of a LE has some mux to attach to the interconnect row and column and LE-local , these things
<Marex> it's usually on the sides of the LE
<ZirconiumX> I know :P
<ZirconiumX> 83393 for an ALM is way bigger than that :P
<Marex> ZirconiumX: so see that lab.txt , for me the LE looked like there were two, back-to-back
<daveshah> tnt: good to know, thanks for testing
<tnt> daveshah: does that pin behave like the other ? (like, does it have io registers and stuff like that ?)
<daveshah> Yes, it does
<daveshah> The main reason it is odd is that it is the only pin on the chip that doesn't have a 'B' side
Jybz has joined ##openfpga
ym has quit [Quit: Leaving]
<Marex> ZirconiumX: btw, if I hexdump -vC the .pof file, there's an interesting pattern, a column of ...
<Marex> ff fe ff ff ff fd ff ff .... the 0xfe and 0xfd are always in the same column
<Marex> might be some hint
<Marex> maybe this one is simpler somehow
<Marex> oh wait ...
<Marex> it's like 0xfe ff ff ff | 0xfd ff ff ff ... repeat
<ZirconiumX> Maybe a syncword?
<ZirconiumX> Actually
<Marex> nope
<ZirconiumX> Marex: You want an RBF, not a .POF
Stary has joined ##openfpga
<ZirconiumX> .POF is not an actual bitstream
<Marex> ZirconiumX: POF is the programmer object file, no?
<ZirconiumX> (Programming Object File)
<ZirconiumX> Yes, but RBF is *the bitstream itself*
<Marex> I am not sure if there is RBF for CPLD
<Marex> maybe cpf can help ?
<ZirconiumX> Well, try running quartus_cpf on it and see
<ZirconiumX> Yeah
<Marex> ZirconiumX: nope, there's no SOF
<ZirconiumX> I said RBF :P
<Marex> ZirconiumX: you need SOF to generate RBF , no ?
<ZirconiumX> No
<Marex> ZirconiumX: so what is your input to the CPF then ?
Jybz has quit [Quit: Konversation terminated!]
<Marex> "POF file is a header plus raw binary data. I think there is no such a document. Converting pof to rbf, you can know where is raw binary data area."
<ZirconiumX> You can pass `set_global_assignment -name GENERATE_RBF_FILE ON` in the .qsf.
<Marex> ZirconiumX: keep in mind, the MAX V is not SRAM-based
<ZirconiumX> Marex: And? The point of a *raw bitstream file* is that it has no metadata to bother with.
<Marex> ZirconiumX: well I know what RBF is, except apparently quartus refuses to chuck one out for non-SRAM-based devices
<ZirconiumX> Mmm. That's not great.
<Marex> ZirconiumX: but see above, POF is likely like RBF
<Marex> every 896 bytes, there's something, looks like CRC
<Marex> RBF also had that IIRC
<Marex> so POF is probably a header and payload
<Marex> and the project chibi documents a bit of it
<Marex> ZirconiumX: jeeze, it almost looks like rqou figured the maxV out
<Marex> ZirconiumX: there's even packer, which might be the bitstream assembler if I understand it correctly
<Marex> although, I am probably wrong, there seems to be some stuff missing
<Marex> I should study that a bit more
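One way to check for the kind of repeating structure noticed in the hexdump is to count where the non-0xFF bytes fall modulo a guessed period. This is only a sketch on synthetic data; `non_ff_columns` is a hypothetical helper, and the real period (4, 16, 896 or something else) would have to be found by trying values.

```python
from collections import Counter

def non_ff_columns(data: bytes, period: int):
    """Count bytes that are not 0xFF, grouped by offset modulo `period`."""
    return Counter(i % period for i, b in enumerate(data) if b != 0xFF)

# Synthetic data, not a real POF: mostly 0xFF with a marker every 8 bytes.
sample = bytearray([0xFF] * 32)
for i, marker in zip(range(1, 32, 8), [0xFE, 0xFD, 0xFE, 0xFD]):
    sample[i] = marker
print(non_ff_columns(bytes(sample), 8))
```

If the guessed period is right, the counts pile up in a few columns; a wrong guess smears them across many.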
azonenberg_work has joined ##openfpga
ym has joined ##openfpga
azonenberg_work has quit [Ping timeout: 245 seconds]
Sellerie has quit [Quit: Ping timeout (120 seconds)]
Sellerie has joined ##openfpga
pie_ has joined ##openfpga
pie_[bnc] has quit [Quit: No Ping reply in 180 seconds.]
pie_[bnc] has joined ##openfpga
pie_[bnc] has joined ##openfpga
pie_ has quit [Quit: pie_]
miek has quit [Ping timeout: 252 seconds]
<pie_[bnc]> have you guys seen this
<pie_[bnc]> wrong link
<pie_[bnc]> azonenberg: should send this guy your checklist :P
Asu has quit [Quit: Konversation terminated!]
miek has joined ##openfpga
<mumptai> hmm, the author took phased array literally, and has no control over the amplitude at all. no surprise that the side lobes look ugly
<pie_[bnc]> mumptai: im guessing thats the beginner version
<pie_[bnc]> this is like, how does babby phased array, I think
<pie_[bnc]> @ the talk
Bike has joined ##openfpga
mumptai has quit [Remote host closed the connection]