<azonenberg> also lol trce has been going for an hour and still running
<rqou> azonenberg: the T flipflops seem to "just work" as expected
<azonenberg_work> Lol
<azonenberg_work> So output of the xor is the toggle bit? 1 = toggle?
<azonenberg_work> also, working on reworkctf atm
<azonenberg_work> waiting for the latest emulator to finish building
<azonenberg_work> its slooow
_whitelogger has joined ##openfpga
<azonenberg> wtf timing is *still* running
<azonenberg> after two hours
* azonenberg kills it
Zarutian has joined ##openfpga
Zarutian has quit [Read error: Connection reset by peer]
Zarutian has joined ##openfpga
<azonenberg> oook i am pretty confused
<azonenberg> i am feeding a 1 Hz squarewave into a pin
<azonenberg> and it's not coming out when i render the result...
<rqou> azonenberg: why does the yosys greenpak4 code use dfflibmap but none of the other techlibs do?
<azonenberg> rqou: Dont know
<rqou> did you write this?
<azonenberg> Me and clifford, i forget who did what
<azonenberg> i recall the greenpak ffs having some weird properties that made them a little tricky to deal with
<rqou> hmm i also noticed that yosys currently can't identify TFFs
<rqou> nor can it handle DDR FFs
<azonenberg> led_0 = FB1_4
<azonenberg> led_1 = FB1_5
<azonenberg> led_2 = FB1_6
<azonenberg> led_3 = FB1_7
<azonenberg> 1 Hz clock on GCK0 (FB2_5)
<azonenberg> expected result: led_2 toggles every second
<azonenberg> so 0.5 Hz squarewave
<azonenberg> and led_3 is a 1 Hz squarewave (copy of the clock on GCK0)
pie_ has quit [Ping timeout: 276 seconds]
<azonenberg> then led_0/led_1 are constant 1
<azonenberg> Actual result: led[2:0] are correct, led_3 stays constant zero and doesn't toggle
<azonenberg> (In my emulator)
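The expected frequencies above can be checked with a small model. This is only a sketch: it assumes led_2 is driven by a toggle flip-flop clocked on the rising edge of the 1 Hz input and that led_3 is a plain copy of the clock. The function names are hypothetical and are not taken from the emulator.

```python
def simulate_tff(clock_samples, t=1, q0=0):
    """Toggle q on every rising clock edge while t is 1 (idealized TFF)."""
    q = q0
    prev = clock_samples[0]
    out = [q]
    for clk in clock_samples[1:]:
        if prev == 0 and clk == 1 and t:
            q ^= 1
        out.append(q)
        prev = clk
    return out


def count_rising_edges(samples):
    """Count 0 -> 1 transitions in a sampled waveform."""
    return sum(1 for a, b in zip(samples, samples[1:]) if a == 0 and b == 1)


# 1 Hz clock sampled twice per second for 8 seconds: 0, 1, 0, 1, ...
clock = [i % 2 for i in range(16)]
led_2 = simulate_tff(clock)   # assumed: TFF output, half the clock rate
led_3 = list(clock)           # assumed: direct copy of the clock

print(count_rising_edges(led_3), count_rising_edges(led_2))
```

Over the same window led_2 shows half as many rising edges as the clock, which matches the 0.5 Hz expectation.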
<azonenberg> Trying to walk my way through the bitstream and see if something funny is going on
<azonenberg> The hour-long PAR isn't helping :p
<rqou> hmm, bitstream looks right to me
<azonenberg> Yeah i'm trying to figure out where my emulator is screwing up
<rqou> also, yosys doesn't seem to know how to extract T flipflops
<azonenberg> Yeah you mentioned
<azonenberg> we may have to do that as a separate optimization or a new yosys pass
<rqou> oh lol i just mentioned that :P
<rqou> i know where DDR FFs need to get added, but i have no idea how to go about adding TFFs
Zarutian has quit [Quit: Zarutian]
<azonenberg> I haven't figured out how to do DDR in my emulator yet
<azonenberg> the FPGAs dont support DDR in the fabric, only in IOBs
<rqou> oh yeah huh that is a weird cpld feature
* rqou goes off and hand-synthesizes an insane cpld design just to make RE harder :P
ZipCPU|Laptop has joined ##openfpga
<rqou> CPLD crackme? :P
<rqou> of course, just to match how most Vendors(TM) will implement it, the crackme outputs a single logic high/low to indicate if it's cracked :P :P
<azonenberg> lol
<azonenberg> assign cracked = 1'b1;
<azonenberg> Hmmmm
<azonenberg> Can you use an FPGA pin as an ibuf and obuf simultaneously?
<azonenberg> wondering if i can use an unused pin as a DDR register, lol
<azonenberg> just looping back to itself
<rqou> wtf
<rqou> right, this kind of trick
<rqou> anyways, time to actually get something done today :P
DocScrutinizer05 has quit [Disconnected by services]
DocScrutinizer05 has joined ##openfpga
<awygle> hey folks, new to this space, mostly just lurking to increase my understanding. had a quick question about tools. it seems like synthesis is fairly cohesive (mostly yosys) but p&r is more fragmented. even within openfpga gp4par and xbpar don't seem to share code. why is that? iiuc the underlying algorithm is basically the same, and the problems seem superficially similar
<rqou> so xbpar is a subcomponent of gp4par
<rqou> as for the algorithms, there seem to be currently two(ish) major approaches to p&r
<rqou> naive simulated annealing, and VPR
<rqou> afaik all the existing p&r tools are the first kind
<rqou> azonenberg can probably provide more info
<rqou> one big problem seems to be that a lot of the open-source p&r work either doesn't scale or has lots of assumptions about the particular target
<azonenberg> awygle: so, gp4par is the application that does par for greenpak4 specifically
<azonenberg> xbpar is an abstracted library for par of a netlist in a device with a crossbar-based (vs 2D routing fabric) interconnect
<azonenberg> including but not limited to greenpak and coolrunner
<azonenberg> rqou's coolrunner par and gp4par both use xbpar under the hood
<azonenberg> there's probably not as much generic code in xbpar as there could be, that's a TODO refactoring
<azonenberg> But they do share code
<azonenberg> I don't know much about how the VPR project's routing works
<azonenberg> The two main approaches are simulated annealing, which is what xbpar does
<azonenberg> and some kind of fancy linear algebra that i think vivado does
<azonenberg> i have ideas on how to optimize using physics simulation techniques (basically treat the timing paths as a mass-spring system) but no idea if it'll work
<balrog> VPR seems to use a smarter simulated annealing approach? I might be looking at an old and not as relevant paper though
<rqou> pointfree mentioned that a while ago
<azonenberg> Could be, i've only targeted crossbar based devices where none of that worked
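For readers new to the term, here is a toy illustration of naive simulated annealing placement. It is not xbpar's code or cost function: the 1D row of sites, the wirelength cost, the cooling schedule, and every name below are assumptions chosen to keep the sketch short.

```python
import math
import random


def anneal_placement(nets, num_sites, steps=5000, t_start=5.0, t_end=0.01, seed=0):
    """Place nodes 0..num_sites-1 on a row of sites, minimizing total net length."""
    rng = random.Random(seed)
    placement = list(range(num_sites))  # placement[node] = site index
    rng.shuffle(placement)

    def cost(p):
        # Toy cost: sum of distances between the two endpoints of each net.
        return sum(abs(p[a] - p[b]) for a, b in nets)

    current = cost(placement)
    best, best_cost = placement[:], current
    for step in range(steps):
        # Geometric cooling from t_start down towards t_end.
        temp = t_start * (t_end / t_start) ** (step / steps)
        a, b = rng.sample(range(num_sites), 2)
        placement[a], placement[b] = placement[b], placement[a]  # propose a swap
        new = cost(placement)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if new <= current or rng.random() < math.exp((current - new) / temp):
            current = new
            if new < best_cost:
                best, best_cost = placement[:], new
        else:
            placement[a], placement[b] = placement[b], placement[a]  # undo the swap
    return best, best_cost


# Hypothetical netlist: 8 nodes connected in a chain.
chain = [(i, i + 1) for i in range(7)]
best, best_cost = anneal_placement(chain, 8)
print(best, best_cost)
```

The move set (swap two nodes) and the acceptance rule are the generic parts. A real tool replaces the cost function with whatever the target needs, which for a crossbar device is more about routability than wirelength.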
<rqou> clifford came over and pointed out that the mass-spring-analogy approaches are basically how ASIC p&r works
<azonenberg> interesting
<azonenberg> that may be what vivado is doing?
<balrog> simple harmonic oscillator, shows up everywhere :P
<azonenberg> i used to work on molecular dynamics stuff
<azonenberg> that scales very well
<azonenberg> like, to hundreds of thousands of cores
<azonenberg> i've dreamed my entire FPGA career of something that you can run on a rack of xeons and PAR a giant netlist in 30 seconds
<azonenberg> So it may have potential for that
<awygle> rqou: azonenberg: thanks :) my knowledge has been increased. so i'd imagine when targeting the greenpak looking at arachne-pnr which targets a 2D fabric architecture was of limited use? i remember being surprised that yosys was used in both places but the p&r was different
<azonenberg> buy up 1024 ec2 instances, run a par job in a couple minutes
<azonenberg> Yes, arachne is of no use for greenpak
<azonenberg> i wrote my own b/c i couldnt find any good open par tools for crossbar architectures
<rqou> i seem to remember reading that arachne doesn't scale either?
<azonenberg> I'm going to look at both arachne and VPR as possible baseline tools for xilinx FPGA support down the road
<azonenberg> I know nothing about the internals of either
<azonenberg> they're both on my reading list
<awygle> azonenberg: thanks again, think i get it now
<azonenberg> rqou: what i want to try doing, when we design an FPGA par
<azonenberg> is focus on scalability
<azonenberg> and parallelism, from the beginning
<rqou> dumb verilog question: how do i model a transparent latch with async set?
<azonenberg> or at least early on
<azonenberg> just an always @(*) block
<azonenberg> specify the desired behavior, when nothing writes it'll latch
<rqou> yosys somehow generates a giant mess when I do that
<azonenberg> Yosys has poor support for latches last time i checked
<azonenberg> i have support for instantiating them in gp4par but i havent tried inferring
<azonenberg> generally, it doesnt handle them well
<azonenberg> I have a long list of grievances with yosys that i want to fix
<rqou> ah so this isn't a problem on my end specifically
<azonenberg> my #1 priority is preserving instance names better through ABC and various optimization passes
<azonenberg> so you can figure out what primitive in the generated netlist maps to what hdl object
<balrog> why does everyone say "don't use latches in FPGA/CPLD design"?
<rqou> timing analysis doesn't work, for one thing
<balrog> ah, feedback
<azonenberg> async stuff in general is a huge pain in the butt
<azonenberg> sync is way easier to analyze
<rqou> oh wtf yosys can't actually infer $_DLATCHSR_*
<rqou> even though the cell library has them
<azonenberg> Yeah, lol
<azonenberg> I told you
<azonenberg> it just uses combinatorial loops
<rqou> there's also no dlatchlibmap
<azonenberg> But since i use latches so rarely it isnt a big priority
<azonenberg> i can instantiate primitives on the rare occasions they're needed
<azonenberg> but generally a latch is a bug in what's supposed to be stateless combinatorial logic
<rqou> but the cpld latches have bonus fun with the async set/reset :P
<whitequark> when ever are latches appropriate?
<azonenberg> whitequark: ultra low power designs in greenpak when you don't have a clock and don't want constant dynamic power
<rqou> when you are my father working on a design decades ago and were manually stealing time from other pipeline stages? :P
<cr1901_modern> constant dynamic power?
<azonenberg> the async state machine block is in fact specifically designed for this
<azonenberg> cr1901_modern: as in, the clock toggling uses power even if nothing is happening
<azonenberg> if you dont have a clock, things only use dynamic power when an input changes
<rqou> what about pipeline stage timing stealing/shifting hacks? :P
<cr1901_modern> Oh right. I guess the clock transition would use _some_ power even if input didn't change
<azonenberg> Yeah exactly
<azonenberg> and with a device that has standby power in the hundreds of nA
<rqou> i thought the clock tree uses most of the power? not the useless toggling?
<azonenberg> it makes a difference
<azonenberg> rqou: even having the oscillator enabled uses power
<azonenberg> greenpak standby @ 3.3V, typical process/temp: 370 nA
<rqou> oh wtf
<rqou> wow
<azonenberg> LF oscillator (1.7 kHz): 890 nA
<azonenberg> RC oscillator, 25 kHz : 6020 nA
<azonenberg> that's before you add any loads to the osc output
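To put those currents in terms of power, the conversion is just P = V * I. The sketch below assumes all three figures were taken at 3.3 V (the chat only states the voltage for the standby number) and makes no claim about whether the oscillator figures include the standby current.

```python
SUPPLY_V = 3.3  # assumed to apply to all three figures

# Currents quoted in the chat, in nanoamps.
currents_na = {
    "standby": 370,
    "LF oscillator (1.7 kHz)": 890,
    "RC oscillator (25 kHz)": 6020,
}


def power_uw(current_na, supply_v=SUPPLY_V):
    """Convert a supply current in nA to power in microwatts."""
    return current_na * 1e-9 * supply_v * 1e6


for name, i_na in currents_na.items():
    print(f"{name}: {power_uw(i_na):.3f} uW")
```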
<azonenberg> They do not publish dynamic power for FF/LUT toggling etc
<azonenberg> that's something i want to try measuring
<rqou> hmm i just noticed my techmapping has a problem
<rqou> abc -sop is a giant hammer that applies to _everything_
<rqou> not just "stuff feeding into macrocells"
<azonenberg> lol
<azonenberg> yes, abc is a hammer
<rqou> it also eats all your cell names :P
<rqou> i think fixing this entails the same work that can probably make the xor gate work
<azonenberg> Yes
<azonenberg> Re eating cell names
<azonenberg> I dont plan to patch ABC
<azonenberg> but i think if i know the nets going in and out
<azonenberg> i can assign names that make some degree of sense
<azonenberg> at least so you can tell what line or two of rtl it's related to
<cr1901_modern> "stealing time from other pipeline stages" can you elaborate (bed time for me I think)?
<rqou> so the magic search term is "time borrowing"
<rqou> but basically for a fully synchronous design, you can only run as fast as the _slowest_ pipeline stage
<rqou> but if you have a really slow pipeline stage next to a really fast pipeline stage...
<rqou> if you replace the FF between them with a latch, this allows the "slow" stage to use some of the time from the "fast" stage
<rqou> because as long as the clock is still high, the output from the "slow" stage will still propagate through the latch (rather than missing the edge and getting blocked with a FF)
<azonenberg> Better option: re-time the registers :p
<azonenberg> push them a few gates later in the path
<rqou> that works too :P
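A rough worked example of the time-borrowing argument, under heavy simplification: two stages sit between edge-triggered FFs, the middle register is either a FF or a transparent-high latch, and setup, hold, clock-to-Q, and latch propagation delays are all treated as zero. The stage delays are made-up numbers and the function names are hypothetical.

```python
def min_period_ff(d_slow, d_fast):
    """All-FF pipeline: the period is set by the slowest stage."""
    return max(d_slow, d_fast)


def min_period_latch(d_slow, d_fast, duty=0.5):
    """Middle FF replaced by a transparent-high latch (idealized).

    Constraints on the period T:
      d_slow + d_fast <= 2*T       both stages must fit in two cycles
      d_slow <= T + duty*T         slow data must arrive while the latch is open
      d_fast <= T                  fast stage alone must still fit in one cycle
    """
    return max((d_slow + d_fast) / 2, d_slow / (1 + duty), d_fast)


# Hypothetical delays in ns: a 12 ns stage followed by a 6 ns stage.
print(min_period_ff(12, 6), min_period_latch(12, 6))
```

With these numbers the latch version can run at a 9 ns period instead of 12 ns, with the slow stage borrowing 3 ns from its neighbour. Retiming the register, as suggested above, aims for the same balance while keeping the design analyzable by ordinary static timing.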
<rqou> azonenberg: something is wrong with the xilinx LDCP primitive documentation
<azonenberg> (and works with timing analysis)
<azonenberg> oh?
<rqou> look at the second and third rows of the logic table
<openfpga-github> [openfpga] azonenberg pushed 2 new commits to master: https://git.io/vQ36x
<openfpga-github> openfpga/master b3f3b55 Andrew Zonenberg: Added configurable-edge flipflop for use in the macrocell. Supports async set/reset and rising/falling/DDR edges. No latch support yet.
<openfpga-github> openfpga/master bae7c52 Andrew Zonenberg: Imported synchronizer cores from Antikernel IP cores repo
<azonenberg> rqou: and what's wrong with them?
<rqou> the second row has an X in the PRE input
<rqou> so it overlaps the third row
<azonenberg> Hmm
<azonenberg> Yeah that does seem wrong, i think that was probably meant to be a 1
<azonenberg> or...
<azonenberg> hmm
<azonenberg> PRE is lower precedence than G
<rqou> yeah, that too
<rqou> this also means that it's not equivalent to $_DLATCHSR_PPP_
<rqou> $_DLATCHSR_PPP_ works the "9500" way
<rqou> where the precedence is reset, set, latch
<rqou> but the text says that xc2 is reset, latch, set
<rqou> fun
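The two precedence orders can be compared directly. This sketch takes the orderings exactly as stated in the chat (reset, set, latch for the "9500" style and reset, latch, set for xc2) and has not been checked against the Xilinx documentation or the yosys cell library. Function names are hypothetical.

```python
from itertools import product


def next_q_reset_set_latch(q, d, gate, set_, reset):
    """'9500' style as described in the chat: reset > set > latch."""
    if reset:
        return 0
    if set_:
        return 1
    if gate:
        return d
    return q


def next_q_reset_latch_set(q, d, gate, set_, reset):
    """xc2 style as described in the chat: reset > latch > set."""
    if reset:
        return 0
    if gate:
        return d
    if set_:
        return 1
    return q


# Enumerate every input combination and collect the ones where the models disagree.
mismatches = [
    combo
    for combo in product((0, 1), repeat=5)
    if next_q_reset_set_latch(*combo) != next_q_reset_latch_set(*combo)
]
print(mismatches)  # tuples are (q, d, gate, set_, reset)
```

The models only disagree when the gate and set are both active, reset is inactive, and d is 0, which is why one cell cannot simply be mapped onto the other.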
<rqou> i don't actually know how to teach yosys about this
<rqou> screw it for now i guess?
<azonenberg> Lol
<azonenberg> yeah leave it out for now
<azonenberg> but try to issue a warning if they're used
<rqou> oh yosys can't even infer $_DLATCHSR_* so that can't happen unless someone is doing something really weird
<rqou> heh i just realized that my par engine was completely missing edges into BUFGs
<azonenberg> how do you do that? lol
<rqou> i just never implemented this particular function because registers weren't implemented
<azonenberg> Any time gp4par tried to route a path that didnt exist it'd just fail b/c it couldn't find an edge to map it to
<rqou> yeah so it fails right now too
<rqou> or at least it should
<rqou> i don't think i'm going to actually add it yet because the code needs a huge refactor
DocScrutinizer05 has quit [Disconnected by services]
DocScrutinizer05 has joined ##openfpga
<rqou> hmm i wonder if inverting the clock on a DDR FF has any observable effect?
<azonenberg> Don't think so
<rqou> it shouldn't, but do we know for sure? :P
<azonenberg> Nope
<azonenberg> :p
<rqou> hmm why does my yosys netlist have extraneous BUF cells?
<azonenberg> no idea
<azonenberg> also, i think i just found a bug in my ZIA docs
<azonenberg> oh nvm its not the zia
<azonenberg> i see what i did
* azonenberg pokes a bit
<azonenberg> Soooo ibuf_to_zia[20] should be a 1 Hz squarewave...
* azonenberg waits for build to confirm
<rqou> FTDCP is my favorite cpld primitive :P
<azonenberg> Why?
<azonenberg> Ok, so...
<rqou> dual-edge TFF
<azonenberg> lolwut
<azonenberg> so it tracks the incoming clock
<azonenberg> anyway, ibuf_to_zia[20] is a 1 Hz squarewave
<azonenberg> right_zia_out[6] is 1'b0
<azonenberg> and the bitstream in my JED for row 6 is L000048 01111011*
<azonenberg> This smells wrong
<azonenberg> zia_row_inputs[6][2] is zbus[21], which should be ibuf_in[20]
eduardo_ has quit [Ping timeout: 255 seconds]
eduardo_ has joined ##openfpga
<azonenberg> Welllp
<azonenberg> the ZIA bitstream is wrong
<azonenberg> now to figure out how the fsck THAT happened...
<azonenberg> right_zia_config[6*8 +: 8] == 8'b01111011
<azonenberg> how the...
<azonenberg> this doesnt make sense
<cr1901_modern> Most things in life don't (so much for sleeping)
<azonenberg> what the actual f...
<azonenberg> how did this ever work
<azonenberg> apparently the ZIA bitstream was getting MIRRORED
<azonenberg> left to right
<azonenberg> um...
<rqou> depending on how you arranged the bits, yeah that happens
<azonenberg> no i mean
<azonenberg> how did all of my earlier tests work?
<azonenberg> :p
<rqou> somehow the ise algorithm likes to pick 8'b01111110
<azonenberg> lol
<azonenberg> looool
<rqou> (i had this problem too :P )
<azonenberg> so i just got lucky
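Why a mirrored row could go unnoticed: if the mirroring is a bit reversal within each 8-bit ZIA row (an assumption, the chat only says "left to right"), then a palindromic pattern such as 8'b01111110 is unchanged, while 8'b01111011 is not. A minimal sketch with a hypothetical helper:

```python
def mirror_bits(value, width=8):
    """Reverse the bit order of value within the given width."""
    out = 0
    for i in range(width):
        if value & (1 << i):
            out |= 1 << (width - 1 - i)
    return out


for row in (0b01111110, 0b01111011):
    print(f"{row:08b} -> {mirror_bits(row):08b}")
# 01111110 -> 01111110
# 01111011 -> 11011110
```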
<azonenberg> Welllp
<azonenberg> lol
<azonenberg> This explains a lot
* azonenberg tests with a more complex bitstream and faster clock rate
<rqou> azonenberg: we should think about how to move forward
<azonenberg> Awesome
<azonenberg> let me send him a PR for my recent fix too
<rqou> the code with the giant Rust->C++->Rust stack is kinda a mess
<rqou> also hugely un-ergonomic
<azonenberg> Yes
<azonenberg> honestly, if i had anything to say i'd rewrite it in C++
<rqou> biggest missing xbpar features right now are *) nodes that have to move as a group *) nodes that can become shared
<rqou> e.g. right now each and term uses a unique ZIA row to get its inputs
<rqou> the biggest thing i hated about C++ was how much effort "parsing stuff" took
<azonenberg> So the way i'd implement this is
<azonenberg> i'd first place all of the PLA terms
<azonenberg> actually no
<azonenberg> that wouldnt work, nvm
<rqou> sharing andterms doesn't work either :P
<azonenberg> But yeah i think all c++ is likely to be more maintainable
<azonenberg> for the short term
<rqou> all rust? :P
<azonenberg> Lol
<azonenberg> also, THIS is interesting...
<rqou> the FFI shit is actually more lines of code than the entire PAREngine
* azonenberg looks
<azonenberg> so i have a counter working now
<azonenberg> but something is seriously borked with the ordering
<azonenberg> i'm not even sure how to explain it
<azonenberg> Yeah i def still have bugs
<rqou> heh xc2par is less than 2k lines of code
<azonenberg> ok yeah i have bugs
<azonenberg> i have a count that is supposed to go up every 50 clocks
<azonenberg> i.e. every 50 clocks i increment the 4-bit LED counter
<azonenberg> instead, led_count[0] and [2] are flashing at ~1 Hz
<azonenberg> maybe 2 Hz?
<azonenberg> and 1 and 3 are off
<azonenberg> so, buggy... just not sure why yet
<azonenberg> lol
<azonenberg> and now i have another test that keeps led[3:1] at 0 and drives led_0 every ~500ms
<azonenberg> led[3] is somehow blinking
<azonenberg> Gonna investigate that tomorrow
<openfpga-github> [openfpga] azonenberg pushed 2 new commits to master: https://git.io/vQ3QV
<openfpga-github> openfpga/master bc6d7af Andrew Zonenberg: Continued macrocell support. Still lots of known bugs
<openfpga-github> openfpga/master 40f2adf Andrew Zonenberg: Fixed incorrect ZIA bitstream ordering
<rqou> offtopic: I _just now_ realized (by looking at the map) that Tuen Mun's "giant shopping complex clusterf*ck" is actually a mixed-use area with residential space
<rqou> but i have absolutely no clue how these are connected together
<rqou> huh OSM has better data here, but it's out of date
DocScrutinizer05 has quit [Disconnected by services]
DocScrutinizer05 has joined ##openfpga
qu1j0t3 has quit [Ping timeout: 240 seconds]
pie_ has joined ##openfpga
scrts has quit [Ping timeout: 240 seconds]
scrts has joined ##openfpga
qu1j0t3 has joined ##openfpga
Shoggoth has joined ##openfpga
pie_ has quit [Ping timeout: 240 seconds]
Shoggoth has quit [Ping timeout: 240 seconds]
pie_ has joined ##openfpga
pie_ has quit [Ping timeout: 240 seconds]
[X-Scale] has joined ##openfpga
X-Scale has quit [Ping timeout: 240 seconds]
[X-Scale] is now known as X-Scale
<balrog> rqou: that's cute
<balrog> :P
Zarutian has joined ##openfpga
azonenberg_work has quit [Ping timeout: 276 seconds]
<pointfree> rqou: When can we get a yosys par for antennas :)
<pointfree> Actually, these kinds of oddball applications are the kinds of things that can benefit from free/libre tools.
amclain has joined ##openfpga
Zarutian has quit [Quit: Zarutian]
jayaura has joined ##openfpga
Hootch has joined ##openfpga
m_w has joined ##openfpga
scrts has quit [Ping timeout: 268 seconds]
scrts has joined ##openfpga
digshadow has joined ##openfpga
azonenberg_work has joined ##openfpga
mifune has joined ##openfpga
pie_ has joined ##openfpga
Hootch has quit [Ping timeout: 246 seconds]
m_w has quit [Quit: leaving]
m_w has joined ##openfpga
m_w has quit [Quit: leaving]
pie_ has quit [Quit: Leaving]
pie_ has joined ##openfpga
pie_ has quit [Remote host closed the connection]
pie_ has joined ##openfpga
mifune has quit [Ping timeout: 276 seconds]
m_w has joined ##openfpga
pie_ has quit [Changing host]
pie_ has joined ##openfpga
pie_ has quit [Ping timeout: 268 seconds]
Zarutian has joined ##openfpga
pie_ has joined ##openfpga
awygle has quit [Ping timeout: 260 seconds]
awygle has joined ##openfpga
Shoggoth has joined ##openfpga
Shoggoth has quit [Quit: Shoggoth]
Shoggoth has joined ##openfpga
awygle has quit [Ping timeout: 260 seconds]
Shoggoth has quit [Ping timeout: 240 seconds]
Zarutian has quit [Quit: Zarutian]
awygle has joined ##openfpga
Shoggoth has joined ##openfpga
mifune has joined ##openfpga
Shoggoth has quit [Ping timeout: 260 seconds]
azonenberg_work has quit [Ping timeout: 260 seconds]
Shoggoth has joined ##openfpga
Shoggoth has quit [Ping timeout: 240 seconds]
Shoggoth has joined ##openfpga
pie_ has quit [Changing host]
pie_ has joined ##openfpga