<Vinalon>
Thanks from me too! It's really nice to have ready-made bus implementations
<Vinalon>
so what's the eventual goal of nmigen-soc? Will it aim to implement something like MiSoc or Litex?
<whitequark>
yes
<Vinalon>
neat - would there be any interest in adding an Element subclass which allows fine-grained read/set/clear access for different bitfields?
<Vinalon>
I implemented one for the RISC-V CSRs, which sometimes need read-/set-/clear-only permissions for different fields.
<Vinalon>
but it might not be very efficient (it needs four extra values for each register's masks) and I'm not sure if that sort of functionality would be worth trying to integrate
<whitequark>
there's already a pending PR that adds this functionality
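The read/set/clear idea Vinalon describes can be modelled in plain Python. This is only a rough sketch: the class name `MaskedRegister` and the choice of four masks (read, write, set, clear) are hypothetical, and are taken neither from Vinalon's code nor from the pending nmigen-soc PR.

```python
from dataclasses import dataclass


@dataclass
class MaskedRegister:
    """Software model of a register with per-bit access permissions.

    All names here are hypothetical; they only illustrate the idea.
    """
    value: int = 0
    read_mask: int = 0   # bits that a read is allowed to see
    write_mask: int = 0  # bits that a plain write may overwrite
    set_mask: int = 0    # bits that a "set" access may turn on
    clear_mask: int = 0  # bits that a "clear" access may turn off

    def read(self):
        return self.value & self.read_mask

    def write(self, data):
        # Keep the protected bits, replace only the writable ones.
        self.value = (self.value & ~self.write_mask) | (data & self.write_mask)

    def set(self, bits):
        self.value |= bits & self.set_mask

    def clear(self, bits):
        self.value &= ~(bits & self.clear_mask)


reg = MaskedRegister(value=0xC0, read_mask=0xFF, write_mask=0x0F,
                     set_mask=0x30, clear_mask=0xC0)
reg.write(0xFF)   # only the low nibble changes
reg.set(0xFF)     # only bits 4 and 5 can be set
reg.clear(0xFF)   # only bits 6 and 7 can be cleared
print(hex(reg.read()))
```

In hardware each mask would be a constant, so the cost is mostly a few extra AND/OR gates per bit rather than extra storage.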
<Vinalon>
oh, cool - too bad I didn't wait a few days :)
<Vinalon>
thanks again then, jfng
<whitequark>
that PR has been pending for months
<whitequark>
and it's not from jfng
<Vinalon>
oh...I was looking at #11. My bad
<Vinalon>
which PR is it? It'd be nice to see a reference implementation since I know so little about best practices
<whitequark>
it's not really a reference implementation, an intern at m-labs did it. #2
<Vinalon>
oh, yeah I see it now; there's a 'csr' branch in their fork. Fun.
<Vinalon>
does anyone ever get 's_push: parser stack overflow' MemoryErrors when simulating a design, or is that likely a problem with my design?
<Vinalon>
some searching indicates it's probably caused by very deeply-nested expressions which I guess are probably in the generated code
Degi has quit [Ping timeout: 256 seconds]
Degi has joined #nmigen
<Vinalon>
It seems to happen when I add the ~80th Element to a Multiplexer. Splitting things into two Multiplexers works.
kc5tja has joined #nmigen
<kc5tja>
Hello folks; I'm trying to create a bitstream for an FPGA for the first time (specifically, TinyFPGA BX board). However, I'm at a complete and total loss trying to figure out how to do it. Can someone provide a link to a project which instantiates a project that is more sophisticated than the blinky demo?
<kc5tja>
I specifically need to bind TinyFPGA pins to a Z80 bus interface. I have the bus interface module written up; but I can't figure out how to bind the signals to actual device pins.
<kc5tja>
Thanks.
<kc5tja>
(re: for the first time -- I meant to say for the first time using nmigen. I've used Xilinx Webpack ISE in the past, so have prior FPGA experience, but nmigen is totally new to me.)
<Vinalon>
Good news! There's a board file for the TinyFPGA BX in the 'nmigen-boards' repository, so it'll be pretty easy: https://github.com/nmigen/nmigen-boards
<kc5tja>
I'm already using it. It is not easy.
<Vinalon>
Oh, sorry. So you've already got a basic project building and running on the board? What particularly are you trying to do?
<kc5tja>
I have a blank chip. I haven't gotten far enough yet to send a bitstream to it.
<kc5tja>
No worries; I don't know what info to provide when asking questions, so I apologize if I'm being vague.
<kc5tja>
Looking. Thanks!
<Vinalon>
you can use 'platform.request(...)' to get a reference to one of the 'Resource' objects defined in the TinyFPGA 'nmigen-boards' file.
<Vinalon>
But I'm still trying to figure out how to get a reference to an I/O pin from the 'Connector' resources in those files...
<kc5tja>
I'm wondering if the intent is to subclass the TinyFPGABXPlatform class and create our own resources in the subclass?
<Vinalon>
I don't know if that's the intent, but you can do that - the 'versa_ecp5_5g' board file makes itself a subclass of the 'versa_ecp5' board, for example
<Vinalon>
it seems like if you wanted to assign functions to different pins, you could make a subclass and define some extra Resource objects with the pins you want to use
<kc5tja>
HAH, that upduino example is what I'm converging towards. :)
<Vinalon>
Well, I think you should be able to use normal combinatorial rules to link your I/O signals to the 'i' and 'o' attributes of individual pin resources, but I'm also curious about the right way to use the 'Connector' classes in the generic board files. Maybe someone else can say
<kc5tja>
Yeah, I figured there'd be an equivalent to request() for connectors, but I couldn't find anything resembling that logic.
<Vinalon>
Yeah...I think that logic is probably related to the 'ResourceManager' class, but I haven't figured it out.
<Vinalon>
you might get better answers tomorrow when some more experienced people show up - good luck, it's always nice to hear about open-source retrocomputing projects!
<kc5tja>
Yeah. In my case, neo-retrocomputing. ;) Although the VDC-II core is intended to (more or less) clone the 8563/8568 VDC, I also intend on enhancing it with new features for use with my homebrew RISC-V-based computer as well.
<kc5tja>
But, first, get it working in a known-good computer. ;)
<kc5tja>
Sweet -- it built!!
<kc5tja>
I think I'll commit this to the repo for tonight, and turn in. I need to build the rest of the 5V and level-shifting logic before I can test the FPGA in-circuit.
<kc5tja>
Vinalon: Thanks for the help! It allowed me to make good progress tonight.
<Vinalon>
Hooray, congrats! Sorry I couldn't be more help with figuring out the Connector logic
<awygle>
jfng: does the wishbone bus implementation in nmigen-soc implement the classic or the pipelined version? or both?
<jfng>
both
<jfng>
you'll need to use `features={"stall"}` for that
<awygle>
ah ok, thanks!
<kc5tja>
Actually, I'm not sure why I didn't think to ask here before, but this is the perfect place.
<kc5tja>
I need to create a 25.145MHz clock for my TinyFPGA BX project; however, this board only has a 16MHz oscillator on-board. I need to instantiate a PLL primitive. Anyone know how to do this from within nMigen? Otherwise, I'll need to figure out how to do this via Verilog, and would prefer to avoid Verilog if I can.
<whitequark>
you do it in nMigen in about the same way as in Verilog conceptually, I have some example code
<whitequark>
it's a bit more complex because it integrates the functionality of the icepll utility
<kc5tja>
That looks to be exactly what I need. Thanks!!
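For readers without the linked example, here is a rough sketch of the kind of search the icepll utility performs. It is not whitequark's code. The formula and the limits used here (simple feedback mode, PFD between 10 and 133 MHz, VCO between 533 and 1066 MHz, DIVQ from 1 to 6) are assumptions recalled from the iCE40 PLL documentation and should be checked against the datasheet. The function name `find_pll_settings` is hypothetical.

```python
def find_pll_settings(f_in_mhz, f_target_mhz):
    """Brute-force search for iCE40 PLL dividers (assumed simple feedback mode).

    Assumed relation: f_out = f_in * (DIVF + 1) / ((DIVR + 1) * 2**DIVQ)
    """
    best = None
    for divr in range(16):
        f_pfd = f_in_mhz / (divr + 1)
        if not 10 <= f_pfd <= 133:       # assumed phase detector limits
            continue
        for divf in range(128):
            f_vco = f_pfd * (divf + 1)
            if not 533 <= f_vco <= 1066:  # assumed VCO limits
                continue
            for divq in range(1, 7):
                f_out = f_vco / (1 << divq)
                error = abs(f_out - f_target_mhz)
                if best is None or error < best[0]:
                    best = (error, divr, divf, divq, f_out)
    return best


error, divr, divf, divq, f_out = find_pll_settings(16, 25.145)
print(f"DIVR={divr} DIVF={divf} DIVQ={divq} -> {f_out} MHz")
```

Under these assumed limits the closest reachable frequency from a 16 MHz input is 25.0 MHz, not exactly 25.145 MHz, so the resulting video timing would be slightly off-nominal.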
<Sarayan>
Hey wq, did you find out what I fucked up in the via6522 sim code?
<whitequark>
I have no recollection of you asking me to do that, even
<Sarayan>
Oh, I didn't really ask :-)
rohitksingh has joined #nmigen
<whitequark>
do you have an MCVE?
<Sarayan>
But I know you looked, given your remark about yield from
<Sarayan>
if you didn't look more than that, no problem
<Vinalon>
Would it be unusual for a build to take on the order of hours on a slow machine, or is that fairly typical for larger designs?
<whitequark>
define larger
XgF has quit [Ping timeout: 265 seconds]
XgF has joined #nmigen
<Vinalon>
Well, it hasn't given any errors building for an iCE40UP5K with ~5K luts, but it's sort of hard for me to estimate since I don't have a gate count and I might have made some poor design decisions along the way. It's ~2500 lines of code, but that's a pretty poor metric
<Vinalon>
I guess I can leave it for another few hours and ask again when I have a result or error; I was just curious if long builds were normal
<whitequark>
absolutely nothing for UP5K should build for hours
<Vinalon>
huh. Can the toolchain get stuck trying to optimize a design if it won't fit into a chip?
<whitequark>
it's possible
<Vinalon>
okay, thanks. Is there a way to get the build process to print more verbose information about its progress?
<ZirconiumX>
Are you using Diamond?
<Vinalon>
nope, icestorm
<ZirconiumX>
Then in the build folder there will be a log file from nextpnr I believe
<ZirconiumX>
Which you can tail -f
<Vinalon>
that's sort of the thing...it hasn't actually made a build folder yet
<Vinalon>
but it's been very busy for ~75 minutes
<Vinalon>
and it simulates fine
<ZirconiumX>
Does `ps` show nextpnr running? WQ's right; nothing should take this long
<whitequark>
uh, then that seems like a bug in nmigen
<ZirconiumX>
Or Yosys?
<whitequark>
could also be
<whitequark>
wait, no
<Vinalon>
no, just python
<ZirconiumX>
Then it's nMigen.
<ZirconiumX>
And I've had infinite loop bugs in Yosys before
jyrdjyrf has joined #nmigen
jyrdjyrf has quit [Remote host closed the connection]
<Vinalon>
c'est la vie...then I guess the 'ast.py' methods are probably where I should start poking around looking for an endless loop?
<whitequark>
nope
<whitequark>
you hit ^C and look at the backtrace
<Vinalon>
it looks like there are some nested 'LegalizeValue' exceptions, like:
<whitequark>
you're indexing the memory using code like `mem[addr]`
<whitequark>
that syntax works only in simulation
<whitequark>
you need to instantiate read and write ports using the corresponding methods on the memory
<whitequark>
please file a bug about this, since it'll likely be a common pitfall
<Vinalon>
ooooooohkay. Is the 'examples/basic/mem.py' a good reference for that?
<Vinalon>
sure, will do
<whitequark>
yup
<Vinalon>
thanks for the help!
<ZirconiumX>
wq: I wrote a small patch to name wires according to their source locations (as an alternative to (* src *) attrs), but annoyingly this seems to mostly confuse the synthesis tools into thinking they're properly named.
<ZirconiumX>
Maybe they need to start with $ or something
<ZirconiumX>
Was wondering if you'd be interested in a patch like that
<whitequark>
ZirconiumX: I already added support for this in yosys
<whitequark>
`rename -src`
<kc5tja>
Vinalon: Oooh....just curious, how easy would it be to adapt your core to RV64I?
<ZirconiumX>
kc5tja: I wouldn't try to run a 64-bit processor on an iCE40
<kc5tja>
ZirconiumX: My current core (Verilog) barely fits, which is all I need.
<ZirconiumX>
You could consider Minerva
<ZirconiumX>
That's nMigen and RV32I
<kc5tja>
But I want 64-bit. My code all exists for 64-bit.
<kc5tja>
And I'm not about to retool for 32-bit.
<ZirconiumX>
... What does 64-bit get you here?
<kc5tja>
Personal preference.
Vinalon has quit [Read error: Connection reset by peer]
<ZirconiumX>
But FPGAs hate 64-bit logic
* kc5tja
sighs
Vinalon has joined #nmigen
<whitequark>
a 64-bit adder on ice40 will run at what, 20 MHz? optimistically?
<kc5tja>
Never mind. Forget that I asked. Clearly, I'm doing something very wrong, even though my Kestrel-2DX is proof that it works already.
<ZirconiumX>
Wonder if Yosys emits a $lcu for that
<kc5tja>
whitequark: Good to know. I'll break it up into 4 16-bit operations if I must.
<whitequark>
yup, definitely possible to make it fast
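Splitting a 64-bit add into four 16-bit operations, as kc5tja suggests, can be sketched as a software model. The function name `add64_in_16bit_steps` is hypothetical. In hardware each loop iteration would be one clock cycle through a single 16-bit adder, with the carry held in a flip-flop between cycles.

```python
def add64_in_16bit_steps(a, b):
    """Add two 64-bit values using four 16-bit additions with carry."""
    result = 0
    carry = 0
    for step in range(4):
        shift = 16 * step
        limb_a = (a >> shift) & 0xFFFF
        limb_b = (b >> shift) & 0xFFFF
        total = limb_a + limb_b + carry      # at most 17 bits wide
        result |= (total & 0xFFFF) << shift  # keep the low 16 bits
        carry = total >> 16                  # pass the 17th bit onward
    return result, carry


print(add64_in_16bit_steps(0x0000_FFFF_0000_FFFF, 1))
```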
<whitequark>
I can think of a few uses for a 64-bit core on ice40, though it is definitely harder to implement than a 32-bit one
<kc5tja>
My longer-term plan was to target the ECP5 for the CPU, and use iCE40 for the peripherals (vis-à-vis, my VDC-II core being one of them).
<ZirconiumX>
I think "RV64 with a 32-bit ALU" is doable.
<kc5tja>
ZirconiumX: That'll work for my needs.
<ZirconiumX>
I suspect the Cyclone V would actually handle that kind of thing well.
<Vinalon>
It probably wouldn't be too hard to make the signals and buses wider, but I'm still not sure about how well any of this might perform - I'm not experienced in digital logic design
<kc5tja>
Basically, I'm planning on creating a set of plug-in cards for RC2014 backplane. I/O to start with, and CPU as the final card design.
<sorear>
i mean how fast does it need to be
<Degi>
Hm a 30 bit up counter on an ECP5 runs at 800+ MHz on the 5G-8 grade, despite only being rated for 400, maybe the same can be applied to the iCE40 too?
<ZirconiumX>
Counters are fairly different things to adders
<kc5tja>
sorear: I prefer 25MHz, because with that clock rate and a 4-beat transfer over a 16-bit bus, I should be able to match a Commodore 64 in perceived performance when running 640x480 bitmapped GUI applications.
<Vinalon>
why do FPGAs hate 64-bit logic, out of curiosity? Do they have width limitations on the internal logic??
<kc5tja>
(In practice, it's actually a bit faster than that.)
<ZirconiumX>
Carry propagation delay
<daveshah>
That's not that bad tbh
<kc5tja>
Vinalon: Carry logic for adders.
<ZirconiumX>
Bit N+1 depends on bit N
<ZirconiumX>
And generally FPGAs have special logic for fast carries
<ZirconiumX>
But for iCE40 I'm pretty sure it boils down to glorified fast ripple carry.
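The "bit N+1 depends on bit N" point can be illustrated with a bit-level model of a ripple-carry adder. This is a plain-Python sketch, not a description of how any particular FPGA implements its carry chain. The function name `ripple_carry_add` is hypothetical.

```python
def ripple_carry_add(a, b, width):
    """Add two unsigned values one bit at a time, like a chain of full adders."""
    carry = 0
    total = 0
    for n in range(width):
        bit_a = (a >> n) & 1
        bit_b = (b >> n) & 1
        # The sum bit at position n needs the carry out of position n - 1.
        total |= (bit_a ^ bit_b ^ carry) << n
        carry = (bit_a & bit_b) | (carry & (bit_a ^ bit_b))
    return total, carry


# Worst case: the carry has to travel through all 64 positions.
print(ripple_carry_add((1 << 64) - 1, 1, 64))
```

Each loop iteration corresponds to one full adder whose carry input only becomes valid after the previous stage settles, which is why the delay grows with the width.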
<daveshah>
I remember trying a picorv32 with a roughly 64 bit ALU and it dropped from about 125MHz to 100MHz on ECP5
<daveshah>
The difference may not be as bad as you predict
<Vinalon>
oh...right. I think I remember something about that from the 6004.x1 edX mooc. You need to use 2^n adder circuits to implement n-bit addition or something like that?
<kc5tja>
I'm OK with a slower processor speed for personal development work; but, ultimately, when I create the CPU card kits, I'd like to hit 25MHz (if I have a 16-bit wide bus to feed it with).
<daveshah>
25MHz 64 bit should be easily doable
<daveshah>
picorv32 can manage 60MHz+ on iCE40 HX
<ZirconiumX>
I have some rather fun logic: two-stage fully-legal chess move generator
<kc5tja>
Anyway, enough yapping on personal projects. I gotta get back to work-work. Bleh. :(
<ZirconiumX>
256 bits of state in, 1024 bits out
<sorear>
if it were actually a problem you could run the bus in a separate clock domain at 4x the core clock, then you could do a 25MHz bus and _very_ leisurely core cycles
<ZirconiumX>
I mean, I'd probably target 50MHz and have the bus use clock-enables
<ZirconiumX>
nMigen code very strongly encourages you to do that kind of thing
<Vinalon>
ZirconiumX: that does sound like a fun logic puzzle - what's the output state? A set of legal moves?
<kc5tja>
To be honest, I picked 25MHz because that's the maximum safe speed for my current Verilog implementation (and, yeah, I bet the ALU is the limiting factor there; 64-bit wide barrel shifters and 64-bit adder logic) on an old Xilinx part. Architecturally, it resembles a 6502's PLA-style instruction decoder and state machine.
<kc5tja>
That, and the 4x display resolution vs user responsiveness requirement over a C64. It all aligns quite nicely.
<kc5tja>
And, as for the 16-bit path requirement, that's because most FPGA boards only sport a 16-bit path to some flavor of RAM.
<ZirconiumX>
Vinalon: kind of. It's a set of bits on the chessboard categorised by movement direction. Given a destination bit and a direction you can work backwards to find the source
<ZirconiumX>
... Actually 1024 is too high but I haven't bothered to calculate the actual output state by excluding always-zero bits
<Vinalon>
oh, it tries to figure out how a random chess state could have been arrived at? Neat
<ZirconiumX>
No, it's the legal moves of a chessboard
<ZirconiumX>
Just encoded by destination/direction
<Vinalon>
ohh, so it gives you a set of moves to choose from? Also neat
<Vinalon>
And that RC2014 project also looks like a cool use for small FPGAs - it must be hard to implement obscure hardware standards with odd timing requirements in an MCU, huh?
<ZirconiumX>
You can't really return a "list" of moves
<ZirconiumX>
So instead it returns a set of moves
<ZirconiumX>
Which is very definitely enough to cover every possible position
<ZirconiumX>
(chess programs generally allocate room for like 256 moves and 932 is way larger than even that)
<ZirconiumX>
But it takes only two cycles to output and runs at 200MHz on a Cyclone V
<ZirconiumX>
And like 75MHz on iCE40HX
<ZirconiumX>
But doesn't run at all on ECP5 because it's really demanding for routing.
<Vinalon>
is it a set of moves from a given board state, or like...every move in all of chess that it would be possible to arrive at?
<ZirconiumX>
From a given board state - that's the 256 input bits
<Vinalon>
cool - sounds like a fun parallel logic problem. How d'you rule out cases like a piece taking another piece of its own color without adding more cycles?
<ZirconiumX>
Well, there are four input boards from which everything else can be derived
<ZirconiumX>
- pawn/bishop/queen
<ZirconiumX>
- knight/bishop/king
<ZirconiumX>
- rook/queen/king
<ZirconiumX>
- black pieces
<ZirconiumX>
So, for example, the white pieces are the three piece boards and not black pieces
<ZirconiumX>
And from there you can exclude moves to a white piece
<ZirconiumX>
Through masking
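The masking step ZirconiumX describes can be sketched in Python with 64-bit integers as boards. This is not ZirconiumX's implementation. The square numbering (a1 = bit 0), the helper names, and the knight example are all assumptions chosen for illustration.

```python
FILES = "abcdefgh"


def square_index(name):
    """Map a square like 'b1' to a bit index, with a1 = 0 and h8 = 63."""
    return (int(name[1]) - 1) * 8 + FILES.index(name[0])


def board(*names):
    """Build a 64-bit board with one bit set per named square."""
    bits = 0
    for name in names:
        bits |= 1 << square_index(name)
    return bits


def rank(number):
    """All eight squares of the given rank (1 to 8)."""
    return 0xFF << (8 * (number - 1))


# The four input boards for the standard starting position.
bishops = board("c1", "f1", "c8", "f8")
queens = board("d1", "d8")
kings = board("e1", "e8")
pawn_bishop_queen = rank(2) | rank(7) | bishops | queens
knight_bishop_king = board("b1", "g1", "b8", "g8") | bishops | kings
rook_queen_king = board("a1", "h1", "a8", "h8") | queens | kings
black_pieces = rank(7) | rank(8)

# White pieces: any of the three piece boards, and not a black piece.
occupied = pawn_bishop_queen | knight_bishop_king | rook_queen_king
white_pieces = occupied & ~black_pieces


def knight_targets(source):
    """Every square a knight on `source` could jump to on an empty board."""
    file, row = source % 8, source // 8
    jumps = ((1, 2), (2, 1), (2, -1), (1, -2),
             (-1, -2), (-2, -1), (-2, 1), (-1, 2))
    targets = 0
    for d_file, d_row in jumps:
        new_file, new_row = file + d_file, row + d_row
        if 0 <= new_file < 8 and 0 <= new_row < 8:
            targets |= 1 << (new_row * 8 + new_file)
    return targets


# Exclude moves that would land on a white piece, through masking.
candidates = knight_targets(square_index("b1"))
legal_targets = candidates & ~white_pieces
print(hex(white_pieces), hex(legal_targets))
```

For the knight on b1, the candidate square d2 is removed because a white pawn stands there, leaving a3 and c3. In hardware all of these operations are purely combinational, so the masking adds no extra cycles.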
<Vinalon>
interesting!
<ZirconiumX>
There's a whole wiki on how software chess programs work; with some careful thought you can apply that to chess.
<ZirconiumX>
*to hardware
<Vinalon>
cool - so are you planning to make the move-finder a submodule of a 'deep blue' FPGA design? :)
<ZirconiumX>
I read Feng-Hsiung Hsu's paper on how Deep Blue worked, aaand promptly ruled it out for a few reasons.
<ZirconiumX>
It's too sequential for my own liking; it takes potentially 8 cycles to produce a single move
<ZirconiumX>
And this move is not necessarily legal
<ZirconiumX>
So there's computation overhead wasted there
<kc5tja>
Vinalon: One of my goals is to basically take the TinyFPGA BX or EX projects and remix it, so to speak, into a generic FPGA dev board for RC2014. But, before I go building custom boards, I'm going to try my hand at integrating COTS stuff first.
<Vinalon>
Haha, it might be fun to have 'speed chess' tournaments for computers. 5 nanoseconds per move :P
<kc5tja>
So, my first project is a near-clone of the Commodore 128's VDC chip, as I'm already familiar with video generation.
<Vinalon>
that sounds cool - so you could have a bunch of identical PCBs configured to run whatever expansion cards you needed at the time?
<kc5tja>
conceptually, I suppose. :) I plan on using these to create CPU and I/O cards for my own homebrew computer parts.
<kc5tja>
Like, want a Forth-based computer instead of a RISC or CISC? 64-bit RISC-V? A 65816 processor with page-based MMU? (Yes, 65816 supports it. Nobody ever used it though.)
<_whitenotifier-3>
[nmigen] WRansohoff opened issue #340: Accessing Memory objects with '[ ]' array syntax causes confusing build errors. - https://git.io/JvHdD
<Vinalon>
nice - sounds like that'd be a great educational tool too
<kc5tja>
My VDC core project is just a proof of concept for some more ambitious goals I have. Can I understand the technology stack well enough to drive the project to completion within manageable stress levels? ;)