futarisIRCcloud has quit [Quit: Connection closed for inactivity]
gsi__ is now known as gsi_
ZombieChicken has quit [Ping timeout: 256 seconds]
ZombieChicken has joined ##openfpga
futarisIRCcloud has joined ##openfpga
Jybz has joined ##openfpga
emeb_mac has quit [Ping timeout: 252 seconds]
Jybz has quit [Ping timeout: 264 seconds]
Jybz has joined ##openfpga
Dolu1990 has joined ##openfpga
GuzTech has joined ##openfpga
jevinski_ has joined ##openfpga
jevinskie has quit [Ping timeout: 248 seconds]
rohitksingh has joined ##openfpga
<Sprite_tm>
Hey all, is there an example for instantiating a MULT18X18D in an ECP5 somewhere? Preferably in the form of a 32x32-bit multiplication example :3
<daveshah>
Sprite_tm: this is a small 16x16 example
<daveshah>
I realise there is no timing data for the DSPs yet, so you might have to be a bit careful with frequency
<Sprite_tm>
daveshah: Eh, she'll be fine :P I can register the in- and outputs; I don't really care *that* much about speed, I just don't want to spend 20% of my FPGA on two multipliers :)
Jybz has quit [Quit: Konversation terminated!]
Jybz has joined ##openfpga
Jybz has quit [Client Quit]
Jybz has joined ##openfpga
Jybz has quit [Quit: Konversation terminated!]
Jybz has joined ##openfpga
flea86 has joined ##openfpga
Asu has joined ##openfpga
s_frit has quit [Remote host closed the connection]
s_frit has joined ##openfpga
s_frit has quit [Remote host closed the connection]
s_frit has joined ##openfpga
ym has quit [Remote host closed the connection]
rohitksingh has quit [Ping timeout: 268 seconds]
rohitksingh has joined ##openfpga
Dolu1990 has quit [Quit: Leaving]
Dolu has joined ##openfpga
rohitksingh has quit [Ping timeout: 246 seconds]
rohitksingh has joined ##openfpga
Dolu has quit [Quit: Leaving]
Dolu1990 has joined ##openfpga
rohitksingh has quit [Ping timeout: 244 seconds]
X-Scale has quit [Ping timeout: 246 seconds]
X-Scale has joined ##openfpga
lutsabound has joined ##openfpga
s_frit has quit [Remote host closed the connection]
s_frit has joined ##openfpga
cr1901_modern1 has joined ##openfpga
cr1901_modern has quit [Ping timeout: 250 seconds]
cr1901_modern has joined ##openfpga
cr1901_modern1 has quit [Ping timeout: 246 seconds]
<daveshah>
sorear: about half the 45k's slices (would be significantly less with DSP inference), and 70% of its block RAM (76 18kbit blocks)
<daveshah>
Runs at 75MHz, Fmax is 80-85MHz at present (also seems rock solid at 100MHz on my board, DDR controller seems to break at 125MHz)
cr1901_modern1 has joined ##openfpga
OmniMancer has quit [Quit: Leaving.]
cr1901_modern has quit [Ping timeout: 255 seconds]
<Sprite_tm>
daveshah: If you have some time, can you perhaps help me? I'm having essentially the same issue as https://github.com/YosysHQ/nextpnr/issues/208, but (probably) with the alu slices. How can I figure out which things are the offenders?
<daveshah>
Sprite_tm: The ALU slices aren't really supported
<Sprite_tm>
Aw :/ that means I need to come up with a 32-bit multiplier myself.
<daveshah>
In theory they might work with careful manual placement, but as the current packer doesn't consider the direct connections to multipliers (which are required for the ALUs to have any useful function) they certainly won't be correctly automatically placed
<Sprite_tm>
Hmm, that sounds like too much work... if my excuse to dive deeper into FPGA fabric stuff ever materializes, I may look into that.
<daveshah>
If you are curious what the problem net is, running with --debug --verbose should give much more insight into the routing
<Sprite_tm>
Gotcha. It sounds like the chance that just removing the offending nets clears up the issue is nil, though, so I won't bother (unless it's useful to you).
<daveshah>
Yeah, I'll be the first to admit that DSPs are very much the weak link in the ECP5 flow at the moment
<Sprite_tm>
No worries, I'm already 110% happy the 'normal' fabric works like a charm.
<Sprite_tm>
Would be nice to get that RiscV multiply instruction done by the hw multipliers, though.
<Sprite_tm>
On that matter, thanks for your work on the Trellis and nextpnr stuff. If you ever happen to be in Shanghai, hit me up and I'll buy you a beer or two :)
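For the 32-bit multiply discussed above, one common approach is to split each operand into 16-bit halves and combine the partial products. The following is only a software sketch of that arithmetic, not ECP5 or MULT18X18D code: `mul16` is a hypothetical stand-in for one hardware multiplier, and the function names are made up for illustration. It assumes unsigned operands.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF


def mul16(a, b):
    """Hypothetical stand-in for one hardware multiplier: 16x16 -> 32 bits, unsigned."""
    assert 0 <= a <= MASK16 and 0 <= b <= MASK16
    return a * b


def split16(x):
    """Return (low half, high half) of a 32-bit unsigned value."""
    return x & MASK16, (x >> 16) & MASK16


def mul32_full(a, b):
    """Unsigned 32x32 -> 64-bit product from four 16x16 partial products."""
    a_lo, a_hi = split16(a)
    b_lo, b_hi = split16(b)
    return (mul16(a_lo, b_lo)
            + ((mul16(a_lo, b_hi) + mul16(a_hi, b_lo)) << 16)
            + (mul16(a_hi, b_hi) << 32))


def mul32_low(a, b):
    """Low 32 bits of the product only; the a_hi * b_hi term never reaches them."""
    a_lo, a_hi = split16(a)
    b_lo, b_hi = split16(b)
    # Only the low 16 bits of the cross terms land inside the 32-bit result.
    cross = (mul16(a_lo, b_hi) + mul16(a_hi, b_lo)) & MASK16
    return (mul16(a_lo, b_lo) + (cross << 16)) & MASK32
```

If only the low 32 bits of the result are needed (as for the RV32 MUL instruction), three partial products are enough, which is why `mul32_low` skips the high-by-high term.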
<Adluc>
yo guys, just out of curiosity, what's the approximate improvement in LUT usage when using homebrew tools like nextpnr instead of the original ones from Lattice?
emeb has joined ##openfpga
<daveshah>
Right now it's often a step backwards by 10-20%, but hopefully that will improve
<daveshah>
otoh, the open source tools can often be significantly faster
<daveshah>
*faster runtime
jevinski_ has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
<sorear>
daveshah: nice (half of 45k & 100mhz)
<daveshah>
Much better than Rocket...
flea86 has quit [Quit: Goodbye and thanks for all the dirty sand ;-)]
lutsabound has quit [Quit: Connection closed for inactivity]
lutsabound has joined ##openfpga
<sorear>
that’s what Rocket gets… on the vc707
<somlo>
daveshah, sorear: right now, 64bit rocket (without fpu) with LiteX uses about 96% of the 45k ecp5, and (sometimes) passes timing at 50MHz
<daveshah>
Oh, 50MHz is better than expected
<somlo>
of course, I haven't added the plic handler code yet, so it doesn't actually *work* yet
<daveshah>
which ECP5?
<somlo>
5g 45k (versa board)
<Dolu1990>
I will check where the critical paths are, and the core logic usage
<Dolu1990>
Not sure how fast VexRiscv can go on ECP5. How does ECP5 compare to, let's say, a Cyclone IV? Or an Artix?
<daveshah>
It's quite a bit slower than Artix
<daveshah>
Not sure about Cyclone IV
<Adluc>
what's Rocket?
<daveshah>
Rocket is a 64-bit RISC-V processor
<Adluc>
found repo, thx
<sorear>
I’m also wondering whether there was any significant worsening of utilization for the new MMU
<daveshah>
No, no significant change
<daveshah>
a few hundred LUTs max
<daveshah>
Dolu1990: critical path at the moment is from SEIE, through interrupt code stuff, jump interface and to the IBus pc
<Dolu>
daveshah, thanks! Yes, you are right. I will check whether there is a way to cut some combinatorial links there. But without any design changes, removing the static branch prediction (--prediction none) may relax the next-PC calculation, as it removes a 32-bit adder -> branch interface path from the decode stage.
<daveshah>
Dolu: --prediction none gets up to 98.9MHz
<daveshah>
I'm sure it will make 100 with another seed...
<Dolu>
^^
<adamgreig>
does it have a built in mode for my current "up, delete last seed, increment by one, retry, check if we met timing" thing? :P
<adamgreig>
better yet, run 16 seeds in parallel...
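The seed-sweep loop described above could be scripted roughly as follows. This is a sketch under assumptions, not a built-in nextpnr feature: `run_pnr` is a caller-supplied function that would launch one place-and-route run for a given seed (nextpnr takes the seed via `--seed`) and return the achieved Fmax in MHz. How the tool is invoked and how Fmax is parsed from its log is left out, and `fake_run_pnr` only returns made-up numbers so the example can run on its own.

```python
from concurrent.futures import ThreadPoolExecutor


def sweep_seeds(run_pnr, seeds, target_mhz, workers=16):
    """Run run_pnr(seed) for every seed in parallel.

    Returns the (seed, fmax) pairs that meet target_mhz, best Fmax first.
    """
    seeds = list(seeds)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(zip(seeds, pool.map(run_pnr, seeds)))
    passing = [(seed, fmax) for seed, fmax in results if fmax >= target_mhz]
    return sorted(passing, key=lambda pair: pair[1], reverse=True)


def fake_run_pnr(seed):
    """Hypothetical stand-in for a real run; returns an invented Fmax in MHz."""
    return 96.0 + (seed * 7 % 11) * 0.5


if __name__ == "__main__":
    for seed, fmax in sweep_seeds(fake_run_pnr, range(1, 17), target_mhz=100.0):
        print(f"seed {seed}: {fmax:.1f} MHz")
```

Threads are sufficient here because a real `run_pnr` would spend its time waiting on an external process; each run should write to its own output files so parallel seeds do not overwrite each other.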
emeb_mac has joined ##openfpga
rohitksingh has quit [Ping timeout: 246 seconds]
zachjs has joined ##openfpga
<Dolu>
So it looks like it is the DBus MMU translation, which has a whole cycle in the memory stage to go from virtual to physical address. For this one, I would say an option could be added to relax the input virtual address register.
<Dolu>
It would be easy to relax it with a dedicated register between the 32-bit adder in the execute stage and the memory stage
Asu has quit [Ping timeout: 252 seconds]
Asu` has joined ##openfpga
zachjs has quit [Remote host closed the connection]
zachjs has joined ##openfpga
<zachjs>
I've been working on a research project for converting SystemVerilog to Verilog and heard that people here might be interested. If you're interested, the repo is at https://github.com/zachjs/sv2v
<zachjs>
mithro: I saw that project a bit ago, but found that it was limited in scope. For example, it doesn't appear to work on interfaces. Also, I found that verilator (which it relies on) has some odd limitations, for example, with the sizing of members in struct literals.
<zachjs>
The output also isn't fully Verilog-2005 compliant, still producing logics.
<zachjs>
hackerfoo: Thanks for the link, I'll check it out!
<sorear>
difficult for me to get enthusiastic over something that will add more build steps and more opportunities to have error messages with zero correlation to the source
wpwrak has quit [Read error: Connection reset by peer]
ZombieChicken has quit [Remote host closed the connection]
ZombieChicken has joined ##openfpga
GuzTech has quit [Remote host closed the connection]
<zachjs>
sorear: That's a great point. The larger goal of the project is to enable the synthesis of SystemVerilog without using commercial tools, rather than to completely cut out commercial tools from the development process.
wpwrak has joined ##openfpga
<zachjs>
The Verilog specification has a preprocessor directive which can indicate what source file/line generated output came from. yosys already supports that directive. I'd be very interested to hear how outputting such directives, or maybe some other methods, might make debugging the output of the tool easier.
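As a rough illustration of the directive mentioned above: Verilog-2005 defines it as `` `line <number> "<filename>" <level> ``, where a level of 0 means the position is neither entering nor leaving an include file. The sketch below shows how a converter might interleave such directives with generated code. It is not how sv2v works; the function name, the chunk format, and the file name `cpu.sv` are all made up for illustration.

```python
def emit_with_line_directives(chunks):
    """Prefix each generated chunk with a `line directive naming its origin.

    chunks: iterable of (source_file, source_line, verilog_text) tuples,
    a hypothetical format chosen for this sketch.
    """
    out = []
    for src_file, src_line, text in chunks:
        # Level 0: not entering or leaving an include file.
        out.append(f'`line {src_line} "{src_file}" 0')
        out.append(text)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    generated = [
        ("cpu.sv", 12, "wire [31:0] pc_next;"),
        ("cpu.sv", 40, "assign pc_next = pc + 32'd4;"),
    ]
    print(emit_with_line_directives(generated), end="")
```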
<whitequark>
that directive is not particularly useful
<whitequark>
sure, it gives you the rough indication, but you still have to read generated source
Dolu has quit [Quit: Leaving]
mumptai_ has quit [Quit: Verlassend]
Dolu1990 has joined ##openfpga
<zachjs>
whitequark: I'd be eager to try out anything that could make that process easier. The output isn't too rough, though I'm sure it could be refined. We were able to get a RISC-V core converted and booted without much issue.
<whitequark>
zachjs: you could use the (* loc *) attributes
<whitequark>
in principle that would be as good as using yosys' verilog or sv frontend
<zachjs>
whitequark: Thanks for the heads up! I'll be sure to look into that.
knielsen has quit [Ping timeout: 250 seconds]
Laksen has quit [Quit: Leaving]
Dolu1990 has quit [Quit: Leaving]
lutsabound has quit [Quit: Connection closed for inactivity]