alyssa changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://freenode.irclog.whitequark.org/panfrost - <daniels> avoiding X is a huge feature
stikonas has quit [Remote host closed the connection]
stikonas has joined #panfrost
italov has quit [Ping timeout: 264 seconds]
urjaman has quit [Ping timeout: 260 seconds]
urjaman has joined #panfrost
icecream95 has joined #panfrost
stikonas has quit [Ping timeout: 272 seconds]
raster has quit [Quit: Gettin' stinky!]
kaspter has joined #panfrost
icecream95 has quit [Quit: leaving]
italov has joined #panfrost
icecream95 has joined #panfrost
italov has quit [Ping timeout: 272 seconds]
italov has joined #panfrost
jernej has quit [Quit: Free ZNC ~ Powered by LunarBNC: https://LunarBNC.net]
jernej has joined #panfrost
_whitelogger has joined #panfrost
camus has joined #panfrost
kaspter has quit [Ping timeout: 256 seconds]
camus is now known as kaspter
camus has joined #panfrost
italov has quit [Ping timeout: 240 seconds]
kaspter has quit [Read error: Connection reset by peer]
camus is now known as kaspter
_whitelogger has joined #panfrost
archetech has joined #panfrost
davidlt has joined #panfrost
kaspter has quit [Ping timeout: 240 seconds]
camus has joined #panfrost
camus is now known as kaspter
archetech has quit [Quit: Konversation terminated!]
_whitelogger has joined #panfrost
unoccupied has quit [Quit: WeeChat 2.8]
_whitelogger has joined #panfrost
archetech has joined #panfrost
kaspter has quit [Ping timeout: 272 seconds]
kaspter has joined #panfrost
nlhowell has joined #panfrost
chewitt has quit [Quit: Adios!]
raster has joined #panfrost
archetech has quit [Quit: Textual IRC Client: www.textualapp.com]
davidlt has quit [Ping timeout: 256 seconds]
yann has joined #panfrost
kaspter has quit [Ping timeout: 272 seconds]
kaspter has joined #panfrost
rando25892 has quit [Ping timeout: 240 seconds]
icecream95 has quit [Ping timeout: 272 seconds]
rando25892 has joined #panfrost
chewitt has joined #panfrost
italov has joined #panfrost
davidlt has joined #panfrost
stikonas has joined #panfrost
<cyrozap> Hi, all, I'm doing some ISA reverse engineering (CPU and DSP, though, not GPU), and I've been having trouble with learning some concepts because I seem to lack the vocabulary to describe what I'm trying to learn about.
<HdkR> Hello cyrozap
<cyrozap> Oh, hey, HdkR.
<HdkR> What concept are you having a hard time with?
<cyrozap> Yeah, hang on, I'm still typing, lol
<HdkR> :)
<cyrozap> So, the ISA I'm currently working on is called MD32, and it's used in a lot of MediaTek SoCs for SCP-like tasks, sensor hub, video decode controller (it's not doing DSP, just bitstream parsing and controlling the hardware decoder), and even some cellular modem tasks.
<cyrozap> Interestingly, if you search for "MD32" or "MediaDSP", there's a MIPS-inspired CPU from a Chinese university by that same name, but I've read its ISA docs and its instruction encodings don't match up at all with what I'm seeing in real binaries. MediaTek's MD32 is also not the same ISA as their PCM ISA--the instructions don't match those, either.
<macc24> !
<HdkR> ah, I see your github repo readme
<HdkR> Reminds me of how Broadcom's subdevices operate on its wifi chip
davidlt has quit [Ping timeout: 272 seconds]
<HdkR> Which those were tied to some sort of standard backplane communication method and the code could select which device to communicate at any given moment. Also had something like 6 devices hanging off of it
<cyrozap> Anyways, so in one of the big (20GB+) "Android Code" archives released by Xunlong for their Orange Pi 4G-IoT board (MT6737-based), there's a compiler and full GNU binutils suite (without any source code, unfortunately, but MediaTek most likely didn't give it to Xunlong in the first place) for this MD32 CPU.
<cyrozap> HdkR: Ah, please note that the MD32 is _also_ not the Coresonic DSP. There's a _lot_ of random ISAs in these chips.
<HdkR> Yea, I understand that :)
<macc24> mediatek chips are cursed
<cyrozap> IMO, MediaTek chips are actually really nice because of how straightforward and open to modification things are, especially when compared to Qualcomm's chips.
<HdkR> https://bcm-v4.sipsolutions.net/Backplane/index.html Found the documentation that talks about their core backplane :D
<HdkR> 7-9 cores hanging off of it. Has things like MAC and PHY shenanigans
<HdkR> So similar concept :>
<cyrozap> So, to get to my actual question, rather than disassemble the MD32 as/objdump and try to figure out the instructions that way, I've decided to just "cleverly brute-force" the disassembler, to build my own mapping of opcodes to mnemonics. And so far what I've found is, there appear to be 3 different types of instructions: Pure 32-bit instructions, pure 16-bit instructions, and sort of a "fused" 32-bit
<cyrozap> instruction, where it contains two 16-bit instructions, but the first one (in the high bytes, big-endian) can't be decoded on its own.
<cyrozap> (apologies, the second comma in the first sentence should be a colon)
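A minimal sketch of the "fused" layout cyrozap describes, assuming only what is said above: a 4-byte instruction word stored big-endian, with the first 16-bit instruction in the high bytes. The function name `split_fused` is hypothetical, and nothing here reflects the real MD32 encoding beyond that byte order.

```python
import struct


def split_fused(word_bytes):
    """Split 4 big-endian bytes into (high, low) 16-bit halves.

    Assumption: the first 16-bit instruction occupies the high bytes,
    as described in the chat. No MD32 field layout is implied.
    """
    if len(word_bytes) != 4:
        raise ValueError("expected exactly 4 bytes")
    # ">HH" reads two unsigned 16-bit integers in big-endian order.
    high, low = struct.unpack(">HH", word_bytes)
    return high, low
```

Per the description, the high half may not decode on its own, so a disassembler would presumably need both halves together to classify the word.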
<HdkR> Clever
<cyrozap> The "cleverly" part is that I'm using a combination of putting random 32-bit words into the disassembler, flipping bits in instructions that have already been decoded in order to find "adjacent" instructions, and using Z3 to pick "random" instructions that don't also match the encoding for other instructions.
<cyrozap> So far I've found like 84 different opcodes.
<cyrozap> And that's just the 32-bit operations.
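The bit-flipping step can be sketched roughly as follows. This only covers generating "adjacent" candidate words from one that already decoded; feeding them to the disassembler and the Z3-based selection are not shown. The name `single_bit_neighbours` and the `width` parameter are hypothetical.

```python
def single_bit_neighbours(word, width=32):
    """Return every word that differs from `word` in exactly one bit.

    Illustrative only: each candidate would then be handed to the
    disassembler to see whether it decodes to a new mnemonic.
    """
    if not 0 <= word < (1 << width):
        raise ValueError("word does not fit in the given width")
    # XOR with a one-hot mask flips a single bit, lowest bit first.
    return [word ^ (1 << bit) for bit in range(width)]
```

A 32-bit word yields 32 candidates, so exploring around each known instruction is cheap compared with sampling the whole 2^32 space at random.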
camus has joined #panfrost
<cyrozap> Oh, so the actual question is, what is it called when you have "combined" instructions like that?
kaspter has quit [Ping timeout: 240 seconds]
camus is now known as kaspter
<cyrozap> I'll post an example of what I'm talking about.
<HdkR> A bundle is common in VLIW speak, but depends on who you ask. It can just be a pair, bonded instructions, or a name you think up :P
<cyrozap> Note the instructions that get decoded with two on the same line, with a pipe character separating them.
<cyrozap> HdkR: Ah, I see.
<HdkR> Common representation would be `{ \n <inst 1>, \n <inst 2> \n }`
<HdkR> Maybe with some tabs for alignment :P
<cyrozap> Of course, that then brings me to my next question: Why? Instructions seem to be aligned on 2-byte boundaries, and it already supports decoding 16-bit instructions on its own, so why have a third "bonded" instruction format? Lack of opcode space? And why encode two NOPs in the same 4-byte instruction when there's already a separate 4-byte NOP instruction?
<HdkR> could be the nops take different amounts of time, thus not really being used for a nop
<HdkR> This actually looks very similar to an ISA I also RE'd. It would do different nops based on pipeline timings
<HdkR> Also potentially something as basic as alignment and that's just want the compiler dumps out :P
<HdkR> s/want/what
<cyrozap> Alignment was my first guess, but there seems to be plenty of 4-byte instructions appearing at addresses divisible by 2 but not by 4.
<cyrozap> btw, these are the mnemonics and opcodes I've discovered so far (not final, still missing 16-bit and 16+16 instructions, and I need to re-do the "is this the same instruction" logic to take into account the argument format): https://paste.debian.net/hidden/6aff2409
rando25892 has quit [Ping timeout: 240 seconds]
rando25892 has joined #panfrost
rak-zero has quit [Quit: ZNC 1.7.5 - https://znc.in]
robertfoss has quit [Read error: Connection reset by peer]
nlhowell has quit [Ping timeout: 240 seconds]
davidlt has joined #panfrost
robertfoss has joined #panfrost
robmur01_ has joined #panfrost
robmur01 has quit [Ping timeout: 264 seconds]
archetech has joined #panfrost
<archetech> kde on ubu groovy and Arch now pretty decent
<archetech> still get these .......
<archetech> kernel: panfrost ffe40000.gpu: gpu sched timeout, js=0, config=0x7301, status=0x58, head=0x51f54c0, tail=0x51f54c0, sched>
<archetech> Nov 24 08:04:04 alarm kernel: panfrost ffe40000.gpu: js fault
robmur01_ is now known as robmur01
alpernebbi has joined #panfrost
<alyssa> cyrozap: Mali has VLIW encodings like that too, actually.
<alyssa> For us, it's that there are different units of the hardware -- on Bifrost, a heavyweight FMA unit and a lightweight ADD unit
<alyssa> And the operations supported by each unit vary a bit. Simple things like moves can run anywhere, but floating multiplies can only go on FMA, and by convention things like branches can only go on ADD.
<alyssa> Midgard is similar, but adds the twist of some of the units executing in parallel and some executing in series..
<alyssa> And some units being vector and some being scalar
<robmur01> Also sounds a bit like the quirk of the original Thumb ISA - BL (and later BLX too) actually consisted of a pair of separate instructions that were only valid to execute in sequence, but had individually-defined semantics and could be interrupted in the middle
<robmur01> it was much later with Thumb-2 that those pairs got officially redefined as single 32-bit encodings
rando25892 has quit [Ping timeout: 256 seconds]
rando25892 has joined #panfrost
<robmur01> but yeah, my guess from the shape of that code would be some kind of manual pipeline scheduling/delay slot type shenanigans
<daniels> still not as good as the quirk of an extension to the non-Thumb ISA which accidentally included an opcode called BXJ
<robmur01> daniels: hey, don't forget that Jazelle is still mandatory in Armv8 :P
<daniels> robmur01: I, er ...
<daniels> assuming that just immediately traps to the sw handler?
popolon has joined #panfrost
<robmur01> indeed it is also mandatory to *not* actually implement any opcodes :D
<robmur01> the one extension that was explicitly designed from the outset to have a limited lifespan and be phased out...
<daniels> now that's one Jazelle mandate I can get behind ...
italov has quit [Ping timeout: 256 seconds]
<alyssa> ☕
italov has joined #panfrost
kaspter has quit [Quit: kaspter]
<macc24> archetech: full dmesg?
<archetech> http://ix.io/2Fjm
<macc24> no, full dmesg
stikonas has quit [Remote host closed the connection]
stikonas has joined #panfrost
italov has quit [Ping timeout: 260 seconds]
<archetech> its enough
italov has joined #panfrost
italov has quit [Quit: Lost terminal]
<archetech> http://ix.io/2FjP ok see some more on this one
warpme_ has quit [Quit: Connection closed for inactivity]
stikonas has quit [Remote host closed the connection]
stikonas has joined #panfrost
yann has quit [Ping timeout: 264 seconds]
archetech has quit [Quit: Konversation terminated!]
kherbst has quit [Quit: duh 🐧]
archetech has joined #panfrost
* macc24 has discovered that #panfrost usbc -> headphone is over usb on duet
<macc24> (lol tab completion)
<alyssa> Hm?
<macc24> like, internally it's on usb, 17ef:7231, "Lenovo USB-C TO 3.2mm Adapter"
<macc24> or they fit a sound card into usbc plug
<alyssa> Ah. Maybe that means it'll actually work out-of-the-box then :p
* alyssa has a USB-C to 3.2mm cable for Kevin since the rockchip audio driver breaks every other kernel release
<macc24> i never noticed the breaking, it has always worked fine on my minnie
<alyssa> minnie is rk3288, kevin is rk3399
<macc24> yes
<macc24> i just had to sometimes unmute random stuff in pavucontrol
<macc24> so now only speakers and internal microphone don't work
yann has joined #panfrost
<alpernebbi> alyssa: sound on kevin should be fully working soon (TM)
<macc24> alpernebbi: here, have a ™: " ™ "
<alpernebbi> I've upstreamed the UCM file, but it's not released on Debian yet
<alpernebbi> and if you use pulseaudio you'll have to wait until the 14.1 release for everything to work
<alpernebbi> ™ short for thanks macc24
<alpernebbi> :D
yann has quit [Ping timeout: 272 seconds]
<macc24> by the way, how's uboot on kevin going?
<alpernebbi> still where it was :/
<alpernebbi> i don't think I can get to working on it in less than about two weeks
yann has joined #panfrost
<alyssa> alpernebbi: meanwhile here I am working on Bifrost stuff :p
<macc24> alyssa: which bifrost?
<alyssa> Bifrost bifrost.
<macc24> so, not g72?
<alyssa> Bifrost bifrost.
<macc24> hmm, so that bifrost.
<alyssa> It's all just Bifrost from my perspective.
<anarsoul> alyssa: are you implying that all bifrosts are equal? :)
<alyssa> anarsoul: For the compiler, most of the time yes
<alyssa> bbrezillon: shaders/unity/24-Tree.shader_test has the following "quirk"
<alyssa> intrinsic store_output (ssa_623, ssa_1) (3, 13, 0, 160, 162) /* base=3 */ /* wrmask=xzw */ /* component=0 */ /* src_type=float32 */ /* location=34 slots=1 *//* packed:xlv_TEXCOORD2.yz,xlv_TEXCOORD3.xy */
<alyssa> The wrmask of xzw doesn't work with how Bifrost models stores...
<alyssa> There is nir_lower_wrmask, but that would lower to two stores, which seems needlessly expensive
<alyssa> More to the point, it assumes base is per-component, instead of per-vector. So doesn't work on our hardware as-is.
<alyssa> It isn't obvious to me how the blob handles this
<alyssa> Ah! We can use lower_io_to_temporaries and then fill in the holes in the writemask since the holes are undefined.
<alyssa> ok, that was a ton of thinking for a 2 line change :p
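A rough illustration of "filling in the holes" on a component writemask, assuming bit 0 is x and bit 3 is w. This is not the Mesa change itself: `fill_wrmask_holes` is a hypothetical name, and filling only between the lowest and highest written component (rather than from x) is an assumption.

```python
def fill_wrmask_holes(mask):
    """Make a writemask contiguous by setting bits between its lowest
    and highest set bits.

    Assumption: bit 0 = x, bit 1 = y, bit 2 = z, bit 3 = w, and the
    skipped components hold undefined values, so writing them is harmless.
    """
    if mask == 0:
        return 0
    lowest = (mask & -mask).bit_length() - 1  # index of lowest set bit
    highest = mask.bit_length()               # one past highest set bit
    # All bits below `highest`, minus all bits below `lowest`.
    return ((1 << highest) - 1) & ~((1 << lowest) - 1)
```

With this convention the xzw mask from the shader above is 0b1101, and filling the hole at y gives 0b1111 (xyzw), which can be expressed as a single store.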
<alyssa> ok, it compiles. onwards :-)
yann|work has joined #panfrost
<alyssa> --Or not onwards. The whole gles2/gl2.1 set of shaders in my shader-db compile now.
<alyssa> and the gles3.0 set :)
<alyssa> Although probably the MRT ones are wrong
yann has quit [Ping timeout: 265 seconds]
<alyssa> ----Wait. Nope. I can't compile.
<alyssa> that was midgard, this is embarrassing
<alyssa> yeah, bifrost still has some crashing
<alyssa> shaders/tesseract/229.shader_test is our next problem shader
karolherbst has joined #panfrost
stikonas has quit [Remote host closed the connection]
stikonas has joined #panfrost
<alyssa> alright, handled
<alyssa> ok, *now* gles2 shaderdb finishes
<alyssa> This is good, both because we just fixed a bunch of bugs, and also because we now have a baseline shader-db available, so when we start optimizing things we can measure against a standard set
raster has quit [Quit: Gettin' stinky!]
yann|work has quit [Read error: No route to host]
yann|work has joined #panfrost
archetech has quit [Quit: Leaving]
yann|work has quit [Read error: Connection reset by peer]
yann|work has joined #panfrost
yann|work is now known as yann
stikonas has quit [Remote host closed the connection]
stikonas has joined #panfrost
icecream95 has joined #panfrost
davidlt has quit [Ping timeout: 240 seconds]
karolherbst has quit [Remote host closed the connection]
karolherbst has joined #panfrost
raster has joined #panfrost
alpernebbi has quit [Quit: alpernebbi]
karolherbst has quit [Ping timeout: 272 seconds]
Ntemis has joined #panfrost
<Ntemis> howdy
<Ntemis> tested rk3288 (miqi) mali-T764 and it is not there yet
alyssa has left #panfrost [#panfrost]
<Ntemis> [ 1.606326] panfrost ffa30000.gpu: clock rate = 400000000
<Ntemis> [ 1.606423] panfrost ffa30000.gpu: [drm:panfrost_devfreq_init] *ERROR* Couldn't set OPP regulators
karolherbst has joined #panfrost
<Ntemis> clock rate on oem 4.4.x kernel is 600mhz
<Ntemis> don't know if it is a kernel related or panfrost related issue
<Ntemis> am on 5.9.11
raster has quit [Quit: Gettin' stinky!]