#asahi-gpu on 2021-03-18 — irc logs at freenode.irclog.whitequark.org

2021-01-11 09:46 marcan changed the topic of #asahi-gpu to: Asahi Linux: porting Linux to Apple Silicon macs | GPU / 3D graphics stack black-box RE and development (NO binary reversing) | Keep things on topic | GitHub: https://alx.sh/g | Wiki: https://alx.sh/w | Logs: https://alx.sh/l/asahi-gpu

02:53 JusticeEX has joined #asahi-gpu

03:36 phiologe has quit [Ping timeout: 244 seconds]

03:37 phiologe has joined #asahi-gpu

04:10 Baughn has quit [Ping timeout: 260 seconds]

04:11 Baughn has joined #asahi-gpu

04:38 odmir has quit [Remote host closed the connection]

04:41 linuxgemini has quit [Read error: Connection reset by peer]

04:51 Necrosporus is now known as Guest13098

04:51 Guest13098 has quit [Killed (weber.freenode.net (Nickname regained by services))]

04:51 Necrosporus has joined #asahi-gpu

05:28 <dougall> chrisf: sure - i'll try to fill it out sometime this week (in the mean time, it's really rattling around in the emulator, if that's of any use https://github.com/dougallj/applegpu/blob/30845306a9b685def1a580586dc0f08d008aee37/applegpu.py#L4042-L4220 )

07:19 * morelightning[m] < https://matrix.org/_matrix/media/r0/download/matrix.org/wVpJxlgaoXwogAMCyBqHpbxg/message.txt >

07:41 linuxgemini has joined #asahi-gpu

08:20 <jix_> morelightning[m]: FYI, if you write multi-line messages in matrix, they get bridged to IRC as a https link to a .txt file, so if you mention someone in them they might not see that

08:23 <morelightning[m]> oh, is there a better software I should use? (It's been over a decade since I've used this stuff)

08:25 <jix_> using matrix is fine IMO, it's just good to be aware of how the bridge handles multi line messages... it's suitable for posting code or logs etc, but not so much for conversations

08:26 jix_ is now known as jix

08:29 <jix> I'm using both matrix and irc, but for irc channels I usually connect directly, but that requires a persistent connection if you want to see all messages in channels that are not publicly logged (this one is though, see topic)... tastes in preferred clients vary widely and discussions about that usually lead to long off-topic discussions so I'd rather not go there

09:12 JusticeEX has quit [Ping timeout: 256 seconds]

10:10 linkmauve has joined #asahi-gpu

11:29 jobbe has quit [Quit: The Lounge - https://thelounge.chat]

11:58 jobbe has joined #asahi-gpu

12:40 jobbe has quit [Quit: The Lounge - https://thelounge.chat]

13:01 g3rp has joined #asahi-gpu

13:25 clayfreeman has joined #asahi-gpu

14:02 g3rp has quit [Quit: leaving]

14:05 JusticeEX has joined #asahi-gpu

14:09 zkrx has quit [Ping timeout: 264 seconds]

14:14 <bloom> https://xkcd.com/1782

14:23 zkrx has joined #asahi-gpu

14:25 <morelightning[m]> Is there any insight yet into bypassing the window compositor or how that connects? My initial assumption would be that Metal's "present drawable" does a composite with a basic shader, and then it just passes the frame buffer along at some point... maybe that's around CoreDisplay?

14:33 <bloom> No, I very intentionally have prioritized off-screen rendering since CoreGraphics would be a massive r/e effort with 0 payoff for Linux.

14:47 <morelightning[m]> Well, what I mean is, on linux, we'll have to route the frame buffer to the display somehow - is that part of the hardware addressing identified already?

14:50 opticron has quit [Ping timeout: 240 seconds]

14:51 TheJollyRoger has quit [Ping timeout: 268 seconds]

14:51 opticron has joined #asahi-gpu

14:52 TheJollyRoger has joined #asahi-gpu

14:53 <morelightning[m]> My assumption was that if we haven't identified those yet, I might find them defined inside CoreDisplay or close to it.

14:56 <bloom> That's not the GPU, that's the display controller.

14:56 <bloom> marcan: ^

14:56 <sven> the frame buffer is pre-configured by iboot for us as some buffer in main memory. we'll eventually have to figure out how the display controller works, but as long as we can render to some buffer from the gpu it'll work for now

14:58 <sven> and ideally we'll figure out the display controller by observing MMIO reads/writes once marcan implements a mini-hypervisor in m1n1

15:16 <marcan> display controllers are boring, it's just a pile of mode parameters and stuff like that

15:16 <marcan> observe mmio writes, tweak things, figure out what they mean

15:17 <marcan> it's a bunch of work to write the driver but all the annoying details are going to be around things like DisplayPort link training, HDMI InfoFrames, and all that "fun" stuff

15:18 <marcan> it should be pretty much completely divorced from the GPU, and since Linux has a very mature DRM/KMS framework for handling all that stuff, which is going to be very different from how macOS does things (most likely), there isn't really any reason to try to figure out how macOS does things there

15:18 <chrisf> marcan: do we expect the apple display controller to be small like desktop, or a mobile-style behemoth?

15:18 <marcan> define mobile-style behemoth?

15:19 <chrisf> marcan: on socs designed for android the display controllers are often enormous -- tons of planes, etc

15:19 <marcan> oh, you mean compositing

15:19 <marcan> good question

15:19 <marcan> I don't really know; my gut feeling is Apple would use the GPU for that, but I guess we'll find out

15:20 <marcan> they definitely do basic scaling at least, since that's how their entire macOS DPI scaling is implemented

15:20 <marcan> (AIUI they only do a couple fixed DPI values, then scale for everything else)

15:30 <marcan> oh yeah, one thing that's going to be fun is memory bandwidth calculation stuff

15:30 <marcan> not looking forward to *that* one...

15:35 <morelightning[m]> The m1 mac window compositor is showing up as a Metal vertex/fragment shader in Instruments, just like the iPad. So I was supposing that all the information we need is just sitting defined inside the window server process.

15:35 <morelightning[m]> On the iPad, I vaguely remember something new in the last few years about needing to control the contrast range of the display - so I was assuming we'd run into that on the mac too.

15:35 <marcan> I don't think we should need *anything* about the compositor

15:35 <marcan> it's just more 3D

15:37 <marcan> OTOH, I just did a bit of `strings` recon on their display controller driver

15:39 <marcan> chrisf: looks like it's huge, but not because of compositing stuff... there's a whole mailbox protocol involved apparently, and some rather fancy stuff

15:39 <marcan> I get the feeling we can *probably* get away without a lot of it, since it's probably intended for mobile panels where there are a million variants, with weirdo OLED subpixel layouts and stuff like that

15:39 <marcan> the only OLED panel we have to deal with is the Touch Bar, but that can wait a bit

15:40 <marcan> (going to be funny extending x11/wayland screens into that; putting the Plasma taskbar on the Touch Bar, anyone? :p)

15:40 <marcan> it might support... 3 layers?

15:45 <marcan> what worries me is I think I see some firmware in here... I hope this is legacy

15:46 <marcan> there's a DCP that iBoot2 loads for us at least

15:49 <marcan> looks like DCP is probably the DP/HDMI encoder block, so I guess most of that is handled by firmware with some kind of mailbox to talk to it

15:50 <marcan> there's a whole 7MB of firmware for that, which is insane. I wonder what's in here.

15:50 <marcan> at least that probably means it does link training for us

15:59 <marcan> the DCP firmware is definitely also working with things at the framebuffer level, at least to some extent, so maybe in the end it'll boil down to a big mailbox thing?

15:59 <modwizcode> Would the plan be to extract that or were you saying that iBoot loads it for us?

15:59 <marcan> oh well, either way this is going to be fun

16:00 <marcan> modwizcode: for everything iBoot2 loads for us, they're just files next to iBoot2; all that stuff will have to be copied from the recovery partition for our "clean" installer

16:01 <modwizcode> ? not exactly clear what you mean there.

16:01 <modwizcode> It sounds like you're saying that iboot loads it but also that it doesn't?

16:01 <marcan> it does, but iBoot2 is part of the "OS" install, so we need to put it there if we're doing a from-scratch install

16:02 <marcan> along with all the firmware blobs it loads

16:02 <modwizcode> ohhhh

16:02 <modwizcode> I didn't realize where iBoot2 resides

16:03 <daniels> chrisf: if I had to guess, more complex than desktop, less complex than Qualcomm

16:03 <daniels> given their battery life figures, I have a hard time believing that they're using the GPU to composite every frame (at least when not doing fullscreen desktop)

16:04 <modwizcode> I'm surprised that more of that isn't done honestly

16:04 <marcan> they used a tiling architecture and the framebuffer is also tiled, so it would make sense for them to do a dirty tiles thing for updates, instead of compositing the entire frame

16:05 odmir has joined #asahi-gpu

16:05 <marcan> but so far I've seen mentions of 3 layers in here, so I don't think they delegate much compositing to the display controller

16:05 <modwizcode> even the OLPC people implemented a basic display controller level redraw handling thing, it's hardly cutting edge.

16:06 <modwizcode> for compositing tiles would be plenty

16:07 <sven> oh... that mailbox probably explains why the framebuffer only sometimes broke when i wrote different values to the area called "piodma" that is mapped after the framebuffer

16:07 <daniels> marcan: 3 is plenty: one for system UI, one for browser chrome, one for Netflix content

16:08 <daniels> realistically you don't need too many more if you do the HWC thing of using the GPU to flatten your static-ish content to a single buffer, then using the display controller for the more dynamic content

16:09 <marcan> makes sense

16:10 <bloom> ^ seconding what daniels said, given how optimized for power efficiency the rest of the Apple SoCs are, and how cheap an extra overlay plane is, this would be a really bizarre place to skimp on gates

16:10 <daniels> (not surprising that Apple actually did it properly, but it is sadly uncommon for GPU render targets and display controller fetch to share tiling formats ...)

16:11 <daniels> but yeah, enabling things like PSR is a seriously frustrating long-tail effort

16:11 <bloom> M1 has full support for tiled compressed framebuffers, documented absolutely nowhere

16:12 <bloom> Gotta hand it to Apple, even Arm and Qualcomm brag about AFBC/UBWC despite keeping the internals proprietary.

16:12 <bloom> Apple has managed to excise every mention of a compression scheme from public docs/source.

16:13 <modwizcode> ahh heh

16:13 <bloom> Admittedly I don't know what the display controller ingests, I just saw this in flight from app<-->compositor

16:14 <bloom> but again Apple is far too competent to not have their display controller ingest their compression scheme as well

16:14 <modwizcode> I mean if you're making both you might as well reuse your hdl c;

16:15 <marcan> bloom: there is plenty of mention of tiling/compression in these blobs

16:15 <marcan> so yes

16:16 <bloom> Better question is if the cameras support the compression scheme too

16:16 <bloom> (at least on the iPhones with fancy 4k@60..)

16:16 <marcan> ha :)

16:16 <marcan> good question

16:16 <modwizcode> don't they use off the shelf camera parts tho?

16:16 <marcan> there's a separate image processor, presumably for that stuff

16:17 <bloom> I know Arm has camera stuff feeding into AFBC but I don't know if it (or anything like it) ever shipped.

16:17 <bloom> Likewise, can the video decoder recompress to the native format on the fly?

16:17 <bloom> (IIRC Arm has this too, also not shipping afaik)

16:19 <bloom> 4k on mobile is rather painful without compression, so

16:19 <daniels> Arm never got their media-codec IP in any shipping SoC TTBOMK, but they will license AFBC to you if you want to do that

16:19 <daniels> (I've also not seen it in the wild)

16:20 <bloom> Alas

16:20 <daniels> but given that your phone doesn't have a 4K display, rather than compressing 4K output, the secret is to just ... not have it

16:21 <daniels> keep an internal camera -> ISP -> codec pipeline which spits out a downscaled YUV/RGB frame for display, and then the codec goes straight to H.264/HEVC/whatever, ideally without touching sysmem

16:22 <daniels> (yes it's a lot of out-of-band storage, but cheaper in power than keeping DDR lit up full)

16:24 <marcan> I actually can't find the video decoder firmware, and now I wonder if it's in the SEP. because DRM.

16:26 <daniels> marcan: presumably it's a stateless impl -> very possibly no firmware?

16:26 <daniels> just a big old userspace blob

16:26 <marcan> could be

16:27 <daniels> any idea if it's in-house or licensed?

16:28 <marcan> I should start listing these engines on a wiki page somewhere. e.g. AOP is "hey siri" and other sensor stuff, PMP is power, ISP is the camera stuff, DCP is the displayport/display stuff, AVE is the encoder, ANE is neural engine

16:29 <modwizcode> There's already a glossary page, want me to make a separate one?

16:29 <marcan> yeah, there should be a separate page for these engines

16:29 <daniels> marcan: SEP -> Secure Enclave Processor? -> ... T2?

16:30 <marcan> T2 is not a thing any more on M1

16:30 <marcan> it's integrated

16:30 <daniels> as EL3?

16:30 <marcan> no, separate CPU

16:30 <marcan> there is no EL3

16:30 <marcan> these engines are all random little side CPUs

16:30 <marcan> they do use a TrustZone carveout though, to share DRAM with the SEP

16:31 <daniels> right sorry, I misread 'integrated' as 'on-CPU' rather than 'on-die'

16:31 <marcan> basically what happened is Apple took an old iPhone, put it into a laptop, stuck an Intel CPU on the side as a coprocessor and called it a "T2 Mac"

16:32 <modwizcode> worst page ever https://github.com/AsahiLinux/docs/wiki/Accelerator-Engines but here you go

16:32 <marcan> then when they had a good enough CPU, they just beefed that up and dropped the Intel

16:32 <marcan> and that's the M1

16:35 <modwizcode> I swear I had a better place where I'd seen some of these elements defined awhile ago but I have no idea where that was

16:37 <sven> i think i read AON = always-on processor somewhere. not sure if that was in some strings or some old iphone wiki though

16:38 <modwizcode> Hell I'll throw it in in some quotes

16:39 <sven> https://machinelearning.apple.com/research/hey-siri nope, it was apple's ML blog actually :-)

16:39 <sven> "To avoid running the main processor all day just to listen for the trigger phrase, the iPhone’s Always On Processor (AOP)"

16:41 <balrog> FWIW Apple brags about their camera ISP

16:41 <bloom> daniels: lol, true

16:43 <daniels> balrog: it's a strong differentiator for every SoC vendor, there's basically no commonality in ISP

16:43 <modwizcode> Okay cleaned up the page a bit

16:43 <balrog> modwizcode: also AMX

16:43 <modwizcode> Is that really a coprocessor though?

16:44 <modwizcode> I think it's coprocessor like but I thought it was part of the cores themselves?

16:44 <bloom> cocoprocessor

16:44 <modwizcode> kay

16:44 <bloom> ;P

16:44 <bloom> used for accelerating video decode of the award-winning Pixar film

16:44 <bloom> "-P

16:45 <modwizcode> "the award-winning Pixar film"

16:46 <bloom> did it not win any awards it should've :>

16:46 <modwizcode> I don't know what film we're talking about was all I meant

16:46 <bloom> https://en.wikipedia.org/wiki/Coco_(2017_film)#Accolades

16:46 <modwizcode> also isn't there built in video decoding? Why did they use AMX?

16:47 <bloom> they don't I was making a joke

16:47 <modwizcode> oh

16:47 <modwizcode> Sadly it didn't land with me

16:48 <bloom> that's normal, I just say things

16:48 <modwizcode> I mean I was mostly acknowledging that as a personal failing of myself c;

16:54 <bloom> definitely not, I'm just like this

16:58 <jn__> Coco!

17:04 jaalsa has quit [Read error: Connection reset by peer]

18:20 DarkShadow4444 has joined #asahi-gpu

18:40 AlJaMa has quit [Ping timeout: 264 seconds]

18:44 AlJaMa has joined #asahi-gpu

18:44 AlJaMa has quit [Changing host]

18:44 AlJaMa has joined #asahi-gpu

19:19 DarkShadow4444 has quit [Quit: Leaving]

19:59 g3rp has joined #asahi-gpu

20:03 taowa has quit [Quit: Gateway shutdown]

20:09 <bloom> Coco for Coco Puff!

20:19 AlJaMa has quit [Ping timeout: 264 seconds]

20:22 AlJaMa has joined #asahi-gpu

20:22 AlJaMa has quit [Changing host]

20:49 g3rp has quit [Quit: leaving]

20:50 g3rp has joined #asahi-gpu

21:14 g3rp has quit [Quit: Lost terminal]

21:26 Mary_ has quit [Read error: Connection reset by peer]

21:29 aquijoule__ has joined #asahi-gpu

21:30 aquijoule_ has quit [Remote host closed the connection]

21:30 Mary_ has joined #asahi-gpu

21:31 neobrain2 has quit [Ping timeout: 246 seconds]

21:31 neobrain has joined #asahi-gpu

21:50 crabbedhaloablut has quit [Ping timeout: 264 seconds]

21:50 crabbedhaloablut has joined #asahi-gpu

21:51 Raqbit has quit [Quit: Ping timeout (120 seconds)]

21:52 Raqbit has joined #asahi-gpu