#panfrost on 2019-08-20 — irc logs at freenode.irclog.whitequark.org

2019-02-15 17:52 alyssa changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - https://gitlab.freedesktop.org/panfrost - Logs https://freenode.irclog.whitequark.org/panfrost - <daniels> avoiding X is a huge feature

00:36 tgall_foo has joined #panfrost

01:05 vstehle has quit [Ping timeout: 268 seconds]

01:06 robink has joined #panfrost

01:41 <alyssa> Woah woah that was something about the hardware I didn't need to learn.

01:45 <urjaman> o.O?

01:46 <alyssa> urjaman: The polygon list BO has to be larger than reported in `polygon_list_size`

01:47 <urjaman> okay good, nothing exploded :P

01:47 <alyssa> :p

01:47 <alyssa> urjaman: How goes the Chromebooking and Panfrosting these days?

01:49 <urjaman> i guess i'm still kinda on a break ... will get back to it soon(TM) but have had both other things to attend to and also the kernel grind started feeling too much like work

01:49 <urjaman> i mean i am running the C201 obviously but havent touched the software in a while :P

01:49 <alyssa> Relatable :p

01:50 <alyssa> // Plist BO size 14E000

01:50 <alyssa> .polygon_list_size = 0x13fe00,

01:50 <alyssa> // body offset 20992

01:50 <alyssa> 0x14E000 != 0x13FE00

01:52 <urjaman> oh i was at Assembly 2019 (computer festival, demoparty, lan party, whatever) and my C201 photobombed in an "official" (by assembly photo people) picture of my 3D printer

01:52 <alyssa> :)

01:53 <urjaman> https://assembly.galleria.fi/kuvat/Assembly+Summer+2019/TORSTAI/ASMS2019-EMMIHALMELA-4247.jpg

01:53 <alyssa> Okay, 30 bytes per tile... seems arbitray

01:53 <alyssa> ---And not even right either hrmph

01:55 <urjaman> very arbi tray indeed

01:55 <alyssa> Never such a thing in hw..

01:56 <alyssa> Also it's possible the blob overallocates somewhat

01:58 <urjaman> that is a random amount to overallocate by ...

01:58 <alyssa> urjaman: I mean it might not have anything to do with the tile count

02:00 davidlt has quit [Ping timeout: 244 seconds]

02:11 _whitelogger has joined #panfrost

02:12 davidlt has joined #panfrost

03:14 _whitelogger has joined #panfrost

03:26 _whitelogger has joined #panfrost

03:34 megi has quit [Ping timeout: 246 seconds]

04:41 davidlt_ has joined #panfrost

04:44 davidlt has quit [Ping timeout: 272 seconds]

04:49 bshah has joined #panfrost

05:00 vstehle has joined #panfrost

05:06 davidlt__ has joined #panfrost

05:09 davidlt_ has quit [Ping timeout: 244 seconds]

05:23 davidlt__ has quit [Read error: Connection reset by peer]

05:23 davidlt_ has joined #panfrost

05:26 davidlt_ has quit [Remote host closed the connection]

05:26 davidlt__ has joined #panfrost

05:31 davidlt__ has quit [Read error: Connection reset by peer]

05:34 davidlt has joined #panfrost

06:06 krh has quit [Ping timeout: 248 seconds]

07:21 pH5 has quit [Quit: bye]

07:40 davidlt has quit [Ping timeout: 245 seconds]

07:40 anarsoul has quit [Remote host closed the connection]

07:41 anarsoul has joined #panfrost

07:43 pH5 has joined #panfrost

08:04 jernej has quit [Ping timeout: 264 seconds]

08:23 davidlt has joined #panfrost

08:29 davidlt has quit [Remote host closed the connection]

08:29 davidlt has joined #panfrost

08:29 <tomeu> Prf_Jakob: is there a profiling mode that can tell me which are the tests that take most of the time?

08:35 afaerber has quit [Quit: Leaving]

08:55 davidlt_ has joined #panfrost

08:55 davidlt has quit [Read error: Connection reset by peer]

09:04 megi has joined #panfrost

09:47 davidlt_ has quit [Ping timeout: 246 seconds]

09:48 davidlt has joined #panfrost

10:44 <tomeu> Prf_Jakob: also, how can I print regressions and improvements but not already-known failures?

10:54 <daniels> alyssa, urjaman: i don't think it is related to the tile count - 0x13fe00 is only aligned to 512 bytes, whereas 0x14e000 is aligned up to 4096 bytes

10:55 <daniels> which makes sense - no sense allocating BOs which aren't aligned to page size

10:55 <urjaman> yes, but it's not the next 4k

10:57 <urjaman> (that'd be at 0x140000, so there's at least 56k extra space)

10:58 <daniels> good point, and that would also be the next 64k boundary
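
The alignment arithmetic in this exchange can be checked with a few lines of Python. This is only a sketch of the numbers quoted above (`0x13fe00` and `0x14E000` come from the trace; the helper names `align_up` and `natural_alignment` are made up for illustration), and it says nothing about why the blob picks that size.

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


def natural_alignment(value: int) -> int:
    """Largest power of two that divides value (value must be nonzero)."""
    return value & -value


reported = 0x13FE00  # .polygon_list_size from the trace
bo_size = 0x14E000   # "Plist BO size" from the trace

print(natural_alignment(reported))           # 512
print(natural_alignment(bo_size))            # 8192, so also a multiple of 4096
print(hex(align_up(reported, 4096)))         # 0x140000, the next 4k boundary
print(hex(align_up(reported, 65536)))        # 0x140000, also the next 64k boundary
print(bo_size - align_up(reported, 4096))    # 57344 bytes, i.e. 56k of extra space
```

So page alignment alone accounts for 512 bytes of the gap; the remaining 56k still needs another explanation.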

11:02 <urjaman> but i suppose alyssa will figure it out, at least if there's more examples ... tho i have one maybe silly thought

11:03 <urjaman> does the blob perform its own memory allocation for these things? maybe it just padded upwards to the next thing instead of leaving a tiny hole

11:09 raster has joined #panfrost

12:08 raster has quit [Remote host closed the connection]

12:10 raster has joined #panfrost

12:26 davidlt has quit [Read error: Connection reset by peer]

12:27 davidlt has joined #panfrost

13:50 davidlt has quit [Read error: Connection reset by peer]

13:50 <shadeslayer> could someone shed some light on what CANARY means in a ralloc context?

13:50 davidlt_ has joined #panfrost

13:51 <alyssa> shadeslayer: https://en.wikipedia.org/wiki/Buffer_overflow_protection#Canaries

13:51 <cwabbott> shadeslayer: if you hit that assert, it probably means you passed a mem_ctx to a ralloc function that isn't actually a ralloc context or NULL

13:52 <cwabbott> or there's some kind of memory corruption

13:53 <cwabbott> it's a constant stored in every ralloc context and checked when you pass in a context

13:55 <shadeslayer> aha I see

13:55 <shadeslayer> thank you! :)

14:02 tgall_foo has quit [Read error: Connection reset by peer]

14:25 <daniels> right - generally with allocators, any kind of failure like that means that you're using memory after free, or have double-freed, or overflowed an allocation

14:27 <shadeslayer> what I'm confused by is the assert(info->canary == CANARY)

14:27 <daniels> shadeslayer: is this still the Plasma thing? if so, it's probably quicker to run it under valgrind than to spend two days trying to figure out what's going on without valgrind :P

14:27 <shadeslayer> so it will assert when the known value is found?

14:27 <daniels> err, other way around

14:27 <daniels> it will assert when the 'canary' member is not the static value CANARY

14:28 <shadeslayer> uhhh ...

14:28 <shadeslayer> assert(info->canary == CANARY);

14:28 <daniels> yeah

14:28 <daniels> if info->canary == CANARY, then it will continue

14:28 <daniels> if info->canary != CANARY, it will abort

14:28 <shadeslayer> oh right, of course

14:28 <shadeslayer> right, what I see in gdb :

14:28 <shadeslayer> (gdb) print info->canary

14:29 <shadeslayer> $1 = 5902598

14:29 <shadeslayer> which is int for the hex value

14:29 <daniels> usually you use them to implement a pretty weak form of memory-overrun detection - for instance, struct { char my_buffer[128]; uint32_t canary = 0xdeadbeef; }

14:29 <daniels> doing that, if you write past 128 bytes of my_buffer, you'll overwrite the 'canary

14:29 <shadeslayer> so the value seems correct, but it still asserts?

14:29 <daniels> ' member with whatever you were going to write
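
The canary idea daniels describes can be simulated in Python with a byte buffer. This is a rough sketch, not ralloc's code: the layout (128 buffer bytes followed by a 4-byte canary) follows the struct in the message above, the constant is the value gdb printed in this log (5902598, which is 0x5A1106 in hex), and the helper names are invented for the example.

```python
import struct

CANARY = 0x5A1106  # 5902598 decimal, the value gdb printed in the log
BUF_SIZE = 128     # size of my_buffer in the struct sketched above


def new_block() -> bytearray:
    """Lay out BUF_SIZE buffer bytes followed by a little-endian 32-bit canary."""
    return bytearray(BUF_SIZE) + bytearray(struct.pack("<I", CANARY))


def canary_intact(block: bytearray) -> bool:
    """Equivalent of assert(info->canary == CANARY), but returning a bool."""
    (value,) = struct.unpack_from("<I", block, BUF_SIZE)
    return value == CANARY


def unchecked_write(block: bytearray, data: bytes) -> None:
    """Copy data to the start of the block without a bounds check."""
    block[0:len(data)] = data


block = new_block()
unchecked_write(block, b"A" * 128)   # fits exactly, canary untouched
print(canary_intact(block))          # True
unchecked_write(block, b"A" * 130)   # two bytes too many, clobbers the canary
print(canary_intact(block))          # False
```

The check only notices damage when the canary itself is overwritten or the memory holding it is reused, which is why it is a fairly weak detector compared with running under valgrind.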

14:29 <daniels> is that what you see at the assert point?

14:29 <shadeslayer> daniels: yeah

14:30 <daniels> ... when the assert has failed?

14:30 <shadeslayer> yes

14:30 <shadeslayer> daniels: https://paste.ubuntu.com/p/gDtbQ4Y88z/

14:30 <shadeslayer> which is why I'm so confused :P

14:31 <daniels> shadeslayer: is your access to the ralloc ctx protected by a mutex?

14:33 <shadeslayer> daniels: not that I can see

14:33 <daniels> shadeslayer: that would explain the value changing from underneath you then!

14:35 <shadeslayer> daniels: oh maybe, one of the other threads ( drm_import_bo ) has a lock

14:35 <shadeslayer> https://paste.ubuntu.com/p/dZzMdVKZGN/

14:36 <daniels> shadeslayer: assuming https://gitlab.freedesktop.org/shadeslayer/mesa/commits/mir-for-each-master is the latest?

14:37 <shadeslayer> daniels: yep, that + https://paste.ubuntu.com/p/MPJr6ZnH4S/

14:37 <shadeslayer> which is just misc cleanup from having leftovers from my merge

14:40 davidlt_ has quit [Ping timeout: 245 seconds]

14:42 <alyssa> Okay, I've got my daily dose of Midgard compiler opt writing

14:42 <alyssa> https://gitlab.freedesktop.org/tomeu/mesa/commit/fd9110711d51eccd3c28b4e820c2052aae315316

14:42 <alyssa> Back to real work~

14:43 hopetech has quit [Ping timeout: 272 seconds]

14:43 hopetech has joined #panfrost

14:47 Stenzek has quit [Ping timeout: 264 seconds]

14:55 tlwoerner has quit [Ping timeout: 268 seconds]

14:59 <alyssa> daniels: The trick is to leak the canary value with a side channel and then include it when you overrun.

15:00 <shadeslayer> hah

15:02 jernej has joined #panfrost

15:03 tlwoerner has joined #panfrost

15:17 <shadeslayer> daniels: so I'm trying to also get valgrind setup btw

15:17 <shadeslayer> but apparently it's not happy with kwin_wayland valgrind plasmashell

15:19 <daniels> how about valgrind --follow-children=yes kwin_wayland plasmashell?

15:19 <daniels> or maybe it was --trace-children

15:21 <shadeslayer> daniels: Though do we really want to track kwin_wayland too?

15:21 <shadeslayer> seeing how the crash is triggered in plasmashell

15:21 <shadeslayer> ( I have it working now fwiw )

15:22 <daniels> working \o/

15:25 <alyssa> Weeeee

15:32 <shadeslayer> daniels: http://paste.ubuntu.com/p/VnHWgwJyTz/

15:32 <shadeslayer> well, I meant valgrind

15:37 <alyssa> I have given up pretending I know what I'm doing right now

15:39 <daniels> shadeslayer: well, there you have it - the block at line 846 tells you that we're trying to access a pan_job after it's been free()d

15:43 <daniels> seems like quite a mess inside panfrost_drm_force_flush_fragment()

15:43 <daniels> for some reason we flush and free the job, but then we later end up with that job still being current and being flushed again

15:43 <daniels> shadeslayer: which bits can I help you step through?

15:44 <alyssa> daniels: "seems like..." You sound surprised!

15:44 <shadeslayer> heh

15:44 <shadeslayer> daniels: so this is essentially 2 different threads trying to free the same job?

15:45 <shadeslayer> one of them being the blitting and the other one a eglswap?

15:47 <daniels> shadeslayer: that shouldn't be possible, since they're within the same EGLContext, and you cannot have the same context current in multiple threads

15:48 <daniels> shadeslayer: i would just start by printf tbh: every time you create a job, every time you assign a job to screen->last_job, every time you change screen->last_fragment_flushed (the condition which controls whether or not we try to free the job inside panfrost_drm_force_flush_fragment!), every time we free a job - print that out including the job pointer

15:48 <daniels> and then eventually unravel why it is that we free a job that we end up later trying to use
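
The tracing approach daniels suggests (log every create, assign, flush and free together with the job pointer, then look for a use that follows a free) can be sketched as a small Python model. Everything here is hypothetical: `JobTracer` and its methods are illustrative names, and job ids stand in for the pointers one would print from the driver.

```python
class JobTracer:
    """Toy model of a lifecycle log: records events and flags use-after-free."""

    def __init__(self):
        self.live = set()    # jobs created and not yet freed
        self.freed = set()   # jobs that have been freed
        self.log = []        # (event, job_id) in the order they happened

    def _record(self, event, job_id):
        self.log.append((event, job_id))
        print(f"{event:>6} job={job_id:#x}")

    def create(self, job_id):
        self.live.add(job_id)
        self.freed.discard(job_id)  # the allocator may hand the address out again
        self._record("create", job_id)

    def use(self, job_id):
        self._record("use", job_id)
        if job_id in self.freed:
            raise RuntimeError(f"use after free of job {job_id:#x}")

    def free(self, job_id):
        self._record("free", job_id)
        if job_id in self.freed:
            raise RuntimeError(f"double free of job {job_id:#x}")
        self.live.discard(job_id)
        self.freed.add(job_id)


tracer = JobTracer()
tracer.create(0x1000)
tracer.use(0x1000)
tracer.free(0x1000)
try:
    tracer.use(0x1000)   # the pattern valgrind reported: flushed again after free
except RuntimeError as err:
    print("caught:", err)
```

In the real driver the same information would come from printf lines; the point of the model is only that the offending sequence is easy to spot once every event carries the job pointer.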

15:52 <alyssa> daniels: What is the deal with multithreading in GL?

15:53 <daniels> alyssa: btw, you'd be surprised how unsurprised I can sound sometimes :P the more-numerous-by-the-week grey hairs in my beard didn't come a) from nowhere or b) without a well-practiced wary tone

15:53 jernej has quit [Remote host closed the connection]

15:54 <daniels> anyway, assuming 'what's the deal with ... ?' wasn't a Seinfeld-style lead-in, basically you can have multiple EGLContexts created from a single EGLDisplay, but each context can only be current in at most one thread at a time

15:54 <daniels> so if you have a per-FD BO cache, for example (which you do need), that needs to be mutexed because it's entirely possible for multiple contexts to be working simultaneously on the same device

15:55 <alyssa> Aaahh ok

15:55 <daniels> however, pretty much all the GL state and objects are context-local, so you don't have to e.g. mutex every single FBO access
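
The split daniels describes (shared per-device state needs a mutex, context-local state does not) might look like this in a Python model. It is a sketch under stated assumptions: `BOCache`, `context_worker` and `run_contexts` are hypothetical names, integer handles stand in for buffer objects, and each worker thread stands in for one context that is current in exactly one thread.

```python
import threading


class BOCache:
    """Hypothetical per-device cache shared by every context on that device."""

    def __init__(self):
        self._lock = threading.Lock()
        self._free = {}          # size -> list of free buffer handles
        self._next_handle = 1

    def acquire(self, size):
        with self._lock:         # several contexts may get here at once
            bucket = self._free.get(size)
            if bucket:
                return bucket.pop()
            handle = self._next_handle
            self._next_handle += 1
            return handle

    def release(self, size, handle):
        with self._lock:
            self._free.setdefault(size, []).append(handle)


def context_worker(cache, rounds, results, index):
    """One context: its own state is private, only the cache is shared."""
    local_state = []             # context-local, so no lock is needed
    for _ in range(rounds):
        handle = cache.acquire(4096)
        local_state.append(handle)
        cache.release(4096, handle)
    results[index] = len(local_state)


def run_contexts(n_threads=4, rounds=1000):
    cache = BOCache()
    results = [None] * n_threads
    threads = [
        threading.Thread(target=context_worker, args=(cache, rounds, results, i))
        for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cache, results


cache, results = run_contexts()
print(results)
```

Without the lock, two contexts could pop the same handle from the free list, which is the kind of shared-state race the mutex is there to prevent.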

15:56 pH5 has quit [Quit: bye]

15:56 davidlt_ has joined #panfrost

15:56 jernej has joined #panfrost

15:59 <shadeslayer> daniels: re printf, lovely

16:00 <daniels> shadeslayer: (and from there you can start to drill through the call tree and surrounding context to find out _why_ those jobs are being allocated/assigned/flushed/freed when they are, and thus where the logic error is that leads us to be using a job we've already freed)

16:02 <shadeslayer> roger, I'll try to spend some time on this, though I'm on vacation starting tomorrow

16:02 * alyssa tries to figure out unk0 encoding

16:02 <alyssa> It's very nearly linear

16:02 <alyssa> I'm guessing they have some alignments thrown in or something

16:02 <shadeslayer> so might really only get to it by mid next week

16:03 <daniels> shadeslayer: oh, so you are - we can pick it up next week then

16:05 <shadeslayer> aye

16:06 <shadeslayer> daniels: alyssa do you reckon these BO Cache + MIR patches can be merged?

16:06 raster has quit [Remote host closed the connection]

16:06 <shadeslayer> maybe I can finish polishing them up, since this use after free is a separate issue?

16:07 <shadeslayer> or do you reckon it's better to fix it all up?

16:07 <daniels> well, if the MIR-iterator patch is positively reviewed then we could definitely stick that in

16:07 <daniels> but is the pan_job UAF definitely not related to the BO cache?

16:07 raster has joined #panfrost

16:10 <shadeslayer> daniels: I mean ... it probably is, just wasn't sure if that would block things since without that BO Cache you'd still hit similar issues with importing bo's

16:13 <alyssa> daniels: Science (TM)

16:13 <alyssa> https://people.collabora.com/~alyssa/Figure_1.png

16:14 <alyssa> (Actually the X axis is off-by-one, and the Y should be in hex to be legible, but meh)

16:14 <daniels> shadeslayer: tbh I think it's best to just leave the BO cache until we at least understand what the failure is

16:14 <shadeslayer> daniels: ack

16:14 <daniels> but pushing Boris's MIR iterator patches sounds like a good idea

16:14 <daniels> alyssa: ^?

16:15 <alyssa> daniels: Pretty sure I r-b'd it

16:15 <alyssa> And if I didn't I'm pretty sure I had a reason not to

16:15 <daniels> heh

16:15 <daniels> shadeslayer doesn't have commit rights tho, and Boris is on holiday this week

16:16 <alyssa> daniels: Oh, yeah. I was waiting on a v2 from Boris.

16:16 <daniels> (i love the sublinear behaviour around multiples of four! beautiful)

16:16 <alyssa> (and the v2 would include a list.h change so I couldn't push myself anyway, would need a review)

16:16 davidlt_ is now known as davidlt

16:16 <daniels> fair enough

16:17 <alyssa> (Aside: matplotlib is stupidly easy to use. Like I already had my data in my Python notebook... it was just a few lines later to get a pretty graph)

16:18 <daniels> 'notebook' like jupyter, or?

16:18 <alyssa> notebook like vim, random paper, and a pencil

16:19 <alyssa> notebook like vim, random paper, and a pencil

16:19 <alyssa> oops

16:20 <alyssa> So the nice thing about the sublinear behaviour is that I can compute forward differences to get something more legible:

16:21 <alyssa> er, I guess quasi-linear? piecewise linear?

16:21 <alyssa> [64, 64, 64, 32, 64, 64, 64, 32, 65, 64, 64, 32]

16:21 <alyssa> And so I can pretty easily fit a curve to the forward differences --

16:21 <alyssa> clearly (32 if last of 4, 64 otherwise) | (1 if ??? else 0)

16:22 <alyssa> And then manipulate it that way

16:22 <alyssa> Although I *strongly* suspect the graph we're looking at is from rounding an unaligned product

16:23 <alyssa> That's alright. We'll see soon enough.

16:23 <alyssa> s/|/^/

16:23 <alyssa> Er, not even.... hm

16:23 <alyssa> Okay, so if we take the table and & ~1, we can ignore that bit for now
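
The forward-difference trick, including the `& ~1` masking, can be reproduced in Python. The table below is not the real data (the log only gives the differences), so it is rebuilt from those differences starting at a hypothetical base of 0; the function name is also just for illustration.

```python
from itertools import accumulate


def forward_differences(values):
    """Return values[i + 1] - values[i] for each adjacent pair."""
    return [b - a for a, b in zip(values, values[1:])]


# Differences quoted in the log; note the single 65.
diffs = [64, 64, 64, 32, 64, 64, 64, 32, 65, 64, 64, 32]

# Hypothetical table rebuilt from the differences, assuming a base value of 0.
table = list(accumulate(diffs, initial=0))

# Clearing the low bit of every entry hides the stray odd/even bit.
masked = [value & ~1 for value in table]

print(forward_differences(table))    # the differences from the log, 65 included
print(forward_differences(masked))   # a clean 64, 64, 64, 32 repeating pattern
```

Once the low bit is out of the way, the pattern of three steps of 64 followed by one step of 32 is all that remains to be explained.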

16:25 <alyssa> Oh hum huh

16:25 <alyssa> Alright. Have stuff in Python, now onto the notebook :p

16:28 <alyssa> daniels: Re the "unaligned rounding" --- just as a visualization exercise, imagine zooming in on an oblique line drawn with Bresenham's and no antialiasing, and then drawing smooth lines between connected pixels

16:28 <alyssa> It would look something like that graph, yeah?

16:29 <daniels> heh

16:29 <alyssa> Granted I'm not much of a visual person so I'm not sure why I'm doing it this way but you know :p

16:29 <alyssa> I don't usually use graphs; it's nice to change things up sometimes :)

16:34 tgall_foo has joined #panfrost

16:38 pH5 has joined #panfrost

16:41 unoccupied has quit [Ping timeout: 268 seconds]

16:54 <alyssa> 2 pages of maths (by hand) later and I have a nice closed form expression ... I think ...

16:54 * alyssa is going to have to try symbolic computing for RE at some point

16:54 raster has quit [Remote host closed the connection]

16:57 raster has joined #panfrost

16:57 <alyssa> Okay, yes, my closed form expression is correct (Pythonified it)

16:57 <alyssa> (T0 + 32*l + (32*3)*math.floor(l / 4) + 32 * (l % 4))

16:57 <alyssa> Next up is simplifying that since we have two goals

16:57 <alyssa> #1 is being able to compute the quantity (and that expression is fine for it)

16:58 <alyssa> #2 is figuring out _why_ that works, which is usually the more interesting part

16:58 <alyssa> By having #1 done first with a nice expression like that, it may be possible to unlock the s3cr3ts with just some algebra

17:02 <alyssa> Noting that (l mod 4) = l - 4 floor (l / 4),

17:02 <alyssa> you can get a much nicer expression (which I should have been able to do from the get-go but I've been too buried in abstraction to see it)

17:02 <alyssa> (T0 + 64*l - 32*math.floor(l / 4))

17:03 <alyssa> Now, the final leap will be, I think..... let's see..

17:06 <alyssa> Yes, there, pull the 32 out and then push the 2*l into the floor (which flips it to a ceil, since -floor(x) = ceil(-x))

17:06 <alyssa> (T0 + 32*math.ceil(7*l / 4))

17:07 <alyssa> daniels: ^^^ Like I predicted. Linear with a non-integral slope.
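
The algebra can be checked mechanically. The sketch below compares the original closed form, the `64*l - 32*floor(l / 4)` form, and the version folded into a single rounding. One detail worth spelling out: because -floor(x) = ceil(-x), folding `2*l` into `-floor(l / 4)` yields a ceiling, `ceil(7*l / 4)`; a plain `floor(7*l / 4)` would already disagree at l = 1 (32 instead of 64). `T0` is not given in the log, so it defaults to a hypothetical 0 here.

```python
import math


def expr_original(l, t0=0):
    """First closed form from the log."""
    return t0 + 32 * l + (32 * 3) * math.floor(l / 4) + 32 * (l % 4)


def expr_simplified(l, t0=0):
    """After substituting l % 4 = l - 4*floor(l / 4)."""
    return t0 + 64 * l - 32 * (l // 4)


def expr_folded(l, t0=0):
    """Single rounding: 32 * ceil(7*l / 4), using integer arithmetic only."""
    return t0 + 32 * -((-7 * l) // 4)


for l in range(64):
    assert expr_original(l) == expr_simplified(l) == expr_folded(l)

steps = [expr_folded(l + 1) - expr_folded(l) for l in range(8)]
print(steps)  # [64, 64, 64, 32, 64, 64, 64, 32]
```

The step pattern matches the forward differences quoted earlier, which is what a line of slope 7/4 (in units of 32) looks like after rounding.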

17:07 <alyssa> (If anyone wants me to scan my 3 pages of brain slop for their entertainment, lmk :P)

17:09 <alyssa> Case closed, then? Well.. no

17:10 <alyssa> We still have that bottom odd/even thing to deal with

17:12 <alyssa> And this was all based on having an array on the stack

17:12 <alyssa> What if I had actual register spilling...? Is that different?

17:18 <alyssa> Register spilling is definitely different and not for any obvious reason.

17:20 <alyssa> register spilling is something more like:

17:21 <alyssa> 0x1E4 + (bytes spilled / [big constant])

17:22 <alyssa> I'm trying to understand the essential difference between register spilling and local arrays

17:22 <alyssa> Both are scratch

17:40 unoccupied has joined #panfrost

17:50 raster has quit [Remote host closed the connection]

18:01 stikonas has joined #panfrost

18:06 raster has joined #panfrost

18:07 raster has quit [Remote host closed the connection]

18:18 * alyssa moves onto other things

18:18 <alyssa> There's only so much Hard Thinking (TM) I can do at once,

18:18 <alyssa> back to just making pandecode work better.

18:36 <alyssa> So here is a magic field I need to sort out pretty bad

18:37 <alyssa> Sometimes zero, sometimes 0xFFFF

18:37 <alyssa> On glmark, -bbuild/texture/shading, setting to 0xFFFF is fine but setting to zero causes... half of the polygons to disappear?

18:37 <alyssa> Like every other triangle is missing

18:38 <alyssa> But whether that effect happens depends on...... something..

18:39 <alyssa> Setting to 0x8888 as an example seems ok

18:40 <alyssa> This field is smushed between offset_bias_correction and index_count so there's that

18:40 <alyssa> Some detail of indexing I guess

18:44 <alyssa> With the blob I'm seeing it set for wallpapering

18:44 <alyssa> But it's not at all obvious why this is wallpapering at all

18:45 <alyssa> Oh, it's not wallpapering, it's using the 3D pipe to blit a texture from linear to AFBC

18:46 <alyssa> (It's good to know the blob does that on the 3D pipe rather than cheating in software)

18:52 <alyssa> What's funny is that the blob doesn't set zero1 for -bbuild

18:57 * alyssa tries to do bit-identical glmark -bbuild

18:58 <alyssa> Of course now I'm just fighting scoreboarding

19:05 <alyssa> Oh, that bug is *subtle*

19:08 <alyssa> Fixed, onwards I guess

19:15 <alyssa> Okay.

19:15 <alyssa> draw_flags

19:15 <alyssa> on the TILER

19:15 <alyssa> megablink

19:21 <alyssa> oh you've got to be kidding me

19:21 <alyssa> (0x3000, 0xFFFF)

19:21 <alyssa> (0x18000, 0x0)

19:21 <alyssa> Those unknown_draw/zero1 values have to be paired

19:21 <alyssa> But I have no idea what *either* field means.

20:36 stikonas has quit [Ping timeout: 245 seconds]

20:37 stikonas has joined #panfrost

20:59 pH5 has quit [Quit: -_-]

21:06 * alyssa has been busy making attribute_meta printing not bad

21:21 <alyssa> https://people.collabora.com/~alyssa/better.txt

21:22 <alyssa> ^ If y'all don't find that adorable, I dunno what to tell ya!

21:31 * alyssa would like to give textures/framebuffers similar treatment

21:37 <alyssa> As an aside, -bbuild's trace is down to <370 lines, so we're definitely making progress in that respect

21:45 stikonas has quit [Remote host closed the connection]

22:33 _whitelogger has joined #panfrost

22:39 _whitelogger has joined #panfrost

23:39 _whitelogger has joined #panfrost

23:49 unoccupied has quit [Ping timeout: 245 seconds]

23:49 unoccupied has joined #panfrost

23:56 sravn has quit [Remote host closed the connection]