alyssa changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - https://gitlab.freedesktop.org/panfrost - Logs https://freenode.irclog.whitequark.org/panfrost - <daniels> avoiding X is a huge feature
tgall_foo has joined #panfrost
vstehle has quit [Ping timeout: 268 seconds]
robink has joined #panfrost
<alyssa> Woah woah that was something about the hardware I didn't need to learn.
<urjaman> o.O?
<alyssa> urjaman: The polygon list BO has to be larger than reported in `polygon_list_size`
<urjaman> okay good, nothing exploded :P
<alyssa> :p
<alyssa> urjaman: How goes the Chromebooking and Panfrosting these days?
<urjaman> i guess i'm still kinda on a break ... will get back to it soon(TM) but have had both other things to attend to and also the kernel grind started feeling too much like work
<urjaman> i mean i am running the C201 obviously but havent touched the software in a while :P
<alyssa> Relatable :p
<alyssa> // Plist BO size 14E000
<alyssa> .polygon_list_size = 0x13fe00,
<alyssa> // body offset 20992
<alyssa> 0x14E000 != 0x13FE00
<urjaman> oh i was at Assembly 2019 (computer festival, demoparty, lan party, whatever) and my C201 photobombed in an "official" (by assembly photo people) picture of my 3D printer
<alyssa> :)
<alyssa> Okay, 30 bytes per tile... seems arbitray
<alyssa> ---And not even right either hrmph
<urjaman> very arbi tray indeed
<alyssa> Never such a thing in hw..
<alyssa> Also it's possible the blob overallocates somewhat
<urjaman> that is a random amount to overallocate by ...
<alyssa> urjaman: I mean it might not have anything to do with the tile count
davidlt has quit [Ping timeout: 244 seconds]
_whitelogger has joined #panfrost
davidlt has joined #panfrost
megi has quit [Ping timeout: 246 seconds]
davidlt_ has joined #panfrost
davidlt has quit [Ping timeout: 272 seconds]
bshah has joined #panfrost
vstehle has joined #panfrost
davidlt__ has joined #panfrost
davidlt_ has quit [Ping timeout: 244 seconds]
davidlt__ has quit [Read error: Connection reset by peer]
davidlt_ has joined #panfrost
davidlt_ has quit [Remote host closed the connection]
davidlt__ has joined #panfrost
davidlt__ has quit [Read error: Connection reset by peer]
davidlt has joined #panfrost
krh has quit [Ping timeout: 248 seconds]
pH5 has quit [Quit: bye]
davidlt has quit [Ping timeout: 245 seconds]
anarsoul has quit [Remote host closed the connection]
anarsoul has joined #panfrost
pH5 has joined #panfrost
jernej has quit [Ping timeout: 264 seconds]
davidlt has joined #panfrost
davidlt has quit [Remote host closed the connection]
davidlt has joined #panfrost
<tomeu> Prf_Jakob: is there a profiling mode that can tell me which are the tests that take most of the time?
afaerber has quit [Quit: Leaving]
davidlt_ has joined #panfrost
davidlt has quit [Read error: Connection reset by peer]
megi has joined #panfrost
davidlt_ has quit [Ping timeout: 246 seconds]
davidlt has joined #panfrost
<tomeu> Prf_Jakob: also, how can I print regressions and improvements but not already-known failures?
<daniels> alyssa, urjaman: i don't think it is related to the tile count - 0x13fe00 is only aligned to 512 bytes, whereas 0x14e000 is aligned up to 4096 bytes
<daniels> which makes sense - no sense allocating BOs which aren't aligned to page size
<urjaman> yes, but it's not the next 4k
<urjaman> (that'd be at 0x140000, so there's at least 56k extra space)
<daniels> good point, and that would also be the next 64k boundary
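The alignment arithmetic in this exchange can be checked with a short sketch. The two sizes come from the trace quoted earlier; the helper names `alignment` and `align_up` are made up for illustration and are not part of any Panfrost code.

```python
PAGE = 4096  # assumed page size, as in the discussion


def alignment(value: int) -> int:
    """Largest power of two that divides value (its lowest set bit)."""
    return value & -value


def align_up(value: int, boundary: int) -> int:
    """Round value up to the next multiple of a power-of-two boundary."""
    return (value + boundary - 1) & ~(boundary - 1)


reported = 0x13FE00   # .polygon_list_size from the trace
allocated = 0x14E000  # size of the polygon list BO

next_page = align_up(reported, PAGE)

print(hex(alignment(reported)))          # 0x200: only 512-byte aligned
print(hex(alignment(allocated)))         # 0x2000: page aligned (8 KiB, in fact)
print(hex(next_page))                    # 0x140000: next 4k boundary
print(next_page % 0x10000 == 0)          # True: also a 64k boundary
print((allocated - next_page) // 1024)   # 56: KiB beyond the next page
```

So page alignment alone would have stopped at 0x140000; the remaining 56 KiB still needs another explanation.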
<urjaman> but i suppose alyssa will figure it out, at least if there's more examples ... tho i have one maybe silly thought
<urjaman> does the blob perform its own memory allocation for these things? maybe it just padded upwards to the next thing instead of leaving a tiny hole
raster has joined #panfrost
raster has quit [Remote host closed the connection]
raster has joined #panfrost
davidlt has quit [Read error: Connection reset by peer]
davidlt has joined #panfrost
davidlt has quit [Read error: Connection reset by peer]
<shadeslayer> could someone shed some light on what CANARY means in a ralloc context?
davidlt_ has joined #panfrost
<cwabbott> shadeslayer: if you hit that assert, it probably means you passed a mem_ctx to a ralloc function that isn't actually a ralloc context or NULL
<cwabbott> or there's some kind of memory corruption
<cwabbott> it's a constant stored in every ralloc context and checked when you pass in a context
<shadeslayer> aha I see
<shadeslayer> thank you! :)
tgall_foo has quit [Read error: Connection reset by peer]
<daniels> right - generally with allocators, any kind of failure like that means that you're using memory after free, or have double-freed, or overflowed an allocation
<shadeslayer> what I'm confused by is the assert(info->canary == CANARY)
<daniels> shadeslayer: is this still the Plasma thing? if so, it's probably quicker to run it under valgrind than to spend two days trying to figure out what's going on without valgrind :P
<shadeslayer> so it will assert when the known value is found?
<daniels> err, other way around
<daniels> it will assert when the 'canary' member is not the static value CANARY
<shadeslayer> uhhh ...
<shadeslayer> assert(info->canary == CANARY);
<daniels> yeah
<daniels> if info->canary == CANARY, then it will continue
<daniels> if info->canary != CANARY, it will abort
<shadeslayer> oh right, of course
<shadeslayer> right, what I see in gdb :
<shadeslayer> (gdb) print info->canary
<shadeslayer> $1 = 5902598
<shadeslayer> which is the decimal (int) form of the hex value
<daniels> usually you use them to implement a pretty weak form of memory-overrun detection - for instance, struct { char my_buffer[128]; uint32_t canary = 0xdeadbeef; }
<daniels> doing that, if you write past 128 bytes of my_buffer, you'll overwrite the 'canary' member with whatever you were going to write
<shadeslayer> so the value seems correct, but it still asserts?
<daniels> is that what you see at the assert point?
<shadeslayer> daniels: yeah
<daniels> ... when the assert has failed?
<shadeslayer> yes
<shadeslayer> which is why I'm so confused :P
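The canary idea daniels describes can be simulated in a few lines. This is only a sketch: it packs the `struct { char my_buffer[128]; uint32_t canary; }` layout into a `bytearray`, and it assumes the constant is `0x5A1106` (which is what the decimal value printed in gdb converts to). The helpers `new_block`, `canary_ok` and `unchecked_write` are hypothetical names, not ralloc functions.

```python
import struct

CANARY = 0x5A1106                 # assumed constant; equals 5902598 in decimal
LAYOUT = struct.Struct("<128sI")  # char my_buffer[128]; uint32_t canary;


def new_block() -> bytearray:
    """Allocate a zeroed buffer followed by a valid canary."""
    return bytearray(LAYOUT.pack(b"", CANARY))


def canary_ok(block: bytearray) -> bool:
    """The check behind assert(info->canary == CANARY)."""
    _, canary = LAYOUT.unpack(block)
    return canary == CANARY


def unchecked_write(block: bytearray, data: bytes) -> None:
    """Copy data to the start of the block with no bounds check, like memcpy."""
    block[0:len(data)] = data


block = new_block()
unchecked_write(block, b"A" * 128)   # fits exactly inside my_buffer
print(canary_ok(block))              # True: the assert would pass

unchecked_write(block, b"A" * 130)   # two bytes spill into the canary
print(canary_ok(block))              # False: the assert would abort

print(hex(5902598))                  # 0x5a1106
```

As in the conversation, the assert continues when the stored value equals the constant and aborts otherwise, so seeing the correct value at a failed assert suggests the memory changed between the check and the inspection.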
<daniels> shadeslayer: is your access to the ralloc ctx protected by a mutex?
<shadeslayer> daniels: not that I can see
<daniels> shadeslayer: that would explain the value changing from underneath you then!
<shadeslayer> daniels: oh maybe, one of the other threads ( drm_import_bo ) has a lock
<shadeslayer> daniels: yep, that + https://paste.ubuntu.com/p/MPJr6ZnH4S/
<shadeslayer> which is just misc cleanup from having leftovers from my merge
davidlt_ has quit [Ping timeout: 245 seconds]
<alyssa> Okay, I've got my daily dose of Midgard compiler opt writing
<alyssa> Back to real work~
hopetech has quit [Ping timeout: 272 seconds]
hopetech has joined #panfrost
Stenzek has quit [Ping timeout: 264 seconds]
tlwoerner has quit [Ping timeout: 268 seconds]
<alyssa> daniels: The trick is to leak the canary value with a side channel and then include it when you overrun.
<shadeslayer> hah
jernej has joined #panfrost
tlwoerner has joined #panfrost
<shadeslayer> daniels: so I'm trying to also get valgrind setup btw
<shadeslayer> but apparently it's not happy with kwin_wayland valgrind plasmashell
<daniels> how about valgrind --follow-children=yes kwin_wayland plasmashell?
<daniels> or maybe it was --trace-children=yes
<shadeslayer> daniels: Though do we really want to track kwin_wayland too?
<shadeslayer> seeing how the crash is triggered in plasmashell
<shadeslayer> ( I have it working now fwiw )
<daniels> working \o/
<alyssa> Weeeee
<shadeslayer> well, I meant valgrind
<alyssa> I have given up pretending I know what I'm doing right now
<daniels> shadeslayer: well, there you have it - the block at line 846 tells you that we're trying to access a pan_job after it's been free()d
<daniels> seems like quite a mess inside panfrost_drm_force_flush_fragment()
<daniels> for some reason we flush and free the job, but then we later end up with that job still being current and being flushed again
<daniels> shadeslayer: which bits can I help you step through?
<alyssa> daniels: "seems like..." You sound surprised!
<shadeslayer> heh
<shadeslayer> daniels: so this is essentially 2 different threads trying to free the same job?
<shadeslayer> one of them being the blitting and the other one a eglswap?
<daniels> shadeslayer: that shouldn't be possible, since they're within the same EGLContext, and you cannot have the same context current in multiple threads
<daniels> shadeslayer: i would just start by printf tbh: every time you create a job, every time you assign a job to screen->last_job, every time you change screen->last_fragment_flushed (the condition which controls whether or not we try to free the job inside panfrost_drm_force_flush_fragment!), every time we free a job - print that out including the job pointer
<daniels> and then eventually unravel why it is that we free a job that we end up later trying to use
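Once every create, assign, flush and free is printed with its job pointer, the resulting trace can be scanned mechanically. The sketch below assumes the output has already been parsed into `(operation, pointer)` pairs; the operation names and the function `find_use_after_free` are invented for illustration and do not correspond to real Panfrost log output.

```python
def find_use_after_free(events):
    """Return (index, op, job) for each event touching an already-freed job."""
    freed = set()
    problems = []
    for index, (op, job) in enumerate(events):
        if op == "create":
            freed.discard(job)       # the allocator may reuse an address
        elif job in freed:
            problems.append((index, op, job))  # use after free or double free
        elif op == "free":
            freed.add(job)
    return problems


# Hypothetical trace: a job is flushed and freed, then flushed again.
trace = [
    ("create", 0x1000),
    ("set_last_job", 0x1000),
    ("flush", 0x1000),
    ("free", 0x1000),
    ("flush", 0x1000),
]

for index, op, job in find_use_after_free(trace):
    print(f"event {index}: {op} on freed job {job:#x}")
```

The index of the first reported event points at the place in the log to start reading backwards from.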
<alyssa> daniels: What is the deal with multithreading in GL?
<daniels> alyssa: btw, you'd be surprised how unsurprised I can sound sometimes :P the more-numerous-by-the-week grey hairs in my beard didn't come a) from nowhere or b) without a well-practiced wary tone
jernej has quit [Remote host closed the connection]
<daniels> anyway, assuming 'what's the deal with ... ?' wasn't a Seinfeld-style lead-in, basically you can have multiple EGLContexts created from a single EGLDisplay, but each context can only be current in at most one thread simultaneously
<daniels> so if you have a per-FD BO cache, for example (which you do need), that needs to be mutexed because it's entirely possible for multiple contexts to be working simultaneously on the same device
<alyssa> Aaahh ok
<daniels> however, pretty much all the GL state and objects are context-local, so you don't have to e.g. mutex every single FBO access
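The split daniels draws (shared per-device state needs a lock, context-local state does not) can be sketched as follows. `BOCache`, its methods and the tuple "handles" are all hypothetical stand-ins; the real driver is C and its cache looks different.

```python
import threading


class BOCache:
    """Per-device cache of idle buffers, shared by every context on the device."""

    def __init__(self):
        self._lock = threading.Lock()
        self._free = {}  # size -> list of idle buffer handles

    def put(self, size, handle):
        """Return a buffer to the cache."""
        with self._lock:
            self._free.setdefault(size, []).append(handle)

    def take(self, size):
        """Fetch an idle buffer of this size, or None if the cache has none."""
        with self._lock:
            bucket = self._free.get(size)
            return bucket.pop() if bucket else None


def context_thread(cache, context_id, count):
    """Stand-in for one context releasing buffers on its own thread."""
    for i in range(count):
        cache.put(4096, (context_id, i))


cache = BOCache()
threads = [
    threading.Thread(target=context_thread, args=(cache, c, 1000))
    for c in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Only the cache takes the lock; anything owned by a single context would be touched by one thread at a time and can stay unlocked.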
pH5 has quit [Quit: bye]
davidlt_ has joined #panfrost
jernej has joined #panfrost
<shadeslayer> daniels: re printf, lovely
<daniels> shadeslayer: (and from there you can start to drill through the call tree and surrounding context to find out _why_ those jobs are being allocated/assigned/flushed/freed when they are, and thus where the logic error is that leads us to be using a job we've already freed)
<shadeslayer> roger, I'll try to spend some time on this, though I'm on vacation starting tomorrow
* alyssa tries to figure out unk0 encoding
<alyssa> It's very nearly linear
<alyssa> I'm guessing they have some alignments thrown in or something
<shadeslayer> so might really only get to it by mid next week
<daniels> shadeslayer: oh, so you are - we can pick it up next week then
<shadeslayer> aye
<shadeslayer> daniels: alyssa do you reckon these BO Cache + MIR patches can be merged?
raster has quit [Remote host closed the connection]
<shadeslayer> maybe I can finish polishing them up, since this use after free is a separate issue?
<shadeslayer> or do you reckon it's better to fix it all up?
<daniels> well, if the MIR-iterator patch is positively reviewed then we could definitely stick that in
<daniels> but is the pan_job UAF definitely not related to the BO cache?
raster has joined #panfrost
<shadeslayer> daniels: I mean ... it probably is, just wasn't sure if that would block things since without that BO Cache you'd still hit similar issues with importing bo's
<alyssa> daniels: Science (TM)
<alyssa> (Actually the X axis is off-by-one, and the Y should be in hex to be legible, but meh)
<daniels> shadeslayer: tbh I think it's best to just leave the BO cache until we at least understand what the failure is
<shadeslayer> daniels: ack
<daniels> but pushing Boris's MIR iterator patches sounds like a good idea
<daniels> alyssa: ^?
<alyssa> daniels: Pretty sure I r-b'd it
<alyssa> And if I didn't I'm pretty sure I had a reason not to
<daniels> heh
<daniels> shadeslayer doesn't have commit rights tho, and Boris is on holiday this week
<alyssa> daniels: Oh, yeah. I was waiting on a v2 from Boris.
<daniels> (i love the sublinear behaviour around multiples of four! beautiful)
<alyssa> (and the v2 would include a list.h change so I couldn't push myself anyway, would need a review)
davidlt_ is now known as davidlt
<daniels> fair enough
<alyssa> (Aside: matplotlib is stupidly easy to use. Like I already had my data in my Python notebook... it was just a few lines later to get a pretty graph)
<daniels> 'notebook' like jupyter, or?
<alyssa> notebook like vim, random paper, and a pencil
<alyssa> So the nice thing about the sublinear behaviour is that I can compute forward differences to get something more legible:
<alyssa> er, I guess quasi-linear? piecewise linear?
<alyssa> [64, 64, 64, 32, 64, 64, 64, 32, 65, 64, 64, 32]
<alyssa> And so I can pretty easily fit a curve to the forward differences --
<alyssa> clearly (32 if last of 4, 64 otherwise) | (1 if ??? else 0)
<alyssa> And then manipulate it that way
<alyssa> Although I *strongly* suspect the graph we're looking at is from rounding an unaligned product
<alyssa> That's alright. We'll see soon enough.
<alyssa> s/|/^/
<alyssa> Er, not even.... hm
<alyssa> Okay, so if we take the table and & ~1, we can ignore that bit for now
<alyssa> Oh hum huh
<alyssa> Alright. Have stuff in Python, now onto the notebook :p
<alyssa> daniels: Re the "unaligned rounding" --- just as a visualization exercise, imagine zooming in on an oblique line drawn with Bresenham's and no antialiasing, and then drawing smooth lines between connected pixels
<alyssa> It would look something like that graph, yeah?
<daniels> heh
<alyssa> Granted I'm not much of a visual person so I'm not sure why I'm doing it this way but you know :p
<alyssa> I don't usually use graphs; it's nice to change things up sometimes :)
tgall_foo has joined #panfrost
pH5 has joined #panfrost
unoccupied has quit [Ping timeout: 268 seconds]
<alyssa> 2 pages of maths (by hand) later and I have a nice closed form expression ... I think ...
* alyssa is going to have to try symbolic computing for RE at some point
raster has quit [Remote host closed the connection]
raster has joined #panfrost
<alyssa> Okay, yes, my closed form expression is correct (Pythonified it)
<alyssa> (T0 + 32*l + (32*3)*math.floor(l / 4) + 32 * (l % 4))
<alyssa> Next up is simplifying that since we have two goals
<alyssa> #1 is being able to compute the quantity (and that expression is fine for it)
<alyssa> #2 is figuring out _why_ that works, which is usually the more interesting part
<alyssa> By having #1 done first with a nice expression like that, it may be possible to unlock the s3cr3ts with just some algebra
<alyssa> Noting that (l mod 4) = l - 4 floor (l / 4),
<alyssa> you can get a much nicer expression (which I should have been able to do from the get-go but I've been too buried in abstraction to see it)
<alyssa> (T0 + 64*l - 32*math.floor(l / 4))
<alyssa> Now, the final leap will be, I think..... let's see..
<alyssa> Yes, there, pull the 32 out and then push the 2*l into the rounding (subtracting a floor becomes a ceil)
<alyssa> (T0 + 32*math.ceil(7*l / 4))
<alyssa> daniels: ^^^ Like I predicted. Linear with a non-integral slope.
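The algebra in this derivation is easy to check numerically. In the sketch below `T0 = 0` is a placeholder, since the log does not give the real base value, and integer division stands in for `math.floor`. It confirms that the first two expressions agree, and that pulling out the 32 gives a ceiling, because 2*l - floor(l/4) = ceil(7*l/4).

```python
T0 = 0  # placeholder base value; the real T0 is not given in the log


def long_form(l):
    return T0 + 32 * l + (32 * 3) * (l // 4) + 32 * (l % 4)


def short_form(l):
    return T0 + 64 * l - 32 * (l // 4)


def slope_form(l):
    # ceil(7*l / 4) written with integer arithmetic
    return T0 + 32 * -(-7 * l // 4)


print(all(long_form(l) == short_form(l) == slope_form(l) for l in range(256)))
# True

values = [short_form(l) for l in range(13)]
print([b - a for a, b in zip(values, values[1:])])
# [64, 64, 64, 32, 64, 64, 64, 32, 64, 64, 64, 32]
```

The forward differences match the observed table apart from the stray low bit (the single 65). Using a floor in place of the ceiling yields the same staircase with l shifted by one and T0 by 32, so the slope of 7/4 is the same either way.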
<alyssa> (If anyone wants me to scan my 3 pages of brain slop for their entertainment, lmk :P)
<alyssa> Case closed, then? Well.. no
<alyssa> We still have that bottom odd/even thing to deal with
<alyssa> And this was all based on having an array on the stack
<alyssa> What if I had actual register spilling...? Is that different?
<alyssa> Register spilling is definitely different and not for any obvious reason.
<alyssa> register spilling is something more like:
<alyssa> 0x1E4 + (bytes spilled / [big constant])
<alyssa> I'm trying to understand the essential difference between register spilling and local arrays
<alyssa> Both are scratch
unoccupied has joined #panfrost
raster has quit [Remote host closed the connection]
stikonas has joined #panfrost
raster has joined #panfrost
raster has quit [Remote host closed the connection]
* alyssa moves onto other things
<alyssa> There's only so much Hard Thinking (TM) I can do at once,
<alyssa> back to just making pandecode work better.
<alyssa> So here is a magic field I need to sort out pretty bad
<alyssa> Sometimes zero, sometimes 0xFFFF
<alyssa> On glmark, -bbuild/texture/shading, setting to 0xFFFF is fine but setting to zero causes... half of the polygons to disappear?
<alyssa> Like every other triangle is missing
<alyssa> But whether that effect happens depends on...... something..
<alyssa> Setting to 0x8888 as an example seems ok
<alyssa> This field is smushed between offset_bias_correction and index_count so there's that
<alyssa> Some detail of indexing I guess
<alyssa> With the blob I'm seeing it set for wallpapering
<alyssa> But it's not at all obvious why this is wallpapering at all
<alyssa> Oh, it's not wallpapering, it's using the 3D pipe to blit a texture from linear to AFBC
<alyssa> (It's good to know the blob does that on the 3D pipe rather than cheating in software)
<alyssa> What's funny is that the blob doesn't set zero1 for -bbuild
* alyssa tries to do bit-identical glmark -bbuild
<alyssa> Of course now I'm just fighting scoreboarding
<alyssa> Oh, that bug is *subtle*
<alyssa> Fixed, onwards I guess
<alyssa> Okay.
<alyssa> draw_flags
<alyssa> on the TILER
<alyssa> megablink
<alyssa> oh you've got to be kidding me
<alyssa> (0x3000, 0xFFFF)
<alyssa> (0x18000, 0x0)
<alyssa> Those unknown_draw/zero1 values have to be paired
<alyssa> But I have no idea what *either* field means.
stikonas has quit [Ping timeout: 245 seconds]
stikonas has joined #panfrost
pH5 has quit [Quit: -_-]
* alyssa has been busy making attribute_meta printing not bad
<alyssa> ^ If y'all don't find that adorable, I dunno what to tell ya!
* alyssa would like to give textures/framebuffers similar treatment
<alyssa> As an aside, -bbuild's trace is down to <370 lines, so we're definitely making progress in that respect
stikonas has quit [Remote host closed the connection]
_whitelogger has joined #panfrost
unoccupied has quit [Ping timeout: 245 seconds]
unoccupied has joined #panfrost
sravn has quit [Remote host closed the connection]