<rellla>
if i change that to 80, i get less lines and the test passes. 128 results in a blue square for both, result and reference picture, and 763 displays the right lines at the beginning of the draw but then sth goes wrong.
<rellla>
so imho it's some buffer/memory related issue...
maccraft123 has joined #lima
maccraft123 has quit [Quit: WeeChat 2.6]
monstr has quit [Remote host closed the connection]
<rellla>
heyo, i think i found sth related.
<rellla>
info->count always arrives at lima_draw with a MAX of 129. the above test uses a count of 763, which might be why it breaks.
<anarsoul>
rellla: issue should be fixed in lima, not in deqp
<rellla>
yeah
<rellla>
i'll look into that later. at least i know, that the issue is somehow related to nr of vertices
<rellla>
it seems lima is not able to draw more than 125 vertices correctly in one draw.
<anarsoul>
that's probably for line strips
<anarsoul>
not sure if it's correct though since enunes' branch actually fixes glmark2 -b refract which has more than 65k vertices
<anarsoul>
maybe we under-allocate some buffer?
megi has quit [Ping timeout: 250 seconds]
yann|work has joined #lima
rembrandt83 has joined #lima
<rembrandt83>
Finally got my laptop from repair, power button was replaced.
<anarsoul>
rembrandt83: great
<rembrandt83>
So I looked at proper lookup table indexing from bitfields, this is one of the most sophisticated problems to me so far.
<rembrandt83>
how to remap roots and logarithms to make them work with low latency
<rembrandt83>
the tables seem easy to me, and probably one can index into them with filtering units or texture mapping units, but i haven't got much experience and knowhow on vertex shaders.
<rembrandt83>
so you have some bitfield combination with filtering some power of two out from the modulus operation, now you have a base and remainder of that operation, this should be the index, how to clamp it into correct register is the problem.
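A minimal sketch of the base/remainder split described above, assuming the divisor is a power of two so that the modulus reduces to a bit mask. The names `split_t`, `split_pow2` and `clamp_index` are hypothetical and exist only for illustration; this is plain C on the CPU, not Mali vertex shader code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: quotient and remainder of x / 2^k. */
typedef struct {
    uint32_t base; /* x >> k             */
    uint32_t rem;  /* x & (2^k - 1)      */
} split_t;

/* Split x by the power-of-two divisor 2^k using a shift and a mask. */
static split_t split_pow2(uint32_t x, unsigned k)
{
    assert(k < 32); /* shifting a 32-bit value by 32 or more is undefined */
    split_t s;
    s.base = x >> k;
    s.rem = x & ((UINT32_C(1) << k) - 1u);
    return s;
}

/* Clamp an index into the valid range [0, size - 1] of a lookup table. */
static uint32_t clamp_index(uint32_t idx, uint32_t size)
{
    assert(size > 0);
    return idx < size ? idx : size - 1u;
}
```

For example, `split_pow2(37, 3)` yields base 4 and remainder 5, and either field can then be passed through `clamp_index` before it is used as a table index.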
<rembrandt83>
the mirror odd even filtering mode seems good, but is there something similar in hw for vertex programs too, if texture lookup is missing on the vertex shader?
<rembrandt83>
glPointSize has some hw clamping stuff, but i have not yet identified how that works
<rembrandt83>
so the issue is, the coord normalization methods are quite often used as divisions and logarithms and roots probably
<rembrandt83>
you may have several twos complement cache buffers to get the info from, i am not sure how i redirect it cheaply to the correct one
megi has joined #lima
<rembrandt83>
transform_feedback isn't kinda there for mali 400mp i assume, or is there a chance? some describe something like vertex texture fetch (VTF). without PBOs this is going to put the passthrough fragment program texture info back to CPU and then into VBOs
<rembrandt83>
what i think either some intelligent rounding instruction is needed on vertex shaders or every divisor of power of two needs to have a priority encoder to the constant cache for instance, to fetch the magic number from there, since there is 32powers then 32 buffers
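A rough sketch of the priority-encoder idea above, assuming one precomputed constant per power of two (32 entries for a 32-bit divisor). The encoder is emulated in software with a loop, and `priority_encode` and `recip_pow2` are hypothetical names; nothing here reflects how the Mali-400 hardware actually works.

```c
#include <assert.h>
#include <stdint.h>

/* Software priority encoder: bit position (0..31) of the highest set bit.
 * x must be nonzero. */
static unsigned priority_encode(uint32_t x)
{
    assert(x != 0);
    unsigned pos = 0;
    while (x >>= 1)
        pos++;
    return pos;
}

/* Look up the "magic number" 1 / divisor for a power-of-two divisor.
 * The table holds 2^-i for i = 0..31 and is filled on first use. */
static float recip_pow2(uint32_t divisor)
{
    static float table[32];
    static int ready = 0;

    /* Only exact powers of two are supported in this sketch. */
    assert(divisor != 0 && (divisor & (divisor - 1u)) == 0);

    if (!ready) {
        float v = 1.0f;
        for (unsigned i = 0; i < 32; i++) {
            table[i] = v;
            v *= 0.5f;
        }
        ready = 1;
    }
    return table[priority_encode(divisor)];
}
```

With this layout, a division by 8 becomes a multiplication by `recip_pow2(8)`, that is, by 0.125.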
<rembrandt83>
anyhow all this is very complex. i am not entirely in woods with this one, having read literature in stacks too
<rembrandt83>
but i do not like that fast memory as data cache is wasted on page memory while they can be iteratively pinned to store precomputed values of long latency ops
<rembrandt83>
page tables are not very intelligent to waste cache on
<rembrandt83>
most those theorems are bit overkill to me too though, highly hard to follow especially if i can not rely on very good math skills, which i never had.
<rembrandt83>
so after having cracked a single one like lagrange quadratic interpolation with resilient reading and following, and it feels like a very unoptimized one, all the time is wasted since there are literally gazillions of theorems on the net
<rembrandt83>
the data path is always going to be coarse grain sort of, but hw makes a lot of 32bit 4byte cache entries and regs available to be used, so it isn't much of a problem to waste them on priority encoder
<rembrandt83>
since accessing those coarse grain regs will be always very fast on short pipeline mode
maccraft123 has joined #lima
<rembrandt83>
effective address calculator has enough components but it all is pretty perverse to think about those solutions every day with no possibility to negotiate with anyone to cooperate
<rembrandt83>
doing it alone is pretty mad, but this is all i got at the moment. injuries took my out of other events a long time ago already, i think i will branch out doing all my own anyways
<rembrandt83>
my/me
rembrandt83 has quit [Quit: CGI:IRC (EOF)]
maccraft123 has quit [Quit: sleep]
cwabbott has quit [Remote host closed the connection]