#linux-amlogic on 2018-04-08 — irc logs at freenode.irclog.whitequark.org

2018-03-27 08:09 narmstrong changed the topic of #linux-amlogic to: Amlogic mainline kernel development discussion - our wiki http://linux-meson.com/ - ml linux-amlogic@lists.infradead.org - Publicly Logged on https://irclog.whitequark.org/linux-amlogic

00:26 sputnik_ has quit [Remote host closed the connection]

00:26 sputnik_ has joined #linux-amlogic

00:59 default__ has joined #linux-amlogic

01:01 sputnik_ has quit [Read error: Connection reset by peer]

01:01 sputnik_ has joined #linux-amlogic

01:03 cthugha has quit [Ping timeout: 264 seconds]

01:49 chewitt has quit [Max SendQ exceeded]

01:51 chewitt has joined #linux-amlogic

04:03 chewitt_ has joined #linux-amlogic

04:05 chewitt has quit [Ping timeout: 268 seconds]

04:14 chewitt_ has quit [Ping timeout: 260 seconds]

05:12 BlueMatt has quit [Ping timeout: 240 seconds]

05:12 BlueMatt has joined #linux-amlogic

05:36 Elpaulo_m has quit [Ping timeout: 240 seconds]

05:37 Elpaulo_m has joined #linux-amlogic

06:52 jakogut has joined #linux-amlogic

06:54 jakogut has quit [Client Quit]

07:10 <narmstrong> Elpaulo: s905 and s905x are the same concerning the Mali

07:10 sputnik__ has joined #linux-amlogic

07:11 sputnik_ has quit [Ping timeout: 265 seconds]

07:16 chewitt has joined #linux-amlogic

07:49 trem has joined #linux-amlogic

07:49 <Elpaulo> narmstrong: Did you solve the gbm problem? https://paste.ee/p/N6T57

09:25 tingoose has joined #linux-amlogic

09:25 tingoose_ has joined #linux-amlogic

09:35 focus has joined #linux-amlogic

09:48 afaerber has joined #linux-amlogic

10:08 Elpaulo_m has quit [Ping timeout: 240 seconds]

10:11 Elpaulo_m has joined #linux-amlogic

10:18 Elpaulo_m has quit [Ping timeout: 260 seconds]

10:25 Elpaulo_m has joined #linux-amlogic

10:47 Elpaulo_m has quit [Ping timeout: 256 seconds]

11:38 Ivanovic has quit [Ping timeout: 265 seconds]

11:51 focus_ has joined #linux-amlogic

12:01 Ivanovic has joined #linux-amlogic

12:04 afaerber has quit [Remote host closed the connection]

12:06 afaerber has joined #linux-amlogic

13:07 AUser has quit [Ping timeout: 240 seconds]

13:09 AUser has joined #linux-amlogic

13:14 <ndufresne> Ely, I've looked up the colorimetry issue, and the answer is that gst currently fails if the driver returns 0 as colorspace

13:14 <ndufresne> but it also raised some larger issues in gst/v4l2 colorimetry

13:14 <ndufresne> Our main issue in gst is that v4l2 does not let us enumerate the colorimetry in an efficient way

13:15 <ndufresne> I'll probably just disable colorimetry enumeration and colorimetry matching from now on, I'll leave the code that sets the colorimetry on the driver when available, and simply warn if it mismatches

13:16 <Ely> ndufresne: Thanks for the investigation!

13:16 <Ely> I do the exact same thing as venus, does that mean they'll have the same problem too ?

13:16 <Ely> Basically colorspace is initialized at 0, then it's set by s_fmt, then we return the same one

13:16 <Ely> I'm guessing you're not setting it all the time or something ?

13:17 <ndufresne> no, venus does copy the colorimetry from S_FMT(OUTPUT) to G_FMT(CAPTURE), if you do, it does not work

13:17 <Ely> I already do this in my driver Oo. Wait let me check just in case.

13:17 <ndufresne> but at 4K there is a special case for colorimetry in gst, because we pick a transfer function for BT709 that is extended for 12 bit precision

13:18 <ndufresne> Ely, when I call G_FMT after passing the sps/pps/keyframe, the colorimetry is 0

13:18 <ndufresne> on venus it would be whatever I have passed to S_FMT(OUTPUT)

13:18 <ndufresne> which is as wrong as anything else really

13:19 <ndufresne> colorimetry is a bit unusable in V4L2 api atm

13:21 <Ely> Yeah I already copy colorspace information from OUTPUT to CAPTURE

13:21 <Ely> same code as venus

13:23 <ndufresne> hmm, maybe I'm off on that, the warning of Unknown enum v4l2_colorspace 0 is after Acquired caps: video/x-raw, ...

13:23 <ndufresne> Ely, btw, each time you see "Unknown enum v4l2_colorspace 0", it means you have a driver bug

13:23 <ndufresne> this is a totally invalid value, there is a value for "default" that isn't zero

13:24 <Ely> yup maybe I should initialize it to non zero

13:24 <Ely> like 601 or something

13:24 <ndufresne> no, there is a _DEFAULT, when you don't know, this is the right one to set

13:25 <ndufresne> but if the OUTPUT queue has a format, G_FMT/TRY_FMT should always set it to the OUTPUT colorspace

13:26 <Ely> yes that is already done

13:26 <ndufresne> because we trust the containers over the bitstream (for gst, we also parse this from the bitstream)

13:26 <ndufresne> Ely, not on TRY_FMT apparently

13:27 <Ely> Indeed, only on g_fmt

13:27 <Ely> just like venus :D

13:28 <ndufresne> then it probably fails on venus too

13:28 <ndufresne> (the DB820c is so unstable, that I didn't have a chance to do a lot of 4K testing)

13:29 <Ely> V4L2_COLORSPACE_DEFAULT = 0,

13:29 <ndufresne> really, maybe I'm confusing APIs ...

13:30 <ndufresne> I wish I wrote the colorimetry in gst/v4l2, this is confusing me so much ...

13:30 <Ely> hehe

13:31 <ndufresne> sometimes, you receive patches that look good, and work most of the time, and finally end up with hard to fix issues

13:45 <ndufresne> arg, so confusing, the order of the v4l2 enums is different in the documentation, https://linuxtv.org/downloads/v4l-dvb-apis/uapi/v4l/colorspaces-defs.html

13:45 <ndufresne> and the value of the enum is not printed in the doc ....

13:48 focus_ has quit [Quit: Leaving]

13:56 <Ely> Huh indeed
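
A minimal sketch of the behaviour discussed above, assuming a simplified model rather than the actual driver code: the colorspace set on the OUTPUT queue is reported back on CAPTURE by both G_FMT and TRY_FMT. The numeric values are those of enum v4l2_colorspace in <linux/videodev2.h>; the class and method names (DecoderModel, s_fmt_output, g_fmt_capture, try_fmt_capture) and the NV12 pixel format are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass

# Numeric values of enum v4l2_colorspace from <linux/videodev2.h>.
V4L2_COLORSPACE_DEFAULT = 0
V4L2_COLORSPACE_SMPTE170M = 1
V4L2_COLORSPACE_REC709 = 3
V4L2_COLORSPACE_BT2020 = 10


@dataclass
class DecoderModel:
    """Hypothetical stand-in for a stateful decoder's per-context state."""

    # Colorspace last set by userspace on the OUTPUT (bitstream) queue.
    output_colorspace: int = V4L2_COLORSPACE_DEFAULT

    def s_fmt_output(self, colorspace: int) -> None:
        # Userspace passes what it learned from the container or bitstream.
        self.output_colorspace = colorspace

    def _capture_format(self) -> dict:
        # One helper shared by G_FMT and TRY_FMT, so the two cannot disagree.
        return {"pixelformat": "NV12", "colorspace": self.output_colorspace}

    def g_fmt_capture(self) -> dict:
        return self._capture_format()

    def try_fmt_capture(self, requested: dict) -> dict:
        # The colorspace proposed for CAPTURE is ignored: the decoder
        # reports the one that was set on the OUTPUT queue.
        return self._capture_format()
```

Routing both calls through one helper is a design choice of this sketch; it avoids the case seen in the log where G_FMT propagated the colorspace and TRY_FMT did not.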

14:03 <ndufresne> ok, not enough time today unfortunately, I'll continue this later, I'll fix a couple of traces so we get both gst/v4l2 values, every time we TRY/S/G_FMT

14:05 <ndufresne> but overall, we get "bt709" as input, which I believe gets read back as something else because the transfer function in gst is different at 4K ...

14:05 <ndufresne> gst/v4l2 is not a one-to-one match, mostly because of v4l2 legacy "colorspace"

14:07 sputnik__ has quit [Remote host closed the connection]

14:08 <Ely> Alright cool, thanks for taking the time

14:39 Pix has quit [Ping timeout: 240 seconds]

14:57 focus_ has joined #linux-amlogic

14:58 focus has quit [Ping timeout: 240 seconds]

15:40 edcragg has quit [Quit: ZNC - http://znc.in]

15:41 edcragg has joined #linux-amlogic

17:29 vagrantc has joined #linux-amlogic

17:40 trem_ has joined #linux-amlogic

17:40 trem has quit [Read error: Connection reset by peer]

18:07 <ndufresne> Ely, ok, I finally found out where the asymmetry came from, it was BT709, if you set that on the v4l2, and read it back, using gst helpers, you get two different results ;-P

18:09 <ndufresne> was a bit my fault, I received V4L2_XFER_FUNC_709, I would always bump it to GST_VIDEO_TRANSFER_BT2020_12 instead of GST_VIDEO_TRANSFER_BT709 in 4K, because they are the same, just that the second is more precise. But to stay symmetric, I need to bump it only when the colorspace is V4L2_COLORSPACE_BT2020

18:10 <ndufresne> now, don't ask me why iPhones use V4L2_COLORSPACE_REC709 instead of V4L2_COLORSPACE_BT2020 for 4K content, it's generally the wrong thing to do ...

18:10 <ndufresne> (unless it's a bug in the gst parser ...)
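
The asymmetry described above can be sketched as follows, assuming a toy model rather than the real gst/v4l2 helpers. The enum members are represented as plain strings, and every function name here (old_v4l2_to_gst_transfer, v4l2_to_gst_transfer, gst_to_v4l2_transfer, round_trips) is hypothetical. The only behaviour taken from the log is that V4L2_XFER_FUNC_709 used to be bumped to GST_VIDEO_TRANSFER_BT2020_12 for any 4K stream, and that the fix bumps it only when the colorspace is V4L2_COLORSPACE_BT2020.

```python
# String stand-ins for the real enum members (illustration only).
V4L2_COLORSPACE_REC709 = "V4L2_COLORSPACE_REC709"
V4L2_COLORSPACE_BT2020 = "V4L2_COLORSPACE_BT2020"
V4L2_XFER_FUNC_709 = "V4L2_XFER_FUNC_709"
GST_VIDEO_TRANSFER_BT709 = "GST_VIDEO_TRANSFER_BT709"
GST_VIDEO_TRANSFER_BT2020_12 = "GST_VIDEO_TRANSFER_BT2020_12"


def gst_to_v4l2_transfer(gst_transfer: str) -> str:
    # Both gst transfer functions describe the same curve, so both
    # collapse onto the single V4L2 transfer function.
    if gst_transfer in (GST_VIDEO_TRANSFER_BT709, GST_VIDEO_TRANSFER_BT2020_12):
        return V4L2_XFER_FUNC_709
    raise ValueError(f"not modelled in this sketch: {gst_transfer}")


def old_v4l2_to_gst_transfer(xfer_func: str, is_4k: bool) -> str:
    # Behaviour described as the cause of the bug: bump for every 4K stream.
    if xfer_func != V4L2_XFER_FUNC_709:
        raise ValueError(f"not modelled in this sketch: {xfer_func}")
    return GST_VIDEO_TRANSFER_BT2020_12 if is_4k else GST_VIDEO_TRANSFER_BT709


def v4l2_to_gst_transfer(colorspace: str, xfer_func: str) -> str:
    # Behaviour described as the fix: bump only for BT2020 content.
    if xfer_func != V4L2_XFER_FUNC_709:
        raise ValueError(f"not modelled in this sketch: {xfer_func}")
    if colorspace == V4L2_COLORSPACE_BT2020:
        return GST_VIDEO_TRANSFER_BT2020_12
    return GST_VIDEO_TRANSFER_BT709


def round_trips(colorspace: str, gst_transfer: str) -> bool:
    # Set the colorimetry on the driver, read it back, compare.
    xfer_func = gst_to_v4l2_transfer(gst_transfer)
    return v4l2_to_gst_transfer(colorspace, xfer_func) == gst_transfer
```

With the old mapping, 4K REC709 content set as GST_VIDEO_TRANSFER_BT709 reads back as GST_VIDEO_TRANSFER_BT2020_12; with the colorspace-based mapping it reads back unchanged.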

18:17 <Ely> ndufresne: cool that you found the root cause!!

18:17 <Ely> I thought 2020 was mostly set with 10-bit hdr while 709 remained for regular 8-bit non hdr

18:17 <ndufresne> ok, that comes from the ISOMP4 demuxer, and there is no support for the BT2020_12 variant added in gst 1.16 ...

18:18 <ndufresne> Ely, yes, and clearly the guy who added these in gst thought it was harmless, but he created a mess ...

18:19 <ndufresne> but it remains that I need to disable all colorimetry probing in gst, cause the enumeration of the colorimetry leads to incomplete list, I'm worried this will keep breaking from time to time ...

18:20 <ndufresne> Ely, but you're right, as this is 8bit content, using bt709 is right

18:21 <ndufresne> anyway, got a fix now, will post in 1.14 and master, I can provide you a patch for 1.12 if you need

18:22 <ndufresne> Ely, btw, the number of reference frames is set to 25, is that hardcoded ?

18:23 <Ely> ndufresne: Yes please a 1.12 patch would be nice if not too much trouble

18:23 <Ely> Mmh how do you mean ? in the file ?

18:23 <ndufresne> (it's a one liner ;-P)

18:24 <ndufresne> Ely, well, I notice I have 25 reference (queued) frames in the encoder with that iphone video

18:24 <ndufresne> that's insanely high, H264 should never require more than 16

18:25 <Ely> I'm not sure I understand

18:25 <Ely> Where do you get that number from ?

18:25 <Ely> dmesg ? gst ?

18:25 <Ely> Do you have 25 buffers queued in ?

18:26 * Ely needs more information to compute

18:26 <ndufresne> hmm, it's hard coded to 8, ok the decoder ignores some frames apparently

18:26 <ndufresne> Ely, so gst requests 3 capture buffers, the driver returns 8

18:26 <ndufresne> then I queue 25 OUTPUT buffers before the first CAPTURE buffer comes out

18:27 <ndufresne> in gst, it means the reference frames queue is at level 25

18:27 <Ely> Are you doing the output/capture in an async fashion ?

18:27 <ndufresne> oh I see, that's because of your ring buffer ...

18:27 <Ely> in the driver both are decorrelated

18:27 <Ely> you could queue 80 OUTPUT buffers if you want

18:27 <ndufresne> well, probably not in 4K

18:28 <Ely> :D

18:28 <ndufresne> I think it's bad for drivers to do that

18:28 <ndufresne> both chromium and gst will be sluggish, as the internal ref frame queues are designed with small numbers in mind

18:28 <ndufresne> you'll also get a huge CPU spike at start and an over-parsing effect

18:29 <ndufresne> Ely, don't you have the DPB size ?

18:29 <Ely> Well my driver just queues and DONEs the buffers as long as there is free space in the VIFIFO

18:29 <Ely> There's nothing wrong with queuing up a lot of data

18:29 <Ely> but I did enhance the way I do that in my latest commits

18:29 <ndufresne> sure, I'm just saying it's super hard for userspace (which in byte-stream mode is not parsing the stream) to deal with that efficiently

18:30 <Ely> instead of relying on a 24 count semaphore (which is probably why you see 25), I now rely on vififo available space

18:30 <ndufresne> Ely, well, yes, accepting to queue a lot will have a serious effect on seeking performance

18:32 <Ely> mh I guess you're right, although this will have to be tested, the decoder is really fast so hopefully it should be negligible

18:32 <ndufresne> this is of course never an issue in truly zero-copy drivers, so MFC and Venus don't have this side effect

18:32 <Ely> otherwise I'll implement back a hard-fixed max buffers

18:32 <ndufresne> the speed of the decoder does not matter, because the thread that dequeues is slowed down by the display

18:33 <ndufresne> so you have an upstream thread that queues at full speed, and a downstream thread that is slowed down by synchronization and rendering

18:33 <Ely> but as soon as you seek, then you discard the "left" buffers that have a past timestamp right ?

18:33 <Ely> like, the decoder will return you the 25 buffers it had queued, you discard them, and wait for proper pts

18:34 <ndufresne> yes, but then, each time you start playing back after a seek, you have a huge CPU spike

18:34 <ndufresne> instead of parsing a maximum of 16 frames in burst, you end up parsing as you said 80 (maybe the entire file)

18:34 <Ely> because of parsing/copying the bitstream in OUTPUT buffers ?

18:34 <Ely> I see

18:35 <ndufresne> it's parsing and depayloading (which copies most of the time)

18:35 <Ely> The biggest problem I've had so far is that I cannot block in a QBUF. If I do so, both ffmpeg / gst will freeze because somehow they also stop queuing back CAPTURE buffers (which are needed for proper decoder operation)

18:35 <ndufresne> it's also a lot of useless disk IO when playing back from HD

18:36 <ndufresne> Ely, no QBUF is only called after select/poll says it's ok to do so

18:37 <ndufresne> so you don't really block QBUF, you will instead delay the current buffer DONE state until you have a CAPTURE buffer DONE state

18:38 <ndufresne> the result is that when your DPB is full, then input/output runs in lock step, without wasting any resources

18:40 <ndufresne> but for that you need a key information, which is the number of encoded buffers needed to produce first image
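
The lock-step behaviour described above can be illustrated with a small simulation, assuming a deliberately simplified model that is not taken from any driver. The upstream side queues one encoded buffer per tick whenever the driver would accept one, and the downstream side releases one decoded frame every display_every ticks because it is paced by the display. The function name simulate and its parameters are hypothetical.

```python
from collections import deque


def simulate(num_frames: int, max_in_flight: int, display_every: int = 4) -> int:
    """Return the peak number of encoded buffers held ahead of the display."""
    pending = deque()  # buffers queued upstream, frame not yet displayed
    next_frame = 0
    peak = 0
    tick = 0
    while next_frame < num_frames or pending:
        # Upstream thread: runs at full speed, but can only queue while the
        # driver keeps handing OUTPUT buffers back (modelled by the limit).
        if next_frame < num_frames and len(pending) < max_in_flight:
            pending.append(next_frame)
            next_frame += 1
            peak = max(peak, len(pending))
        # Downstream thread: paced by rendering, so it frees one slot
        # only every display_every ticks.
        if tick % display_every == display_every - 1 and pending:
            pending.popleft()
        tick += 1
    return peak
```

For example, simulate(80, 17) stays bounded at the limit, while simulate(80, 80) lets the upstream side run far ahead of the display, which is the over-parsing effect mentioned in the discussion. The limit of 17 is just the N + 1 figure quoted for H264 in this log.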

18:44 <ndufresne> probably not urgent for now, but something to keep in mind

18:45 <Ely> Not sure about that number, probably n + 1 where n is the number of packets in a PB block

18:46 <Ely> anyway, I'll set a max input queue to 16 since it's the max number of consecutive bframes

18:46 <ndufresne> that's true for H264, should be as you said N+1, so probably 17

18:47 <ndufresne> in MFC, the firmware returns N + 1 already ;-P

18:47 <Ely> MFC is for which chip ?

18:55 <ndufresne> Samsung

18:55 <ndufresne> it's the first M2M codec driver to have been upstreamed in linux

18:55 <ndufresne> (v4l2 m2m)

18:55 <ndufresne> the framework you use didn't exist back then, that's why they are not using it

18:59 <ndufresne> Ely, looks like the image quality is good from this 4K stream

19:00 <ndufresne> though, I got less than 1 fps ;-P

19:01 <ndufresne> to get going, I believe that enabling the display overlay, with scaling and 4K would be very exciting ;-P

19:01 <ndufresne> (or making an m2m driver around ge2d, that could do too ...)

19:01 <Ely> https://github.com/Elyotna/linux/commit/b0d63cd88f4e29755e58cf329a26168e4fed1372

19:02 <Ely> bhaha, yeah no wonder.. color convert + downscale on uncached 4K, ouch :D

19:04 <Ely> narmstrong started some work on the overlay planes, it's coming along! :) https://github.com/superna9999/linux/commits/amlogic/v4.17/drm-overlay

19:05 <Ely> I think we're getting close to a zero copy video pipeline

19:06 <Ely> ge2d M2M would indeed be nice but I don't think it's planned for anytime soon

19:07 <Ely> also we still don't know if we can keep nv12, or if we'll have to add the tiled pixfmt.. It'll all depend on what the display IP can process

19:31 Ntemis has joined #linux-amlogic

19:40 sputnik_ has joined #linux-amlogic

20:34 Ntemis has quit [Remote host closed the connection]

21:19 focus_ has quit [Quit: Leaving]

21:21 focus_ has joined #linux-amlogic

22:12 sputnik__ has joined #linux-amlogic

22:12 trem_ has quit [Quit: Leaving]

22:14 sputnik_ has quit [Ping timeout: 264 seconds]

22:15 sputnik_ has joined #linux-amlogic

22:17 sputnik__ has quit [Ping timeout: 260 seconds]

22:18 sputnik__ has joined #linux-amlogic

22:19 sputnik_ has quit [Ping timeout: 240 seconds]

22:21 sputnik_ has joined #linux-amlogic

22:24 sputnik__ has quit [Ping timeout: 256 seconds]

22:29 sputnik_ has quit [Remote host closed the connection]

22:30 sputnik_ has joined #linux-amlogic

22:39 sputnik__ has joined #linux-amlogic

22:41 sputnik_ has quit [Ping timeout: 260 seconds]

22:42 sputnik_ has joined #linux-amlogic

22:44 sputnik__ has quit [Ping timeout: 268 seconds]

22:45 sputnik__ has joined #linux-amlogic

22:47 sputnik_ has quit [Ping timeout: 260 seconds]

22:47 sputnik__ has quit [Remote host closed the connection]

22:48 sputnik__ has joined #linux-amlogic

23:06 mag has quit [Quit: Bye]

23:07 sputnik__ has quit [Read error: Connection reset by peer]

23:09 mag has joined #linux-amlogic

23:14 sputnik_ has joined #linux-amlogic

23:15 tingoose has quit [Ping timeout: 260 seconds]

23:15 tingoose_ has quit [Ping timeout: 276 seconds]

23:40 <ndufresne> Ely, true, but I think we should keep it, and simply let userspace choose, a bit like CODA driver is doing

23:41 <ndufresne> we could have I420, NV21, NV12 and tiled variants for all of these (even though, we will probably limit the amount of new formats, as it's a lot of work to document and all)

23:42 <ndufresne> I wonder what VPP_POSTBLEND_ENABLE is ...