12.yakir
Posts: 15
Joined: Mon Jan 19, 2015 4:53 am

Read frames from libcamera with segment

Fri Feb 25, 2022 4:04 pm

Hi!
I want to read frames in real time from a piped libcamera-vid or libcamera-raw using the segment option. I plan to do this in Julia (https://julialang.org), but I imagine that implementation detail is irrelevant (it could just as easily be done in Python).

I might have misunderstood how the segment option works, but it seems very novel and exciting to me that we can read from a pipe that serves the data frame-by-frame without any need to encode things into a png first (as is the custom with a pipe from ffmpeg).

My problem is that I'm not sure I understood this correctly and, if I did, what shape and format these frames come in. How could I, using pseudo-code for instance (or Python or Julia), process the byte-stream coming out of, say:

libcamera-vid -n --framerate 30 --width 480 --height 640 -t 5000 -o -
to end up with a matrix/container/array of pixel intensities representing the image frame (per frame)?

therealdavidp
Raspberry Pi Engineer & Forum Moderator
Posts: 1566
Joined: Tue Jan 07, 2020 9:15 am

Re: Read frames from libcamera with segment

Fri Feb 25, 2022 4:55 pm

By default you will get h.264 encoded frames. You can get uncompressed YUV420 if you add "--codec yuv420". The hardware has alignment constraints so frames would be delivered with padding on the end of each row, giving you first a Y plane of 512x640, and then U and V planes both 256x320. So that would be 491520 bytes per frame.
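As a rough illustration of that layout, the sketch below (Python, standard library only) hard-codes the numbers quoted above for a 480x640 request and slices one frame's bytes into its three planes. The function name `split_planes` and the constant names are made up for this example; they are not part of libcamera.

```python
# Dimensions quoted in the reply above for a 480x640 request.
Y_STRIDE, Y_ROWS = 512, 640        # padded Y plane: 512 bytes per row, 640 rows
UV_STRIDE, UV_ROWS = 256, 320      # padded U and V planes

Y_SIZE = Y_STRIDE * Y_ROWS         # 327680 bytes
UV_SIZE = UV_STRIDE * UV_ROWS      # 81920 bytes
FRAME_SIZE = Y_SIZE + 2 * UV_SIZE  # 491520 bytes


def split_planes(frame: bytes):
    """Slice one raw YUV420 frame into (y, u, v) byte strings."""
    if len(frame) != FRAME_SIZE:
        raise ValueError(f"expected {FRAME_SIZE} bytes, got {len(frame)}")
    y = frame[:Y_SIZE]                  # Y plane comes first
    u = frame[Y_SIZE:Y_SIZE + UV_SIZE]  # then U
    v = frame[Y_SIZE + UV_SIZE:]        # then V
    return y, u, v
```

Each Y row is 512 bytes long, but only the first 480 bytes of it are image data; the rest is padding.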

You might also be interested in the Picamera2 library, though this is still in development.

12.yakir
Posts: 15
Joined: Mon Jan 19, 2015 4:53 am

Re: Read frames from libcamera with segment

Fri Feb 25, 2022 7:07 pm

That is brilliant, thanks a lot!

Is the relationship between the target row-length (actually frame height I think) (in my case 480) and the hardware row-width (i.e. 512) the smallest n for which 2^n is not smaller than the target row length (so for n = 9, 2^9 == 512 > 480)? So if I want to use different frame dimensions, say 3280x2464, I should read 3280x4096 bytes for the Y plane, and 1640x2048 bytes for each of the uv planes? Or maybe there is a table detailing all these things?

For anyone interested, I managed to reach close to 90 FPS with the following in Julia code:

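w = 480                 # requested frame width
# Padded row length in bytes (512 for w = 480). Note: nextpow(2, w) matches the
# hardware padding only for some widths; the rule given later in this thread is
# the next multiple of 64, i.e. 64 * cld(w, 64).
w2 = nextpow(2, w)
h = 640                 # requested frame height
fps = 90                # requested frame rate

cmd = `libcamera-vid -n --framerate $fps --width $w --height $h -t 10000 --codec yuv420 -o -`

# Read one frame from the stream and return the cropped Y (intensity) plane.
function read_frame(o)
    y = read(o, h*w2)        # Y plane: h rows of w2 bytes each
    img = reshape(y, w2, h)  # column-major, so each column holds one image row
    read(o, h*w2÷2)          # skip the U and V planes (together half the Y plane)
    return img[1:w, :]       # drop the padding at the end of each row
end

o = open(cmd)
i = 0
t0 = Base.time()
while !eof(o)
    global i
    read_frame(o)
    i += 1
end
t1 = Base.time()
fps2 = i/(t1 - t0)           # measured frame rate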

12.yakir
Posts: 15
Joined: Mon Jan 19, 2015 4:53 am

Re: Read frames from libcamera with segment

Fri Feb 25, 2022 8:49 pm

"Is the relationship between the target row-length (actually frame height I think) (in my case 480) and the hardware row-width (i.e. 512) the smallest n for which 2^n is not smaller than the target row length (so for n = 9, 2^9 == 512 > 480)?"
No. I've managed to test this with a bunch of dimensions and the results were clearly crazy. So what is the relationship then?

therealdavidp
Raspberry Pi Engineer & Forum Moderator
Posts: 1566
Joined: Tue Jan 07, 2020 9:15 am

Re: Read frames from libcamera with segment

Fri Feb 25, 2022 10:07 pm

The hardware rounds YUV420 image rows up to the next multiple of 64 for the Y plane, and 32 for the U and V planes. Those extra pixels are padding and don't contain image data.
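The rounding rule in this reply can be sketched in a few lines of Python. The helper names `round_up` and `row_strides` are invented for illustration, and taking the chroma width as half the image width rounded up for odd widths is an assumption; the thread only gives examples with even widths.

```python
def round_up(value: int, multiple: int) -> int:
    """Round value up to the nearest multiple (unchanged if already aligned)."""
    return -(-value // multiple) * multiple


def row_strides(width: int):
    """Return (y_stride, uv_stride) in bytes for a requested image width."""
    y_stride = round_up(width, 64)
    # Chroma rows hold half as many samples as luma rows.
    # Assumption: an odd width is halved rounding up.
    uv_stride = round_up((width + 1) // 2, 32)
    return y_stride, uv_stride


for w in (480, 640, 1640, 1920):
    print(w, row_strides(w))
# 480 (512, 256)
# 640 (640, 320)
# 1640 (1664, 832)
# 1920 (1920, 960)
```

Widths that are already multiples of 64, such as 640 and 1920, get no padding at all.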

12.yakir
Posts: 15
Joined: Mon Jan 19, 2015 4:53 am

Re: Read frames from libcamera with segment

Sat Feb 26, 2022 12:57 pm

Thanks for the help!

Because the hardware terms rows, columns, and planes are easily confused with image terms such as width, height, and frame, I'll run the following by you to make sure I got this right:

If I want an image that is 1640 wide and 922 high, then the row length of the Y plane becomes

64*ceil(1640/64) = 1664
and the column length becomes

32*ceil(922/32) = 928
The u and v planes then have half the width and half the height of the Y plane, becoming 832 wide and 464 tall each, and thus the size of a single frame is 2316288 bytes. If I'm only interested in the Y plane, then the actual image data in that plane is in the first 1640x922 (1512080) bytes.

If all of this is correct, I'm a bit puzzled why I keep getting frames that look like this: [image]
Note that this kind of artifact is only present when I use specific image dimensions, but not others (640x480 and 1920x1080 are fine).
EDIT: I think this artifact has to do with how I handle the process I'm piping from rather than anything else.

I'm curious: up to now I haven't even used the segment option, would that include an extra end-of-frame byte, just a delay (equal to the inverse of the frame-rate) between frames, or something else altogether?
Last edited by 12.yakir on Sun Feb 27, 2022 9:30 am, edited 2 times in total.

12.yakir
Posts: 15
Joined: Mon Jan 19, 2015 4:53 am

Re: Read frames from libcamera with segment

Sat Feb 26, 2022 2:32 pm

Answering this part at least:
"up to now I haven't even used the segment option, would that include an extra end-of-frame byte, just a delay (equal to the inverse of the frame-rate) between frames, or something else altogether?"
It seems the segment option is only meant for output to auto-incrementing file names, not for piping.

therealdavidp
Raspberry Pi Engineer & Forum Moderator
Posts: 1566
Joined: Tue Jan 07, 2020 9:15 am

Re: Read frames from libcamera with segment

Mon Feb 28, 2022 10:48 am

The rows are rounded up (if necessary) to the next multiple of 64 (for the Y plane) or 32 (U and V planes). This also means that the U and V planes together always add exactly 50% extra to the Y plane. The columns are only rounded up to be even, which they normally are anyway. So a 1640x922 image will occupy 1664x922x3/2 = 2301312 bytes in total.

The "segment" option doesn't add anything to the data stream at all. If you're receiving uncompressed YUV420 then you need to rely on the precise number of bytes in each frame.
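Since the stream carries no frame markers, a reader has to count bytes. Below is a minimal Python sketch of that idea, following the sizing rule as stated in this reply. The names `frame_layout` and `read_y_frames` are hypothetical, and the code assumes a buffered binary stream whose `read(n)` returns fewer than `n` bytes only at end of stream.

```python
import io


def frame_layout(width: int, height: int):
    """Return (stride, rows, frame_bytes) for one YUV420 frame.

    Rule as quoted above: rows padded to a multiple of 64, row count
    rounded up to even, U and V together adding 50% to the Y plane.
    """
    stride = -(-width // 64) * 64
    rows = height + (height % 2)
    return stride, rows, stride * rows * 3 // 2


def read_y_frames(stream, width: int, height: int):
    """Yield the cropped Y plane of each frame as a list of row byte strings."""
    stride, _, frame_bytes = frame_layout(width, height)
    while True:
        frame = stream.read(frame_bytes)
        if len(frame) < frame_bytes:   # end of stream or truncated frame
            return
        # Keep the first `width` bytes of each of the first `height` rows.
        yield [frame[r * stride:r * stride + width] for r in range(height)]


# Synthetic stand-in for the camera pipe: two blank 480x640 frames.
_, _, size = frame_layout(480, 640)
fake_pipe = io.BytesIO(bytes(2 * size))
frames = list(read_y_frames(fake_pipe, 480, 640))
print(len(frames), len(frames[0]), len(frames[0][0]))  # 2 640 480
```

With a real camera, the stream would presumably be the `stdout` of a `subprocess.Popen` call running the `libcamera-vid ... --codec yuv420 -o -` command with `stdout=subprocess.PIPE`. For 1640x922, `frame_layout` gives 2301312 bytes per frame, matching the figure above.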

12.yakir
Posts: 15
Joined: Mon Jan 19, 2015 4:53 am

Re: Read frames from libcamera with segment

Tue Mar 01, 2022 9:51 am

Firstly, thank you so much for your seemingly endless patience...!

Secondly, I think I get it. Since I really only care about the intensity channel, I'll phrase it like this:
  • The length of the rows in the Y plane is the width rounded up to the next multiple of 64.
  • The length of the columns in the Y plane is the height rounded up to the next multiple of 2.
  • The size of one single frame (Y, u, and v planes together) is 1.5 times the size of the Y plane alone.
I tested this with a bunch of different even widths and even heights, odd widths and even heights, and it all worked perfectly. However -- and I'm only asking to make sure I didn't miss something mysterious here -- it failed when I tested this with a bunch of different even and/or odd widths and odd heights. The failure was evident in three manners:
  • when reading all of the bytes from the pipe, the number of bytes was not divisible by the calculated size of the YUV frame (as explained above).
  • Saving a bunch of Y frames showed a visible dark bar on the right (to be clear, as always I not only reshaped the frame following the above, but also cropped it to the intended size).
  • Saving all of the frames from the pipe showed how the imaged scene shifted to the left with each subsequent frame (and the artifact bar grew in width)
I'd like to add that, for example, the following

libcamera-vid -n --framerate 10 --width 1640 --height 921 -t 2000 --codec yuv420 -o -
produced exactly 29895424 bytes. This number should be divisible by the number of frames, and the result of that division times two should be divisible by 3 (one frame is composed of one Y plane and two UV planes that are each a quarter of the size of the Y plane, so if we have two of these frames then dividing that by 3 will give the size of the Y plane). It is not. So something fishy is going on here when the height is odd.
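The divisibility check described above can be run directly against the reported byte count. The sketch below compares two candidate frame sizes for 1640x921. Candidate B is only an inference from the arithmetic: it assumes the Y plane keeps its 921 rows while each chroma plane gets ceil(921/2) rows. Nothing in this thread confirms that the hardware actually lays out odd-height frames this way.

```python
TOTAL_BYTES = 29895424            # byte count reported above
WIDTH, HEIGHT = 1640, 921

stride = -(-WIDTH // 64) * 64     # 1664: rows padded to a multiple of 64

# Candidate A: height rounded up to even, U and V add exactly 50%.
even_rows = HEIGHT + HEIGHT % 2
size_a = stride * even_rows * 3 // 2

# Candidate B (assumption): Y keeps 921 rows, each chroma plane has
# ceil(921 / 2) = 461 rows of stride // 2 bytes.
chroma_rows = (HEIGHT + 1) // 2
size_b = stride * HEIGHT + 2 * (stride // 2) * chroma_rows

for name, size in (("A", size_a), ("B", size_b)):
    frames, remainder = divmod(TOTAL_BYTES, size)
    print(name, size, frames, remainder)
# A 2301312 12 2279680
# B 2299648 13 0
```

Candidate B divides the total exactly into 13 frames, which would be consistent with the drift and the growing bar seen when reading with the even-height frame size.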

Maybe the height should simply never be odd? Maybe there should be a warning/error about that?
