witenite1
Posts: 7
Joined: Mon Jun 10, 2019 6:20 am

low performance from Raspberry Pi CSI camera

Sat Jan 16, 2021 11:38 pm

Hi,
I have been using Raspberry Pi products for a couple of years now, and recently purchased a Raspberry Pi 4 (8GB model). As a software developer I have written Go code with two independent forks: the first uses GoCV (a Go wrapper for the latest version of OpenCV), and the second does not use GoCV but rather a webcam driver that utilises V4L2. As of this date (January 2021) my entire tool chain is up to date, so I am not using older versions of anything. The Raspberry Pi is running the latest Raspberry Pi OS (32-bit version, as it appears the 64-bit version is still only available in beta). I have also tried Ubuntu MATE and the full-blown Ubuntu build for the Raspberry Pi, but I think Raspberry Pi OS may have better hardware support (specifically for the CSI port), and hence I have reverted to Raspbian (kernel 5.4.83-v7l+).

My problem is that, no matter what I do, I am unable to achieve more than about 4-5 frames per second, and with terrible latency. I have run raspivid and that appears to run smoothly; however, I am pretty sure it is capturing H.264 compressed video, which is useless to me for a machine vision project (I need lossless video frames). Nowhere have I been able to find an actual Raspberry Pi specification that clearly states what sort of throughput you can expect from the camera (I am using the v1 OV5647 device, and have a few of them). I don't know whether I have a software issue or whether the Raspberry Pi is simply tapped out and unable to serve up more than 5 FPS at 1920x1080, or even 640x480.

I originally installed GUVCView but could not get it to work regardless of what settings I used. It simply shows a black screen, even when I change the Auto Exposure to manual (as suggested by someone online) and/or set Absolute Exposure to 156 (or anything else). I am using the default image format of YUYV 4:2:2.
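For a rough sense of scale, the raw data rates for the resolutions mentioned above can be worked out directly. The Go sketch below assumes YUYV 4:2:2 at 2 bytes per pixel; the function names are made up for illustration, and the numbers say nothing about what the Pi or the camera stack can actually sustain.

```go
package main

import "fmt"

// bytesPerFrame returns the size of one uncompressed frame in bytes.
func bytesPerFrame(width, height, bytesPerPixel int) int {
	return width * height * bytesPerPixel
}

// megabytesPerSecond returns the sustained data rate in MB/s (1 MB = 1e6 bytes).
func megabytesPerSecond(width, height, bytesPerPixel int, fps float64) float64 {
	return float64(bytesPerFrame(width, height, bytesPerPixel)) * fps / 1e6
}

func main() {
	// YUYV 4:2:2 packs two pixels into four bytes, i.e. 2 bytes per pixel.
	const yuyvBytesPerPixel = 2

	modes := []struct{ w, h int }{{640, 480}, {1920, 1080}}
	rates := []float64{5, 30}

	for _, m := range modes {
		for _, fps := range rates {
			fmt.Printf("%dx%d @ %.0f fps: %d bytes/frame, %.1f MB/s\n",
				m.w, m.h, fps,
				bytesPerFrame(m.w, m.h, yuyvBytesPerPixel),
				megabytesPerSecond(m.w, m.h, yuyvBytesPerPixel, fps))
		}
	}
}
```

This only gives the amount of data that has to move per second for uncompressed frames; it does not identify where the bottleneck is.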

I have also tried installing qv4l2 (as per this link https://hackernoon.com/polising-raspber ... a-3z113u18) and used lsmod to determine that I needed to

sudo modprobe bcm2835-v4l2
Note that there was no qv4l2 module listed when I ran lsmod, so I loaded this one instead. qv4l2 now starts, but I cannot get a sensible image. No matter what adjustments I have tried, I simply get a tiny image that I have to keep resizing in the V4L2 Test Bench, and it shows a nonsensical and highly pixelated image (as if I am zoomed in x1000 into the corner of the image).

The sheer lack of documentation and the brevity of online information mean that, after 2 full days fighting with this, I still have nothing more than a 5 FPS pixelated image in the V4L2 Test Bench as mentioned, or 30 FPS of completely black image in my application program, or about 3 FPS with huge latency but good image quality in my Go application. There are plenty of web pages and forums telling you how easy it is to get started and use raspistill or raspivid to generate a video, but as soon as you want to get your hands dirty, the Raspberry Pi specific information regarding use of the CSI port (or even USB for that matter) dries up rapidly. I have also been consulting more generic information such as the V4L API, but again, without knowing what the actual limitations are for the Raspberry Pi, I may be chasing ghosts when it comes to trying to get performance that is simply not possible. Very frustrating; honestly, it's like looking up information about car tyres and reading about the performance and design of tyres in general, but not knowing whether the tyres I want to buy are only fit for a Ferrari or will actually work on my Toyota...

Initially I thought it was my non-optimised code, but after spending a couple of days tweaking it and trying both a GoCV and a V4L2 approach, both of which deliver the same bad performance, I am coming to the conclusion that this is a camera setup issue. I have now proven that my code spends the majority of its time waiting for the next video frame, and that it is not the processing of the frame that is consuming all of the application's resources. As a test I had my application loop around reprocessing the same image, and I can easily achieve several hundred "FPS", but as soon as I enable the code that acquires the image from the camera, this performance collapses completely. I have also had to increase GPU memory (in the boot config) from 128MB to 256MB, else my program simply locks up when I attempt to process an image of a size approaching 1920x1080.

It's my understanding that the CSI port is directly coupled to the GPU, as opposed to the CPU. Is this correct? It makes sense to me, as then hardware acceleration can be utilized and I would think the GPU to be very capable in terms of rapid image processing, but again, just so little data online to demonstrate this.

Any assistance would be greatly appreciated, especially from people more knowledgeable than me who are well acquainted with efficient and lossless image capture from the CSI camera. Thank you

cleverca22
Posts: 5847
Joined: Sat Aug 18, 2012 2:33 pm

Re: low performance from Raspberry Pi CSI camera

Sat Jan 16, 2021 11:44 pm

witenite1 wrote:
Sat Jan 16, 2021 11:38 pm
It's my understanding that the CSI port is directly coupled to the GPU, as opposed to the CPU. Is this correct? It makes sense to me, as then hardware acceleration can be utilized and I would think the GPU to be very capable in terms of rapid image processing, but again, just so little data online to demonstrate this.
i wouldn't say the CSI is tied to the GPU really

in this area, i believe you basically have at least 3 main components being involved
a: the unicam peripheral (the CSI controller)
b: the VPU (where the "gpu" firmware runs)
c: the ARM cores

all 3 of those, then have direct access to ram
when using the linux unicam drivers, the arm directly talks to the unicam peripheral to configure it, then the unicam dumps raw image data directly into ram

but due to licensing/broadcom issues, the ISP (which converts raw to RGB) can only be managed by the VPU, so linux has to fire RPC calls off to it, to request the conversions

if using the v4l wrappers over mmal then the VPU configures the unicam, but it still dumps raw directly into ram, and the ISP still does the conversions

in both cases, the image data winds up in the shared ram, that both arm and vpu have access to
the mmal drivers are just limited to a smaller pool (size set by gpu_mem), while the linux-unicam drivers are limited to whatever is leftover (1024-gpu_mem)

witenite1
Posts: 7
Joined: Mon Jun 10, 2019 6:20 am

Re: low performance from Raspberry Pi CSI camera

Mon Jan 18, 2021 8:52 am

An excellent explanation of the inner workings of the Pi, thank you for clearing that up. It confirms similar information I have read online regarding Broadcom licensing getting in the way of true optimisation in this regard. It appears that Broadcom have more or less shot themselves in the foot, as the true potential of their SoC can never be realized under such a tightly constrained licensing agreement.

Further to your information regarding Unicam, I located some more information here, which I am still digging through:
https://www.raspberrypi.org/documentati ... 2-usage.md

and this detailed PDF document that covers some more information with regard to camera tuning etc:
https://www.raspberrypi.org/documentati ... ra_1p0.pdf

I have placed this information here for reference's sake (both for me and for others).

It still doesn't quite help resolve my issue though, so hoping somebody can shed some light on my questions regarding anticipated frame rates for non-compressed video frame captures.

cleverca22
Posts: 5847
Joined: Sat Aug 18, 2012 2:33 pm

Re: low performance from Raspberry Pi CSI camera

Mon Jan 18, 2021 7:49 pm

HermannSW wrote:
Mon Jan 18, 2021 10:48 am
Raspiraw can capture 640x64@665fps / 640x75@1007fps raw Bayer frames with the v1/v2 camera, so less than 2ms per complete frame. The HQ camera can unfortunately do <300fps only:

I use the Arducam $25.99 0.3MP monochrome global shutter camera for my raspcatbot robot. 320x240@204fps is 5ms per frame, and 1 pixel is 1 byte, allowing for easy live frame processing. No bad rolling shutter effects.

6by9
Raspberry Pi Engineer & Forum Moderator
Posts: 12835
Joined: Wed Dec 04, 2013 11:27 am
Location: ZZ9 Plural Z Alpha, aka just outside Cambridge.

Re: low performance from Raspberry Pi CSI camera

Mon Jan 18, 2021 9:18 pm

Your base problem is probably down to the bcm2835-v4l2 driver having a concept of preview vs capture. Stills mode ALWAYS runs full resolution off the sensor, so is limited in frame rate, and generally it's limited in buffering too so has to stop and restart the sensor for each capture.

It's fairly obvious for JPEG (stills), H264, MJPEG (video), but YUV or RGB captures are ambiguous. It switches based on resolution, with the default switch point being 1280x720.
Alter the module parameters max_video_width and max_video_height if you want to switch at a different resolution.
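The resolution-based switch can be pictured with a small Go sketch. The parameter names max_video_width and max_video_height and the 1280x720 default come from the description above; the function name `modeFor` is hypothetical, and the exact comparison (both dimensions at or below the limits select video mode) is an assumption for illustration, not a copy of the driver logic.

```go
package main

import "fmt"

// captureMode names the camera port a YUV/RGB request is assumed to use.
type captureMode string

const (
	videoMode  captureMode = "video"
	stillsMode captureMode = "stills"
)

// modeFor returns the mode assumed for a requested resolution.
// ASSUMPTION: a request is treated as video when both dimensions are at or
// below the limits, and as stills otherwise.
func modeFor(width, height, maxVideoWidth, maxVideoHeight int) captureMode {
	if width <= maxVideoWidth && height <= maxVideoHeight {
		return videoMode
	}
	return stillsMode
}

func main() {
	requests := []struct{ w, h int }{{640, 480}, {1280, 720}, {1920, 1080}}

	// Default switch point stated above: 1280x720.
	for _, r := range requests {
		fmt.Printf("limits 1280x720:  %dx%d -> %s\n", r.w, r.h, modeFor(r.w, r.h, 1280, 720))
	}
	// Limits raised so that 1920x1080 is still treated as video.
	for _, r := range requests {
		fmt.Printf("limits 1920x1080: %dx%d -> %s\n", r.w, r.h, modeFor(r.w, r.h, 1920, 1080))
	}
}
```

Module parameters are normally passed as name=value pairs when the module is loaded, so raising the limits would presumably mean adding max_video_width=1920 max_video_height=1080 to the modprobe command line.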

bcm2835_v4l2 runs the entire pipeline on the VPU, and is actually more efficient than libcamera and V4L2 as it will handle frames as they are received, whilst V4L2 only ever works with completed frames. When the sensor can only read out at 10fps, this potentially saves you almost 100ms of latency.

Docs explaining how the firmware stack works - https://picamera.readthedocs.io/en/rele ... 3/fov.html
Software Engineer at Raspberry Pi Trading. Views expressed are still personal views.
I'm not interested in doing contracts for bespoke functionality - please don't ask.

witenite1
Posts: 7
Joined: Mon Jun 10, 2019 6:20 am

Re: low performance from Raspberry Pi CSI camera

Wed Jan 20, 2021 9:14 am

6by9 wrote:
Mon Jan 18, 2021 9:18 pm
It's fairly obvious for JPEG (stills), H264, MJPEG (video), but YUV or RGB captures are ambiguous. It switches based on resolution, with the default switch point being 1280x720.
Switches what? Do you mean it switches between still mode and capture mode? Are you suggesting that I force it into capture mode?

I was reading about the still, video and preview mode. I can see the sense in preview mode, but is it still running in preview when it is streaming video? If so, doesn't this cause unnecessary use of VPU resources (and resultant power loss, and more importantly, frame loss?)

Do I understand you correctly when you say the bcm2835_v4l2 module I have selected runs the entire pipeline on the VPU, that I have selected the right module or tool for the job?

With regard to the graph you posted earlier with the high frame rates, is this when outputting a raw Bayer image, or is this inclusive of some processing that outputs the YUYV 4:2:2 image? I may be confused here, and perhaps the YUYV is in fact the Bayer image. If it is, that's not a problem; however, if it isn't, it means there may be more post-processing I have to do to parse the image into something usable by my application.

Thanks again for your patience. This hurdle really has me stumped at this point, but based on your prior information it appears that I should be able to achieve far higher frame rates than what I am getting.

6by9
Raspberry Pi Engineer & Forum Moderator
Posts: 12835
Joined: Wed Dec 04, 2013 11:27 am
Location: ZZ9 Plural Z Alpha, aka just outside Cambridge.

Re: low performance from Raspberry Pi CSI camera

Wed Jan 20, 2021 11:20 am

witenite1 wrote:
Wed Jan 20, 2021 9:14 am
6by9 wrote:
Mon Jan 18, 2021 9:18 pm
It's fairly obvious for JPEG (stills), H264, MJPEG (video), but YUV or RGB captures are ambiguous. It switches based on resolution, with the default switch point being 1280x720.
Switches what? Do you mean it switches between still mode and capture mode? Are you suggesting that I force it into capture mode?
Switches between video and stills modes.
Stills mode always uses the full res mode off the sensor, and applies a more complex denoise algorithm, therefore it is slower than video mode.
witenite1 wrote:I was reading about the still, video and preview mode. I can see the sense in preview mode, but is it still running in preview when it is streaming video? If so, doesn't this cause unnecessary use of VPU resources (and resultant power loss, and more importantly, frame loss?)
The underlying component on the VPU has 3 ports - preview, video capture, and stills capture.
The V4L2 driver uses video capture or stills capture to generate output frames for the client. It can use the preview port for rendering direct to the screen (using the overlay ioctls), but that is not required.
witenite1 wrote:Do I understand you correctly when you say the bcm2835_v4l2 module I have selected runs the entire pipeline on the VPU, that I have selected the right module or tool for the job?
Yes.
witenite1 wrote:With regard to the graph you posted earlier with the high frame rates, is this when outputting a raw Bayer image, or is this inclusive of some processing that outputs the YUYV 4:2:2 image? I may be confused here, and perhaps the YUYV is in fact the Bayer image. If it is, that's not a problem; however, if it isn't, it means there may be more post-processing I have to do to parse the image into something usable by my application.
Not my graphs.

witenite1
Posts: 7
Joined: Mon Jun 10, 2019 6:20 am

Re: low performance from Raspberry Pi CSI camera

Thu Jan 21, 2021 8:19 am

Thanks again for the help. I have one other question:

I have been looking again tonight at the Picamera API/settings etc. and am wondering whether these settings are global/permanent or not. What I mean is, if I used, say, a Python script to set the settings as I need them, and then used my Go program to acquire the images, would the Go program be making acquisitions as per the settings from the Python program? I am hoping so, as then I can use a script to set up the camera how I need it, and then let my Go program go to work acquiring and processing the images. A pity there isn't a direct Golang API for the camera, but I think this is not too much of an issue as long as the settings are "sticky" as set by Python.

6by9
Raspberry Pi Engineer & Forum Moderator
Posts: 12835
Joined: Wed Dec 04, 2013 11:27 am
Location: ZZ9 Plural Z Alpha, aka just outside Cambridge.

Re: low performance from Raspberry Pi CSI camera

Thu Jan 21, 2021 2:08 pm

witenite1 wrote:
Thu Jan 21, 2021 8:19 am
I have been looking again tonight at the Picamera API/settings etc. and am wondering whether these settings are global/permanent or not. What I mean is, if I used, say, a Python script to set the settings as I need them, and then used my Go program to acquire the images, would the Go program be making acquisitions as per the settings from the Python program? I am hoping so, as then I can use a script to set up the camera how I need it, and then let my Go program go to work acquiring and processing the images. A pity there isn't a direct Golang API for the camera, but I think this is not too much of an issue as long as the settings are "sticky" as set by Python.
No, each and every instance of the camera component is independent, although only one can be active at a time.
The one exception is that the V4L2 driver uses a single instance of the camera component regardless of how many V4L2 clients exist, so settings applied by any client will apply to all.
