Hello! I have a Raspberry Pi 4 and an HQ camera, and I want to process ~2000×1500 images for a realtime HUD.
With picamera2, capturing the array takes 90 ms and processing takes 600 ms using OpenCV/NumPy.
Will moving to C++ save me enough time to get this realtime (<80 ms)? I realise OpenCV and NumPy are compiled code under the hood and that more top-level algorithm optimisation is required, but will the savings be significant enough to warrant a restart? Can I capture images faster somehow in picamera2 perhaps, or use a different capture library?
Cheers
Re: Is c++ the only choice for realtime image processing?
Depends what you want to do.
Most image detection stuff is done at around 300 x 300 pixels.
Going bigger is not needed for people/car/bike recog and makes it slower anyway.
You could use movement detection to detect the area of change and then just take a 300 x 300 sample for recog.
Some algorithms use assembly and NEON or GPU magic ;)
The other option is to offload to a "real" image processor like a Coral AI/ML accelerator.
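A rough sketch of the "detect the area of change, then take a 300 x 300 sample" idea, assuming 8-bit grayscale frames held as NumPy arrays. The function name `motion_crop`, the threshold of 25 and the simple frame-difference approach are illustrative assumptions, not something taken from a particular library:

```python
import numpy as np

def motion_crop(prev, curr, size=300, threshold=25):
    """Return a size x size crop of `curr` centred on the changed region, or None."""
    # Widen to int16 first so the uint8 subtraction cannot wrap around.
    diff = np.abs(curr.astype(np.int16) - prev.astype(np.int16))
    changed = diff > threshold
    if not changed.any():
        return None  # nothing moved between the two frames

    # Bounding box of all changed pixels, then its centre.
    rows = np.flatnonzero(changed.any(axis=1))
    cols = np.flatnonzero(changed.any(axis=0))
    cy = (rows[0] + rows[-1]) // 2
    cx = (cols[0] + cols[-1]) // 2

    # Clamp the window so it stays inside the frame.
    h, w = curr.shape
    top = int(min(max(cy - size // 2, 0), max(h - size, 0)))
    left = int(min(max(cx - size // 2, 0), max(w - size, 0)))
    return curr[top:top + size, left:left + size]
```

The crop is a view into the original frame rather than a copy, so handing it to a detector costs very little. A real setup would probably want some blurring or morphological cleanup before thresholding to cope with sensor noise.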
I'm dancing on Rainbows.
Raspberries are not Apples or Oranges
Re: Is c++ the only choice for realtime image processing?
If you're interested in image processing using MATLAB, you can check this out:
https://www.theengineeringprojects.com/ ... ssing.html
MATLAB is good for applications like color detection in live video. It can also be used for motion detection in the real environment.
Re: Is c++ the only choice for realtime image processing?
Gavinmc42 wrote: ↑Sat Jan 21, 2023 1:27 am
Depends what you want to do.
Most image detection stuff is done at around 300 x 300 pixels.
Going bigger is not needed for people/car/bike recog and makes it slower anyway.
You could use movement detection to detect the area of change and then just take a 300 x 300 sample for recog.
Some algorithms use Assembly and NEON or GPU magic;)
The other option is to off load to a "real" image processor like AI/ML Coral.
I am looking for a specific hierarchy of contours, anywhere in the frame and at any size, so unfortunately I cannot downsample.
Re: Is c++ the only choice for realtime image processing?
2000 x 1500 is a big image for CV.
About 33 times the usual pixel count (3,000,000 vs 90,000 for 300 x 300).
OpenCV is generic, not sure how optimized it is for Pi hardware.
That is going to require something different in algorithms.
I am not a vision expert, but I try to keep up, and probably only a Jetson or dedicated hardware could do that.
Even then I suspect a Jetson won't be fast enough.
This might help or not?
https://github.com/Idein/py-videocore6
Black and white (binary) images reduce the data a lot; that's why BNNs are faster.
I haven't compiled ARM's Compute or NN library for some time, but the 64-bit OS should give a bit more speed.
Optimized ARM assembly in those libs?
Not sure if C++ is going to be faster than C with assembly.
No idea what is in the Pi5 yet.
I am hoping it might have magic.
The V3 cameras do have PDAF, and that can track movement, sort of.
Image sensors with in-sensor processing are appearing; that is something to watch for.
For IoT, low-power image processing done on the sensor reduces CPU power.
-
- Posts: 581
- Joined: Sun Dec 03, 2017 1:47 am
- Location: Boston area, MA, USA
Re: Is c++ the only choice for realtime image processing?
It'd take deeper knowledge of OpenCV, your algorithm, and the HW to guess if you could get something like an 8x speedup. I assume you're using Python - the Python part may just be setting up a pipeline and letting it run or handing off pointers and not actually in the way. But if you touch the image data with Python code, you can expect a big speedup rewriting that part in a lower-level, compiled language (C++, C, Rust, asm, or the GPU using Vulkan compute shaders - but definitely multi-threaded if on the CPU).
The speedup you're looking for will likely push the Pi HW to its limits - it may be worth figuring out what you'd like the HW to be doing, irrespective of the implementation language and libraries.
For example, you'd probably like to be capturing to multiple buffers via DMA, and it should be completely parallel to the image processing pipeline (except for the pressure on the memory system, shared by GPU and CPU components). I'm not sure Python is letting you set up an asynchronous pipeline - the image capture time makes me think you're triggering a capture and waiting for it, the processor going idle. Yet you don't want to be capturing asynchronously so fast you throw away buffers - ask yourself, how do you get that control with the HW, then through what APIs?
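One way that capture/processing overlap might look from Python is sketched below, using a capture thread and a small bounded queue that drops the oldest frame when the consumer falls behind. `capture_loop`, `run_pipeline`, `capture_frame` and `process_frame` are all hypothetical names; `capture_frame` stands in for whatever blocking call delivers a frame (with picamera2 that would presumably be the capture-array call mentioned in the first post). This assumes the capture call releases the GIL while it waits on the camera, which is worth verifying rather than taking for granted:

```python
import queue
import threading
import time

def capture_loop(capture_frame, frames, stop):
    """Capture continuously; if the queue is full, drop the oldest frame."""
    while not stop.is_set():
        frame = capture_frame()  # blocks until the camera delivers a frame
        try:
            frames.put_nowait(frame)
        except queue.Full:
            # Consumer is behind: discard the stale frame, keep the newest.
            try:
                frames.get_nowait()
            except queue.Empty:
                pass  # consumer took it in the meantime
            frames.put_nowait(frame)

def run_pipeline(capture_frame, process_frame, n_results, depth=2):
    """Process n_results frames while capture runs in a background thread."""
    frames = queue.Queue(maxsize=depth)
    stop = threading.Event()
    worker = threading.Thread(
        target=capture_loop, args=(capture_frame, frames, stop), daemon=True
    )
    worker.start()
    results = []
    try:
        while len(results) < n_results:
            results.append(process_frame(frames.get()))
    finally:
        stop.set()
        worker.join()
    return results

if __name__ == "__main__":
    def fake_capture():
        time.sleep(0.09)  # stand-in for the 90 ms capture
        return time.perf_counter()

    def fake_process(frame):
        time.sleep(0.05)  # stand-in for the real processing
        return frame

    print(run_pipeline(fake_capture, fake_process, n_results=5))
```

The `depth` parameter is the knob for the "don't throw away buffers" question: a small value keeps latency low for a HUD at the cost of dropped frames, a larger one trades latency for fewer drops.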
OTOH, OpenCV may be doing multiple passes that could be optimized into a single pass algorithm using a lower level language, gaining big speedups.
So, my recommendation is to dig into what's really happening in your current implementation (even setting up a profiler) as well as imagining what you'd like to be happening in your final implementation.
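As a starting point for the profiling suggestion, here is a minimal sketch using the standard-library `cProfile` and `pstats` modules. `profile_call` is a hypothetical helper name, and the function you pass in would be your own per-frame processing routine:

```python
import cProfile
import io
import pstats

def profile_call(func, *args, top=10, **kwargs):
    """Run func under cProfile; return (result, text report of the slowest calls)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the expensive top-level steps come first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()

if __name__ == "__main__":
    def busy(n):
        return sum(i * i for i in range(n))

    _, report = profile_call(busy, 200_000)
    print(report)
```

Note that cProfile only sees Python-level calls: time spent inside a compiled OpenCV or NumPy routine shows up as one entry for that call, which is usually enough to tell which step of the pipeline dominates the 600 ms.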
Re: Is c++ the only choice for realtime image processing?
Gavinmc42 wrote: ↑Sat Jan 21, 2023 9:55 am
2000 x 1500 is a big image for CV.
About 33 times the usual pixel count (3,000,000 vs 90,000 for 300 x 300).
OpenCV is generic, not sure how optimized it is for Pi hardware.
That is going to require something different in algorithms.
I am not a vision expert, but I try to keep up, and probably only a Jetson or dedicated hardware could do that.
Even then I suspect a Jetson won't be fast enough.
This might help or not?
https://github.com/Idein/py-videocore6
Black and white (binary) images reduce the data a lot; that's why BNNs are faster.
I haven't compiled ARM's Compute or NN library for some time, but the 64-bit OS should give a bit more speed.
Optimized ARM assembly in those libs?
Not sure if C++ is going to be faster than C with assembly.
No idea what is in the Pi5 yet.
I am hoping it might have magic.
The V3 cameras do have PDAF, and that can track movement, sort of.
Image sensors with in-sensor processing are appearing; that is something to watch for.
For IoT, low-power image processing done on the sensor reduces CPU power.
Thanks for your kind response, some interesting things to look up. Yes, a Jetson with Nvidia/CUDA might be a solution, but I guess I have become emotionally attached to the Pi at this stage xD. It would be really cool if I could pull mono images from the camera. I assumed the B&W mode was a post-process filter and wouldn't make a difference to acquisition speed, but I should look at that again.
Re: Is c++ the only choice for realtime image processing?
Daniel Gessel wrote: ↑Sat Jan 21, 2023 11:17 am
It'd take deeper knowledge of OpenCV, your algorithm, and the HW to guess if you could get something like an 8x speedup. I assume you're using Python - the Python part may just be setting up a pipeline and letting it run or handing off pointers and not actually in the way. But if you touch the image data with Python code, you can expect a big speedup rewriting that part in a lower-level, compiled language (C++, C, Rust, asm, or the GPU using Vulkan compute shaders - but definitely multi-threaded if on the CPU).
The speedup you're looking for will likely push the Pi HW to its limits - it may be worth figuring out what you'd like the HW to be doing, irrespective of the implementation language and libraries.
For example, you'd probably like to be capturing to multiple buffers via DMA, and it should be completely parallel to the image processing pipeline (except for the pressure on the memory system, shared by GPU and CPU components). I'm not sure Python is letting you set up an asynchronous pipeline - the image capture time makes me think you're triggering a capture and waiting for it, the processor going idle. Yet you don't want to be capturing asynchronously so fast you throw away buffers - ask yourself, how do you get that control with the HW, then through what APIs?
OTOH, OpenCV may be doing multiple passes that could be optimized into a single pass algorithm using a lower level language, gaining big speedups.
So, my recommendation is to dig into what's really happening in your current implementation (even setting up a profiler) as well as imagining what you'd like to be happening in your final implementation.
Wisdom there. Python is just grabbing the image and passing it to OpenCV/NumPy (in other words, compiled code), but I suspect that the image object popping in and out of those wrappers, rather than a pointer being passed around, is some of the issue.
That's a great idea about keeping the image grabbing in parallel. I could have a parallel process with shared memory or queues or something and avoid the GIL. I just hoped someone would have done it all before, so I wouldn't have to reinvent the wheel.
Appreciated, thanks.
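For the shared-memory idea, the standard library's `multiprocessing.shared_memory` is one possible building block. The sketch below only shows the buffer handling; `FRAME_BYTES`, `create_frame_buffer`, `write_frame` and `read_frame` are hypothetical names, and the frame size assumes one 8-bit mono 2000 x 1500 image. In a real setup `read_frame` would run in the separate processing process, which only needs the block's name, and you would still need a lock or queue to signal when a frame is complete:

```python
from multiprocessing import shared_memory

FRAME_BYTES = 2000 * 1500  # assumption: one 8-bit mono frame

def create_frame_buffer(name=None):
    """Allocate a shared block big enough for one frame (capture side)."""
    return shared_memory.SharedMemory(create=True, size=FRAME_BYTES, name=name)

def write_frame(shm, data):
    """Copy raw frame bytes into the shared block."""
    shm.buf[:len(data)] = data

def read_frame(name, length):
    """Attach to an existing block by name and copy out `length` bytes."""
    shm = shared_memory.SharedMemory(name=name)
    try:
        return bytes(shm.buf[:length])
    finally:
        shm.close()  # detach only; the creator is responsible for unlink()

if __name__ == "__main__":
    block = create_frame_buffer()
    try:
        write_frame(block, b"\x01\x02\x03\x04")
        print(read_frame(block.name, 4))
    finally:
        block.close()
        block.unlink()
```

`read_frame` copies the bytes out for simplicity. To avoid that copy, the documented pattern is to wrap `shm.buf` in a NumPy array (`np.ndarray(shape, dtype, buffer=shm.buf)`) on both sides, keeping the block attached for as long as the array is in use.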
- gulshan212
- Posts: 7
- Joined: Wed Nov 23, 2022 9:09 am
- Location: India
Re: Is c++ the only choice for realtime image processing?
Hello this is Gulshan Negi
Well, there are other options available, such as MATLAB, Python, and Java. But C++ is the most popular choice due to its speed, efficiency, and low-level control over hardware.
Thanks
Re: Is c++ the only choice for realtime image processing?
Daniel Gessel wrote: ↑Sat Jan 21, 2023 11:17 am
The speedup you're looking for will likely push the Pi HW to its limits - it may be worth figuring out what you'd like the HW to be doing, irrespective of the implementation language and libraries.
Depending on the Pi4 model, and what your current speed is, you may be able to get 40% - 50% or so extra speed just by overclocking the Pi4. It's nothing like what you're looking for of course, but it's free - by which I mean you don't have to make any software changes.
Every little helps
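To put numbers on how far overclocking could get, here is a back-of-the-envelope calculation using the figures from this thread (600 ms processing, 80 ms target, 40% - 50% extra speed). It assumes processing time scales linearly with clock speed, which is optimistic for memory-bound image work, and that capture is fully overlapped with processing. `required_speedup` is just an illustrative helper:

```python
PROCESS_MS = 600.0  # processing time from the first post
TARGET_MS = 80.0    # realtime target from the first post

def required_speedup(process_ms, target_ms, clock_gain=1.0):
    """Speedup still needed from software after a clock gain (1.5 = 50% faster)."""
    return (process_ms / clock_gain) / target_ms

for gain in (1.0, 1.4, 1.5):
    remaining = required_speedup(PROCESS_MS, TARGET_MS, gain)
    print(f"clock gain {gain:.1f}x -> still need {remaining:.1f}x from software")
```

That works out to 7.5x with no overclock, about 5.4x with a 40% gain and 5.0x with a 50% gain, so the overclock helps but most of the speedup still has to come from the algorithm or the architecture.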

Re: Is c++ the only choice for realtime image processing?
Absolutely!
My thinking is, when pushing HW to its limits, the SW architecture and how it maps to that HW is a critical guide to implementation, including programming language choice. The Pi has many distinct resources (CPU, VPU, QPU, DMA - it seems even the 2D display engine has ALUs that might contribute), all of which might be cleverly used, but the existing toolchains are varied. What you want each component to deliver will guide the choice of languages: some parts may be in an interpreted language like Python, others in a compiled systems language like C, C++ or that new kid on the block, Rust; yet others may be GLSL shaders - and one should expect a smattering of assembly code on the side.
But giving that HW a boost is a win!
Re: Is c++ the only choice for realtime image processing?
I remember Pete Warden years ago talking about how graphics optimizing can be sped up with some assembly.
Are C++ compilers better than hand-coded assembly now?
Any code with recursive loops would be worth checking.
Are there any optimized ARM assembly libs for vision?
Also going to need some way to time loops etc.
What tools are there for that?
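On the Python side, one simple option for timing loops is `time.perf_counter` from the standard library. The sketch below wraps it in a small context manager; `timed` and the `timings` dictionary are hypothetical names, and the summed squares are only a stand-in workload:

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(label, results):
    """Append the elapsed time of the with-block, in milliseconds, to results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        results.setdefault(label, []).append(elapsed_ms)

timings = {}
for _ in range(3):
    with timed("threshold", timings):
        sum(i * i for i in range(100_000))  # stand-in for one processing step

for label, samples in timings.items():
    print(f"{label}: best {min(samples):.2f} ms over {len(samples)} runs")
```

Wrapping each stage of the pipeline (capture, threshold, contour search and so on) in its own `timed` block gives a per-stage breakdown. For compiled code, the Linux `perf` tool is the usual next step.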
Re: Is c++ the only choice for realtime image processing?
Gavinmc42 wrote: I remember Pete Warden years ago talking about how graphics optimizing can be sped up with some assembly.
Are C++ compilers better than hand coded Assembly now?
Not really sure - been focused on GPU performance. But there's usually something - SIMD instructions that the compiler doesn't leverage completely or some corner cases you can tune for.
Re: Is c++ the only choice for realtime image processing?
Just remembered Pete Warden does TinyML now, even wrote a book on it.
If he can do vision on little 32-bit CPUs, maybe the same methods will be faster on bigger Pis?
https://www.hackster.io/news/pete-warde ... 46e56945a4