
Why Jupyter Notebooks Are My Favorite OpenCV Playground

May 22, 2024

By Ben Jacques | Director of Engineering

Introduction

OpenCV (https://opencv.org/) is my go-to framework for image processing these days. I spent many years working in a proprietary framework, which was great. However, for most of the projects I work on now, it doesn’t make sense to pay high licensing fees when there is a good open-source alternative. OpenCV is versatile. It’s robust. It’s well-documented. It’s also sometimes pretty hard to work with when you are experimenting, mainly because there really isn’t a good GUI-based development environment for it. For computer vision applications, being able to visualize what is happening while you develop an algorithm is invaluable.

This is where Jupyter Notebooks (https://jupyter.org/) have helped bridge that gap for me. They provide a code-first environment while still giving me the visual tools I need to make my life easier. They also let me embed better documentation directly in Markdown, such as images, graphs, or other visuals that can be really helpful for understanding a computer vision problem. For example, when I was using the CIELAB color space, I included a graphic of how it relates to the RGB color space for clarity.

What is OpenCV?

For context, OpenCV (Open Source Computer Vision Library) is an open-source computer vision and machine learning software library. It contains more than 2,500 optimized algorithms for a wide range of computer vision tasks, including object detection, facial recognition, image processing, and more. It is officially supported in C++ and Python, but has community-supported wrappers for many other common languages.

Jupyter Specifics

There are a lot of ways to work with Jupyter Notebooks. You can run them in a web browser, through JupyterLab, in an environment like AWS SageMaker, and others. Though the approaches I’ll detail below work in all the places I’ve tried, my favorite solution is to run Jupyter in VS Code (https://code.visualstudio.com/docs/datascience/jupyter-notebooks). This allows me to stay right within my familiar IDE as well as leverage tools like GitHub Copilot, linting, and source control.

Python Environment

Similar to Jupyter, there are a lot of ways to set up your Python environment, with endless blog posts, tools, and approaches. Feel free to explore those and do what works for you. I personally have been using Poetry (https://python-poetry.org/) for setting up my environment and handling dependency management. This is a great starting point. To be honest though, if I’m just doing some skunk-works playground testing, I sometimes cheat and pip install things directly in Jupyter cells using shell commands (https://coderefinery.github.io/jupyter/extra-features/#shell-commands) to save some time. For instance, if I want to pull in `opencv-python` in a fresh kernel, I can just use:
```
!pip install opencv-python
import cv2
```
I wouldn’t recommend this in every case, but it is an easy way to save some time and complexity. The ability to run shell commands directly from Jupyter is pretty awesome and worth checking out.

Interactive Widgets

Once you are up and developing, the challenges really begin. A few years back I stumbled upon interactive widgets (https://ipywidgets.readthedocs.io/en/latest/examples/Using%20Interact.html). This was a game changer for me, and it helped solidify Jupyter as my favorite environment for OpenCV. Let’s start with a common computer vision use case: you have a single-channel image and you want to use a static threshold to find bright things. You have some options:

  • You could manually type in some thresholds in your cell and rerun the code over and over again until you get it right.
  • You could use cv2.imshow() with some mouse cursor feedback (that you have to write) to see what values exist and make an educated guess.
  • You could write some code to generate a histogram of the values and make an educated guess. 
  • Or…you could really quickly try all the values until it visually looks good.

The code to make an interactive slider that updates in real time for thresholding an image is shown below. There are a couple of helper functions I wrote for making reading and displaying images cleaner. The rest is just OpenCV functions and widgets.

The resulting output cell is shown below. The @interact decorator creates sliders for the lower and upper thresholds with a lower bound of 0, an upper bound of 255, and a step size of 1. Setting the “file” parameter to a list of images in the given directory creates a dropdown that lets you select which image you want to test without changing any of the code.

To use the widgets, you simply install ipywidgets with `pip install ipywidgets` and import it in your notebook with `from ipywidgets import interact`. From there, you can add the @interact decorator to any function to automatically turn your cell output into something interactive. There are many ways to customize the interactions and layouts, but the out-of-the-box elements have covered most of what I need.

One of the challenges with OpenCV is that many of the functions have non-intuitive parameters that take a lot of reading to understand. Sometimes it is a lot easier to just give it a go and see if it accomplishes what you want before going back and understanding it at a deeper level. I have found the boolean selectors and number sliders to be incredibly useful in these cases. Here is a simple example of this using the OpenCV adaptiveThreshold (https://docs.opencv.org/4.x/d7/d1b/group__imgproc__misc.html#ga72b913f352e4a1b1b397736707afcde3) function:

This gives a really simple way to visually see how parameters like “block_size” or “c” impact the output of the operation. You can also get more complex and add dropdowns for the enums as well, but I tend to find those easier to just change in code unless this is something I’m going to come back and use regularly.

One of the great things about the flexibility of the interactive elements is that you can add whatever you want to the output. This could include things like images, text, or plots. The combination of interactive images with a matplotlib output has proven to be useful on quite a number of occasions. For instance, you could apply a threshold technique, find contours with specific parameters, and then plot a histogram of the areas within those contours. That could be useful if you were looking for pepperonis on a pizza that weren’t complete circles. Quickly seeing the outliers in a histogram might give you a good idea of where you would want to set an area limit for a suspect_pepperoni_finder algorithm. 

Other Development Bonuses

Python Tooling

Another nice thing about being in a Jupyter Notebook already is that the Python ecosystem gives you access to a ton of other tools to play with. For instance, I was working on a problem last week where I wanted to test out Amazon’s Rekognition interface on some images I was already working with in my Jupyter playground. With a few lines of code, I was able to import the boto3 client (see where I cheated with the shell command?):
```
!pip install boto3
import boto3
```
Now, with a couple of helper classes that I borrowed directly from AWS examples, I had a working image classification test that I could try without leaving the comfort of the code base I was already in.

GitHub Support

One other bonus is that GitHub supports rendering of .ipynb files directly in its web interface. This is really handy if you want to see the outputs of cells or look at what the output was at a specific commit. This was a major upgrade from the previous way to view the files on GitHub, which was just a raw mess of JSON. Below is a screenshot of my OpenCV test repo notebook directly on GitHub’s website.

Machine Learning Tooling

One final bonus of working with Python in Jupyter Notebooks for computer vision is that it makes it really easy to integrate with other machine learning frameworks and tools. Python is still the king in the ML world, so being able to do your image pre-processing right alongside any model development you are working on just makes my life easier. It is rare that I’ve worked on a machine learning model in the vision space where I didn’t need to do some level of pre- or post-processing to get images in the format I need or reduce them down to a more manageable size. I’ve also been able to use OpenCV to assist in some arduous image labeling tasks, such as extracting and segmenting foreground objects before asking someone to label them.

Conclusion

Given all the benefits and the lack of decent alternatives, I just keep coming back to Jupyter Notebooks for my OpenCV development needs. The Python ecosystem, despite its shortcomings, is incredibly powerful and quick. The integration with my existing tooling and the rest of the ML ecosystem makes things easier for me, and it also lowers the learning curve for new developers breaking into computer vision. There are some days I miss my purpose-built IDE for computer vision development, but I’ve found ways to be nearly as productive without the steep fees and painful learning curve. If you are interested in learning more, don’t hesitate to reach out to me by email or on LinkedIn.