Computer Vision 101

Resources

Tools

Image Manipulation: Pillow (Python), OpenCV.

IBM Course

https://learning.edx.org/course/course-v1:IBM+CV0101EN+1T2021/home

Image Manipulation

Filters Algorithm

Gray filter

Why Converting to Grayscale is Better for Edge Detection
- Simpler computation: Grayscale has a single channel (intensity) instead of three (RGB), so per-pixel operations touch roughly one third of the data.
- Focus on intensity changes: Edges are primarily defined by intensity changes, not color transitions. Grayscale preserves these structural boundaries.
- Reduced noise sensitivity: Color channels can have different noise profiles. Grayscale combines information from all channels, effectively averaging out some noise.
- Algorithm optimization: Many edge detection algorithms, including Canny, were specifically designed for grayscale images.
- Memory efficiency: Grayscale requires only 1/3 the memory of RGB, important for real-time processing in systems with limited resources.
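
A minimal NumPy sketch of the conversion, to make the single-channel and memory points concrete. It assumes a BGR uint8 frame (OpenCV's default channel order) and the standard luminance weights 0.299 R + 0.587 G + 0.114 B; the helper name to_grayscale is hypothetical, and in practice cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY) does the same job.

```python
import numpy as np


def to_grayscale(bgr_image):
    """Collapse a BGR uint8 image of shape (H, W, 3) into one intensity channel."""
    blue = bgr_image[..., 0].astype(np.float32)
    green = bgr_image[..., 1].astype(np.float32)
    red = bgr_image[..., 2].astype(np.float32)
    # Weighted sum: green contributes most because the eye is most sensitive to it
    gray = 0.114 * blue + 0.587 * green + 0.299 * red
    return np.clip(np.round(gray), 0, 255).astype(np.uint8)


# Tiny synthetic frame: left half black, right half white
frame = np.zeros((4, 4, 3), dtype=np.uint8)
frame[:, 2:] = (255, 255, 255)

gray = to_grayscale(frame)
print(gray.shape)                 # one channel instead of three
print(frame.nbytes, gray.nbytes)  # the grayscale copy is one third the size
```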

Hough Lines

Detects straight lines in an image and is robust to noise.

Watch: https://youtu.be/XRBc_xkZREg?si=8gwaTZNMXrR7qD4b

# 3. Hough Line Transform: Detect lines in the edge map
#    - edges: Input edge image.
#    - 1: Rho resolution (distance accuracy in pixels). Distance from origin (0,0).
#    - np.pi/180: Theta resolution (angle accuracy in radians). 1 degree steps. Angle of the normal vector.
#    - 100: Accumulator threshold. Min number of edge points needed to form a line. CRITICAL tuning parameter.
lines = cv2.HoughLines(edges, 1, np.pi/180, 100)

# 4. Line Interpretation and Drawing
if lines is not None:
    for rho, theta in lines[:1, 0]:  # Process only the first detected line [[rho, theta]]

        # rho: Perpendicular distance from origin (0,0) to the line.
        # theta: Angle of the normal vector from origin to the line (in radians).

        # Calculate cosine and sine of the angle
        a = np.cos(theta)
        b = np.sin(theta)

        # Find a point (x0, y0) on the line (closest point to origin)
        # Based on the line equation: rho = x*cos(theta) + y*sin(theta)
        x0 = a * rho
        y0 = b * rho

        # Calculate two endpoints (x1, y1) and (x2, y2) far outside the image
        # to draw a line segment spanning the entire image width/height.
        # We move along the direction *parallel* to the line from (x0, y0).
        # The vector (-b, a) is perpendicular to (a, b), hence parallel to the line.
        # 1000 is an arbitrary length; use a larger value for images wider or taller than ~1000 px.
        x1 = int(x0 + 1000 * (-b))
        y1 = int(y0 + 1000 * (a))
        x2 = int(x0 - 1000 * (-b))
        y2 = int(y0 - 1000 * (a))

        # Draw the detected line (red color, thickness 2) onto the copy image
        cv2.line(lines_image, (x1, y1), (x2, y2), (0, 0, 255), 2)
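
To see what the accumulator threshold actually counts, here is a NumPy-only sketch of the voting step. It is an illustration of the idea, not OpenCV's implementation: the function name hough_accumulator and the synthetic edge map are made up for this example, and the resolutions are fixed at 1 pixel for rho and 1 degree for theta to mirror the call above.

```python
import numpy as np


def hough_accumulator(edges):
    """Let every non-zero pixel vote for all (rho, theta) lines passing through it."""
    height, width = edges.shape
    diag = int(np.ceil(np.hypot(height, width)))
    thetas = np.deg2rad(np.arange(180))   # 0..179 degrees, 1 degree steps
    rhos = np.arange(-diag, diag + 1)     # rho can be negative, 1 pixel steps
    accumulator = np.zeros((len(rhos), len(thetas)), dtype=np.int32)

    cos_t, sin_t = np.cos(thetas), np.sin(thetas)
    theta_idx = np.arange(len(thetas))
    ys, xs = np.nonzero(edges)
    for x, y in zip(xs, ys):
        # rho = x*cos(theta) + y*sin(theta), one value per candidate angle
        rho_values = x * cos_t + y * sin_t
        rho_idx = np.round(rho_values).astype(int) + diag
        accumulator[rho_idx, theta_idx] += 1
    return accumulator, rhos, thetas


# Synthetic edge map: a single vertical line at x = 20
edges_demo = np.zeros((50, 50), dtype=np.uint8)
edges_demo[:, 20] = 1

acc, rhos, thetas = hough_accumulator(edges_demo)
peak_rho_idx, peak_theta_idx = np.unravel_index(np.argmax(acc), acc.shape)
peak_rho = int(rhos[peak_rho_idx])
peak_theta = float(thetas[peak_theta_idx])
peak_votes = int(acc[peak_rho_idx, peak_theta_idx])
print(peak_rho, peak_theta, peak_votes)
```

The peak cell collects one vote per pixel of the line, which is why a threshold of 100 in cv2.HoughLines only keeps lines supported by at least 100 edge points.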

Binary Filter

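# Copy the frame so contours can be drawn later without altering the original
contour_img = np.copy(frame)
# Pixels brighter than 100 become 255 (white); all others become 0 (black)
_, binary = cv2.threshold(gray, 100, 255, cv2.THRESH_BINARY)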
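
For intuition, the THRESH_BINARY rule is "value > threshold becomes max value, everything else becomes 0". A small NumPy sketch of that rule, with a hypothetical helper name and made-up sample values:

```python
import numpy as np


def binary_threshold(gray_image, thresh=100, max_value=255):
    """NumPy equivalent of the THRESH_BINARY rule: strictly above thresh -> max_value."""
    return np.where(gray_image > thresh, max_value, 0).astype(np.uint8)


sample = np.array([[10, 100, 101],
                   [200, 99, 255]], dtype=np.uint8)
binary_sample = binary_threshold(sample)
print(binary_sample)
```

Note that a pixel exactly equal to the threshold (100 here) maps to 0, since the comparison is strict.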

Contour Filter

contours, _ = cv2.findContours(binary, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
for contour in contours:
    if cv2.contourArea(contour) > 50:  # Min area filter
        cv2.drawContours(contour_img, [contour], 0, (0, 255, 0), 2)

Original Image:          Binary Image:           Detected Contour:
                  
   ⬜⬜⬜⬜⬜             ⬜⬜⬜⬜⬜                ⬜⬜⬜⬜⬜
   ⬜🟦🟦🟧⬜             ⬜⬛⬛⬛⬜                ⬜🟢🟢🟢⬜
   ⬜🟦🟪🟪⬜    →        ⬜⬛⬛⬛⬜       →        ⬜🟢⬛🟢⬜
   ⬜🟪🟧🟧⬜             ⬜⬛⬛⬛⬜                ⬜🟢🟢🟢⬜
   ⬜⬜⬜⬜⬜             ⬜⬜⬜⬜⬜                ⬜⬜⬜⬜⬜

Retrieval Modes (cv2.RETR_*)

These determine which contours to retrieve:

cv2.RETR_EXTERNAL:
- Only retrieves the outermost contours
- Ignores holes inside objects
- Like drawing only around the outer edge of a donut, ignoring the hole
cv2.RETR_LIST (used in the snippet above):
- Retrieves all contours without establishing any hierarchy
- Like drawing around the outer edge of a donut AND its hole, but not relating them
cv2.RETR_TREE:
- Retrieves all contours and organizes them in a hierarchy (parent-child relationships)
- Like drawing around the donut's outer edge and identifying that the hole is "inside" or a "child" of the outer contour
cv2.RETR_CCOMP:
- Retrieves all contours and organizes them in a two-level hierarchy
- All external contours are top level, all contours of holes are second level

Approximation Methods (cv2.CHAIN_APPROX_*)

These determine how the contours are represented:

cv2.CHAIN_APPROX_SIMPLE (used in the snippet above):
- Compresses horizontal, vertical, and diagonal segments, keeping only their end points
- For a square, it would store just 4 points (the corners) instead of all points along the perimeter
- Memory efficient, good for most purposes
cv2.CHAIN_APPROX_NONE:
- Stores absolutely all contour points
- For a straight line, it would store all the pixels of the line
- Uses more memory but preserves complete information
cv2.CHAIN_APPROX_TC89_L1 and cv2.CHAIN_APPROX_TC89_KCOS:
- Apply one of the flavors of the Teh-Chin chain approximation algorithm
- Keep only the dominant points of a contour, which can shrink curved outlines further

AI Algorithms

KNN (K-Nearest Neighbors)

KNN is a simple algorithm that classifies data points based on the classes of their nearest neighbors. It is a non-parametric method, meaning it does not make any assumptions about the underlying data distribution. KNN is often used for classification tasks, but it can also be used for regression.
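
A minimal NumPy sketch of KNN classification, assuming Euclidean distance and a simple majority vote among the k nearest neighbors. The function name knn_predict and the toy data are invented for illustration; a library such as scikit-learn would normally be used instead.

```python
from collections import Counter

import numpy as np


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point
    distances = np.linalg.norm(train_points - query, axis=1)
    # Indices of the k smallest distances
    nearest = np.argsort(distances)[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Toy dataset: two well-separated clusters
train_points = np.array([[1, 1], [1, 2], [2, 1],
                         [8, 8], [8, 9], [9, 8]], dtype=float)
train_labels = ["a", "a", "a", "b", "b", "b"]

label_near_a = knn_predict(train_points, train_labels, np.array([2.0, 2.0]))
label_near_b = knn_predict(train_points, train_labels, np.array([7.0, 7.0]))
print(label_near_a, label_near_b)
```

An odd k avoids ties in two-class problems, which is one common reason to pick k = 3 or 5 as a starting point.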

The algorithm works as follows:

Choose the number of neighbors (k).
For the point to classify, calculate the distance to every point in the training set.
Sort the distances and select the k nearest neighbors.