Real-Time Face & Eye Detection with OpenCV

Build a real-time computer vision system that uses Haar Cascade classifiers to detect faces and eyes in live video, fast enough to run on ordinary CPU hardware.

Project Overview

This project implements real-time face and eye detection using OpenCV's pre-trained Haar Cascade classifiers. The system processes video frames from a webcam, detects facial features in milliseconds, and draws bounding boxes around detected faces and eyes. The underlying approach, proposed by Viola and Jones in 2001, was one of the first to make real-time face detection practical on commodity hardware.

Face Detection: Detect frontal faces with high accuracy
Eye Detection: Locate eyes within detected face regions
Real-Time Processing: Process a live webcam feed at 30+ FPS on typical hardware
Optimized Speed: Cascade architecture for fast detection

What are Haar Cascade Classifiers?

Haar Cascade classifiers are machine learning-based object detection methods that use a cascade of boosted classifiers trained on thousands of positive and negative images. The technique was introduced in the seminal 2001 paper "Rapid Object Detection using a Boosted Cascade of Simple Features" by Paul Viola and Michael Jones.

The Cascade Approach

The algorithm employs a "cascade" of increasingly complex classifiers. Each stage quickly rejects non-face regions with minimal computation, allowing the detector to focus computational resources only on promising areas. This multi-stage filtering achieves real-time performance without sacrificing accuracy.

Input Image → Stage 1 (simple) → Stage 2 → Stage 3 → … → Stage N (complex) → Face Detected
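
The control flow can be sketched in a few lines of Python. This is a toy illustration of the early-rejection idea only; the stage functions and thresholds below are made up and are not OpenCV's actual boosted classifiers.

import numpy as np

def cascade_detect(window, stages):
    """Toy cascade: reject a window as soon as any stage's score is too low."""
    for score_fn, threshold in stages:
        if score_fn(window) < threshold:
            return False  # rejected early with minimal computation
    return True           # survived every stage: candidate face region

# Illustrative stages, ordered from cheapest to most specific
stages = [
    (lambda w: w.mean(), 40.0),                    # enough overall brightness
    (lambda w: w.std(), 10.0),                     # enough contrast
    (lambda w: w[:2].mean() - w[2:].mean(), 5.0),  # edge-like structure
]

window = np.random.randint(0, 256, (4, 4)).astype(np.float32)
print(cascade_detect(window, stages))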

How Haar Features Work

Haar-like features are rectangular patterns that capture intensity differences between adjacent regions. The detector computes the difference between the sum of pixel intensities in white areas versus black areas. These simple features, when combined in a cascade, can detect complex patterns like faces.

Feature Type            | Pattern                    | What It Detects
Edge Features           | Two adjacent rectangles    | Edges, like the boundary between forehead and hair
Line Features           | Three rectangles in a row  | Lines, like the nose bridge or eyebrows
Four-Rectangle Features | Diagonal 2×2 arrangement   | Diagonal features, like eye corners
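
As a minimal illustration (a synthetic 4×4 patch, not OpenCV's internal implementation), a two-rectangle edge feature is simply the difference of two regional sums:

import numpy as np

# Synthetic 4x4 grayscale patch: bright top half, dark bottom half
patch = np.array([
    [200, 200, 200, 200],
    [200, 200, 200, 200],
    [ 40,  40,  40,  40],
    [ 40,  40,  40,  40],
], dtype=np.float32)

# Two-rectangle (edge) Haar feature: sum over the "white" rectangle
# minus sum over the "black" rectangle
white = patch[:2, :].sum()   # top rectangle
black = patch[2:, :].sum()   # bottom rectangle
print(white - black)         # large positive value -> strong horizontal edge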

Machine Learning Training

OpenCV's pre-trained Haar Cascades were trained on thousands of face images (positive samples) and non-face images (negative samples) using the AdaBoost algorithm. This training process selected the most discriminative features and optimal threshold values, creating classifiers that generalize well to new images.
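
The bundled XML models live in the directory exposed as cv2.data.haarcascades, so you can list what is available in your installation:

import os
import cv2

# Print the pre-trained cascade files that ship with opencv-python
print(sorted(os.listdir(cv2.data.haarcascades)))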

Implementation Steps

  1. Load Pre-trained Cascade Classifiers: Import OpenCV and load the XML files containing trained models for face and eye detection.
  2. Initialize Video Capture: Open the webcam stream and verify successful connection to the camera device.
  3. Convert to Grayscale: Transform each frame to grayscale, since Haar detection requires single-channel images.
  4. Detect Faces: Apply the face cascade classifier with tuned parameters to locate all faces in the frame.
  5. Detect Eyes within Face Regions: For each detected face, search for eyes only within that face's bounding box (region of interest).
  6. Draw Bounding Boxes: Overlay rectangles on the original color frame to visualize detections.
  7. Display Results: Show the annotated video feed in real time and handle user input for exit.

Core Implementation Code

1. Import Libraries and Load Cascades

import cv2
import numpy as np

# Load pre-trained Haar Cascade classifiers
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + 'haarcascade_frontalface_default.xml'
)
eye_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + 'haarcascade_eye.xml'
)

# Verify classifiers loaded successfully
if face_cascade.empty() or eye_cascade.empty():
    print("Error: Could not load cascade classifiers")
    exit()

2. Initialize Video Capture

# Open webcam (0 = default camera)
cap = cv2.VideoCapture(0)

# Check if camera opened successfully
if not cap.isOpened():
    print("Error: Could not access camera")
    exit()

print("Camera initialized. Press 'q' to quit.")

3. Main Detection Loop

while True:
    # Capture frame-by-frame
    ret, frame = cap.read()
    if not ret:
        print("Failed to grab frame")
        break

    # Convert to grayscale (required for Haar detection)
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Detect faces in the image
    faces = face_cascade.detectMultiScale(
        gray,
        scaleFactor=1.1,        # Image pyramid scale reduction
        minNeighbors=5,          # Min neighbors for valid detection
        minSize=(30, 30),       # Minimum face size in pixels
        flags=cv2.CASCADE_SCALE_IMAGE
    )

    # Process each detected face
    for (x, y, w, h) in faces:
        # Draw blue rectangle around face
        cv2.rectangle(frame, (x, y), (x+w, y+h), (255, 0, 0), 2)

        # Define region of interest (ROI) for eye detection
        roi_gray = gray[y:y+h, x:x+w]
        roi_color = frame[y:y+h, x:x+w]

        # Detect eyes within the face region
        eyes = eye_cascade.detectMultiScale(
            roi_gray,
            scaleFactor=1.1,
            minNeighbors=10,
            minSize=(20, 20)
        )

        # Draw green rectangles around eyes
        for (ex, ey, ew, eh) in eyes:
            cv2.rectangle(roi_color, (ex, ey), (ex+ew, ey+eh), (0, 255, 0), 2)

    # Display the annotated frame
    cv2.imshow('Face and Eye Detection', frame)

    # Exit on 'q' key press
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release resources
cap.release()
cv2.destroyAllWindows()

Detection Parameters Explained

Understanding the detectMultiScale() parameters is crucial for tuning detection performance:

Parameter    | Typical Value | Effect on Detection
scaleFactor  | 1.1 – 1.3     | Controls how much the image is shrunk at each pyramid level. Lower values (e.g. 1.05) are more thorough but slower; higher values (e.g. 1.3) are faster but may miss faces.
minNeighbors | 3 – 6         | Minimum number of overlapping detections required to keep a result. Higher values reduce false positives but may miss valid faces.
minSize      | (30, 30)      | Minimum object size in pixels; faces or eyes smaller than this are ignored.
maxSize      | Optional      | Maximum object size; useful for filtering out implausibly large false detections.
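
A quick way to feel these trade-offs is to run one captured frame through several parameter combinations and compare the detection counts. This is an illustrative experiment, separate from the main loop:

import cv2

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + 'haarcascade_frontalface_default.xml'
)

# Grab a single frame from the default camera for the comparison
cap = cv2.VideoCapture(0)
ret, frame = cap.read()
cap.release()
if not ret:
    raise SystemExit("Could not capture a frame")

gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
for scale in (1.05, 1.1, 1.3):
    for neighbors in (3, 5, 8):
        faces = face_cascade.detectMultiScale(
            gray, scaleFactor=scale, minNeighbors=neighbors, minSize=(30, 30)
        )
        print(f"scaleFactor={scale}, minNeighbors={neighbors}: {len(faces)} face(s)")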

Performance Optimization Tips
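
A few general ways to speed up detection: downscale each frame before calling detectMultiScale() and map the boxes back to full resolution, raise scaleFactor slightly, run face detection only every few frames and reuse the previous boxes in between, and search for eyes only in the upper half of each face ROI. The helper below sketches the downscaling idea; the 0.5 scale factor is an illustrative choice, not a tuned value.

import cv2

def detect_faces_downscaled(gray, face_cascade, scale=0.5):
    """Run Haar detection on a downscaled copy and map boxes back.

    Detection cost grows with image area, so a 0.5 scale roughly quarters
    the work, at the cost of missing very small faces.
    """
    small = cv2.resize(gray, None, fx=scale, fy=scale)
    faces_small = face_cascade.detectMultiScale(
        small, scaleFactor=1.1, minNeighbors=5, minSize=(15, 15)
    )
    # Rescale each (x, y, w, h) box back to the original resolution
    return [tuple(int(v / scale) for v in box) for box in faces_small]

# Inside the main loop, replace the direct detectMultiScale call with:
# faces = detect_faces_downscaled(gray, face_cascade)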

Google Colab Implementation

Running this code in Google Colab requires special handling for webcam access since Colab runs in a browser. The notebook uses JavaScript to capture frames from the webcam and transfers them to Python for processing.

Colab Webcam Capture Setup

from IPython.display import display, Javascript, Image
from google.colab.output import eval_js
from base64 import b64decode
import cv2
import numpy as np

def take_photo(filename='photo.jpg', quality=0.8):
    """Capture a frame from webcam in Colab"""
    js = Javascript('''
    async function takePhoto(quality) {
      const div = document.createElement('div');
      const video = document.createElement('video');
      video.style.display = 'block';
      const stream = await navigator.mediaDevices.getUserMedia({video: true});

      document.body.appendChild(div);
      div.appendChild(video);
      video.srcObject = stream;
      await video.play();

      // Expand the output iframe so the live video preview is fully visible
      google.colab.output.setIframeHeight(document.documentElement.scrollHeight, true);

      const canvas = document.createElement('canvas');
      canvas.width = video.videoWidth;
      canvas.height = video.videoHeight;
      canvas.getContext('2d').drawImage(video, 0, 0);
      stream.getVideoTracks()[0].stop();
      div.remove();
      return canvas.toDataURL('image/jpeg', quality);
    }
    ''')
    display(js)
    data = eval_js('takePhoto({})'.format(quality))
    binary = b64decode(data.split(',')[1])

    with open(filename, 'wb') as f:
        f.write(binary)

    return filename

# Capture and process frame
img_path = take_photo()
frame = cv2.imread(img_path)
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

# Apply face detection (same as before)
faces = face_cascade.detectMultiScale(gray, 1.1, 5)
# ... rest of detection code ...
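
Colab does not support cv2.imshow(), so the annotated result can be displayed inline with the patched helper from google.colab.patches (assuming faces from the detection call above):

from google.colab.patches import cv2_imshow

# Draw detections and display the frame inline in the notebook
for (x, y, w, h) in faces:
    cv2.rectangle(frame, (x, y), (x + w, y + h), (255, 0, 0), 2)
cv2_imshow(frame)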

Advanced Enhancements

Once you have the basic detection working, consider these improvements:

  1. Add smile detection with OpenCV's haarcascade_smile.xml, applied within each face ROI.
  2. Detect side-view faces with haarcascade_profileface.xml.
  3. Overlay an FPS counter to measure real-time performance.
  4. Save frames or cropped face regions to disk for later analysis.
  5. Compare results against a deep-learning face detector (e.g. OpenCV's DNN module) on tilted or partially occluded faces.

Why Haar Cascades Still Matter

While deep learning models like YOLO and R-CNN achieve higher accuracy, Haar Cascades remain relevant because they are extremely fast, run in real time on modest CPUs and embedded devices without a GPU, ship pre-trained with OpenCV, and require only a few lines of code to use.

Try the Full Implementation

Run the complete face detection system in your browser with Google Colab, or clone the repository to run locally with your webcam.