
Face Classification using Haar-like Feature Descriptor
Haar-like feature descriptors are simple rectangular features used in digital image processing and object recognition. They play a crucial role in identifying specific patterns or objects within images: by analyzing the differences in summed pixel intensities between adjacent rectangular regions in a detection window, they enable the classification of image subsections.
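For intuition, here is a minimal hand-rolled sketch in plain NumPy (not the scikit-image API used later) of one such feature: the summed intensity of one rectangle minus that of the adjacent one. The 8x8 window and the vertical split are arbitrary choices for illustration.

```python
import numpy as np

# Toy detection window; a real window would contain image pixels
window = np.random.default_rng(0).random((8, 8))

# Two adjacent rectangles split along the vertical midline
left, right = window[:, :4], window[:, 4:]

# A 'type-2-x'-style response: difference of the two rectangle sums
feature_value = right.sum() - left.sum()
print(feature_value)
```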
In this tutorial, we demonstrate the process of extracting, selecting, and classifying Haar-like features to differentiate between faces and non-faces.
Please note that this example relies on scikit-learn for feature selection and classification.
The procedure for extracting Haar-like features from an image is relatively straightforward. First, a region of interest (ROI) is defined. Second, the integral image within this ROI is computed. Finally, the integral image is employed to extract the features.
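As a rough sketch of these three steps using scikit-image itself (the 5x5 toy array is purely illustrative), one might write:

```python
import numpy as np
from skimage.transform import integral_image
from skimage.feature import haar_like_feature

toy_image = np.arange(25, dtype=np.float64).reshape(5, 5)

# Step 1: define the ROI; here it is the whole image, with its
# top-left corner at (0, 0) and a width and height of 5 pixels.
# Step 2: compute the integral image of the ROI.
ii = integral_image(toy_image)

# Step 3: extract the features from the integral image.
features = haar_like_feature(ii, 0, 0, 5, 5, feature_type='type-2-x')
print(features.shape)  # one value per possible 'type-2-x' rectangle pair
```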
We use a subset of the CBCL dataset, comprising 100 face images and 100 non-face images. Each image has been resized to an ROI of 19 by 19 pixels. We select 75 images from each group to train a classifier and identify the most important features. The remaining 25 images from each class are used to evaluate the classifier's performance.
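Before computing any features, the bundled subset can be loaded and inspected; per the labeling used in the example below, the face images come first, followed by the non-face images. This quick check is a sketch of our own, not part of the original example, and the exact array shape depends on the installed scikit-image version.

```python
from skimage.data import lfw_subset

# Stack of face and non-face images as a single ndarray
images = lfw_subset()
print(images.shape)  # (n_images, height, width)
print(images.dtype)
```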
To enhance computational efficiency without compromising accuracy, we can train a random forest classifier to identify the most important features, particularly for face classification. The concept is to determine which features are frequently used by the ensemble of trees. By using only the most prominent features in subsequent steps, we can significantly accelerate the computation while preserving accuracy.
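To make the ranking step concrete, here is a minimal, self-contained sketch that uses synthetic data from `sklearn.datasets.make_classification` in place of the Haar-like feature matrix; the synthetic data and parameter values are purely illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the (n_images, n_features) feature matrix
X, y = make_classification(n_samples=200, n_features=50,
                           n_informative=5, random_state=0)

rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Feature indices, the ones used most often by the trees first
ranked = np.argsort(rf.feature_importances_)[::-1]
print(ranked[:5])  # the five most important synthetic features
```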
Example
The following example demonstrates how to use Haar-like features and a random forest classifier for face classification.
```python
from time import time

import numpy as np
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

from skimage.data import lfw_subset
from skimage.transform import integral_image
from skimage.feature import haar_like_feature
from skimage.feature import haar_like_feature_coord
from skimage.feature import draw_haar_like_feature


def extract_haar_feature_image(img, feature_type, feature_coord=None):
    """Calculate the Haar-like features for the given image."""
    integral_img = integral_image(img)
    return haar_like_feature(integral_img, 0, 0,
                             integral_img.shape[0],
                             integral_img.shape[1],
                             feature_type=feature_type,
                             feature_coord=feature_coord)


sample_images = lfw_subset()

# For efficiency, extract only two types of features
selected_feature_types = ['type-2-x', 'type-2-y']

# Calculate the features
start_time = time()
X_features = [extract_haar_feature_image(img, selected_feature_types)
              for img in sample_images]
X_features = np.stack(X_features)
elapsed_time_feature_computation = time() - start_time

# Label the images (100 faces followed by 100 non-faces)
labels = np.array([1] * 100 + [0] * 100)
X_train, X_test, y_train, y_test = train_test_split(
    X_features, labels, train_size=150, random_state=0, stratify=labels)

# Extract all potential feature coordinates and types
feature_coordinates, feature_types = haar_like_feature_coord(
    width=sample_images.shape[2],
    height=sample_images.shape[1],
    feature_type=selected_feature_types)

# Train a random forest classifier and evaluate its performance
classifier = RandomForestClassifier(n_estimators=1000, max_depth=None,
                                    max_features=100, n_jobs=-1,
                                    random_state=0)
start_time = time()
classifier.fit(X_train, y_train)
elapsed_time_training = time() - start_time
auc_score_full_features = roc_auc_score(
    y_test, classifier.predict_proba(X_test)[:, 1])

# Sort the features by importance and display the top six
sorted_feature_indices = np.argsort(classifier.feature_importances_)[::-1]

fig, axes = plt.subplots(3, 2, figsize=(8, 8))
for idx, ax in enumerate(axes.ravel()):
    image = sample_images[0]
    image = draw_haar_like_feature(
        image, 0, 0,
        sample_images.shape[2],
        sample_images.shape[1],
        [feature_coordinates[sorted_feature_indices[idx]]])
    ax.imshow(image)
    ax.set_xticks([])
    ax.set_yticks([])

_ = fig.suptitle('The Most Significant Features')
```
Output

[Output figure: "The Most Significant Features", a 3x2 grid showing the six most important Haar-like features drawn over a sample face image.]
The most significant features are selected by examining the cumulative sum of the feature importances. In this example, we retain the features that account for 70% of the cumulative importance which, as the output below shows, corresponds to using less than 1% of the total number of features.
Example
Here is an example that demonstrates the process of feature selection based on the cumulative sum of the importance of features.
```python
# Calculate the cumulative sum of the feature importances and normalize
cdf_feature_importances = np.cumsum(
    classifier.feature_importances_[sorted_feature_indices])
# Normalize by dividing by the maximum (final) value
cdf_feature_importances /= cdf_feature_importances[-1]

# Determine the number of significant features accounting for 70% of
# the branch points in the random forest
significant_feature_count = np.count_nonzero(cdf_feature_importances < 0.7)
significant_feature_percent = round(
    significant_feature_count / len(cdf_feature_importances) * 100, 1)
print('Significant_feature_count:')
print(f'{significant_feature_count} features, or {significant_feature_percent}%, '
      f'account for 70% of branch points in the random forest.')
print()

# Select the most informative features based on the determined count.
# Note: the features could also be selected directly from the X matrix,
# but we highlight the use of `feature_coordinates` and `feature_types`
# to recompute only the desired subset of features.
selected_feature_coord = feature_coordinates[
    sorted_feature_indices[:significant_feature_count]]
selected_feature_type = feature_types[
    sorted_feature_indices[:significant_feature_count]]

# Calculate the features using the selected subset
start_time = time()
X_selected = [extract_haar_feature_image(img, selected_feature_type,
                                         selected_feature_coord)
              for img in sample_images]
X_selected = np.stack(X_selected)
elapsed_time_selected_feature_computation = time() - start_time

labels = np.array([1] * 100 + [0] * 100)
X_train, X_test, y_train, y_test = train_test_split(
    X_selected, labels, train_size=150, random_state=0, stratify=labels)

start_time = time()
classifier.fit(X_train, y_train)
elapsed_time_selected_training = time() - start_time
auc_selected_features = roc_auc_score(
    y_test, classifier.predict_proba(X_test)[:, 1])

result_summary = (
    f'Computing the full feature set took '
    f'{elapsed_time_feature_computation:.3f}s, plus '
    f'{elapsed_time_training:.3f}s training, for an AUC of '
    f'{auc_score_full_features:.2f}. Computing the restricted feature set '
    f'took {elapsed_time_selected_feature_computation:.3f}s, plus '
    f'{elapsed_time_selected_training:.3f}s training, for an AUC of '
    f'{auc_selected_features:.2f}.'
)
print('Summary:')
print(result_summary)

plt.show()
```
Output
```
Significant_feature_count:
712 features, or 0.7%, account for 70% of branch points in the random forest.

Summary:
Computing the full feature set took 77.978s, plus 3.092s training, for an AUC of 1.00. Computing the restricted feature set took 0.159s, plus 2.488s training, for an AUC of 1.00.
```