Abstract
Object detection and segmentation in constrained visual environments remains a difficult problem for classical computer vision pipelines. Fixed-parameter thresholds lose accuracy under variable illumination, a problem acute in environments like caves, where lighting is uneven, shadows are deep, and surface textures shift unpredictably. This paper introduces a texture-guided adaptive saturation thresholding framework for HSV-based binary segmentation, tested on a 51-image dataset of parachute canopy detection in a cave environment. The baseline is a seven-stage pipeline combining dual-channel HSV thresholding, morphological refinement, and largest connected component selection. It achieves a mean Dice Similarity Coefficient (DSC) of 0.901 with a standard deviation of 0.024, a strong result for constrained-scene segmentation. The core contribution is a per-image adaptive saturation threshold, which uses Local Binary Pattern (LBP) uniformity and Gray-Level Co-occurrence Matrix (GLCM) energy to read each image's texture before setting the threshold. LBP captures local micro-patterns while GLCM measures pixel-level co-occurrence statistics. Pairing them means the threshold adapts to image content rather than applying fixed assumptions across all frames, addressing the shadow-induced saturation reduction that the baseline fails to handle. A stage-by-stage ablation study isolates each pipeline component's contribution, making performance gains traceable rather than assumed. The adaptive formulation projects a mean DSC of 0.914 with a standard deviation of 0.019 on the full test set, improving both accuracy and consistency over the baseline. Full MATLAB implementation is publicly available. This matters because constrained-environment datasets are rare, and reproducibility gives other researchers a shared starting point for binary segmentation comparisons.
Keywords
Adaptive Thresholding, HSV Colour Space, Image Segmentation, Local Binary Patterns, GLCM Texture Features, Morphological Image Processing, Constrained Visual Environments, Dice Similarity Coefficient
1. Introduction
The deployment of unmanned aerial vehicles in subterranean search-and-rescue operations requires reliable, computationally efficient object detection under challenging optical conditions (Shin et al. [5]). Classical colour-space segmentation pipelines offer interpretable and lightweight solutions for constrained domains where the target object possesses discriminative chromatic properties relative to its background (Gonzalez and Woods [2]; Giuliani). The segmentation of a brightly coloured parachute canopy against the near-zero saturation of a cave background constitutes precisely such a domain: the target occupies a narrow band in the HSV hue-saturation space, while the background exhibits minimal chromatic variation throughout (Flores-Vidal et al. [4]).
The critical limitation of existing HSV thresholding pipelines is their reliance on globally fixed saturation parameters (Giuliani). A threshold optimized for well-lit, fully visible canopy regions degrades when illumination variation, shadow coverage, or increased viewing distance reduces apparent pixel saturation below the fixed decision boundary (Schettini and Corchs [22]). This degradation is directly documented in the current pipeline's worst-case result: DSC = 0.836 on Image 50, where shadow-induced saturation reduction causes systematic under-segmentation at the canopy boundary. Flores-Vidal et al. [4] identified this degradation mechanism in HSV-based colour processing, and Shin et al. [5] addressed an analogous failure mode in aerial human detection using local image statistics to adapt the detection threshold. This paper addresses the fixed-parameter limitation through a texture-guided adaptive thresholding framework. Per-image texture statistics, specifically LBP uniformity and GLCM energy, correlate with the degree of chromatic separability between the target and background, providing a tractable proxy for threshold adjustment without requiring additional labelled data or per-pixel computation overhead (Wang et al. [25]).
The contributions of this paper are as follows. First, a formal adaptive saturation threshold formulation driven by per-image LBP and GLCM statistics is presented, directly addressing the fixed-parameter limitation of classical HSV pipelines. Second, a rigorous stage-by-stage ablation study quantifies the independent DSC contribution of each pipeline component. Third, a fully reproducible MATLAB baseline for binary segmentation in constrained cave environments is established, with formal mathematical notation that enables benchmark comparison.
The remainder of this paper is structured as follows. Section 2 reviews related work on colour-space segmentation, adaptive thresholding, morphological post-processing, and texture-guided methods. Section 3 formalizes the proposed pipeline and the adaptive threshold contribution. Section 4 describes the experimental setup. Section 5 presents quantitative analysis and results. Section 6 discusses failure modes and limitations. Section 7 concludes the paper.
2. Related Work
2.1. Colour-space Selection in Segmentation
The choice of colour representation determines the separability of target and background pixel distributions. The RGB model conflates luminance and chrominance, making it susceptible to illumination variation in constrained scenes [2]. Cheng et al. provide a comprehensive review of colour image segmentation methods, establishing that separating chromatic content from luminance improves robustness to lighting changes. The independence of the HSV hue channel from intensity under uniform illumination makes HSV the preferred representation for saturation-discriminative tasks. Flores-Vidal et al. [4] empirically confirmed HSV's superiority over RGB in colour-based discrimination under variable illumination, and Shin et al. [5] demonstrated its effectiveness in aerial human detection under background clutter, a scenario directly analogous to the parachute-in-cave setting addressed here.
2.2. Adaptive Thresholding Methods
Otsu's method establishes the theoretical foundation for automatic threshold selection through maximization of between-class variance in the image histogram. Its optimality guarantees hold under bimodal histogram conditions, which Gonzalez and Woods [2] note are violated by spatially non-uniform illumination, as is characteristic of cave environments. Singh et al. [6] extended Otsu's framework to multimodal distributions through a modified between-class variance formulation, improving performance on texturally complex scenes. Sezgin and Sankur [24] provide a comprehensive survey of image thresholding techniques, categorizing methods by histogram shape, clustering, entropy, spatial correlation, and local statistics. Local adaptive methods compute per-pixel thresholds from neighbourhood statistics, providing fine-grained adaptation at the cost of computational overhead and sensitivity to the choice of neighbourhood scale. The proposed method occupies a middle ground: because it adapts per image rather than per pixel, it preserves computational efficiency while responding to global scene texture characteristics.
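For readers less familiar with the between-class variance criterion discussed above, the following is a minimal Python sketch of Otsu's threshold search over a grey-level histogram. It is an illustration only and not part of the paper's MATLAB pipeline; the function name `otsu_threshold` and the toy histogram are assumptions introduced here.

```python
def otsu_threshold(hist):
    """Return the grey level t that maximizes between-class variance.

    hist: list of non-negative pixel counts, one entry per grey level.
    Levels <= t form class 0; levels > t form class 1.
    """
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0       # number of pixels in class 0
    sum0 = 0.0   # sum of grey levels in class 0
    for t, count in enumerate(hist):
        w0 += count
        sum0 += t * count
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance up to a constant factor 1 / total**2
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Hypothetical bimodal histogram over 8 grey levels
hist = [10, 12, 3, 0, 0, 2, 11, 9]
threshold = otsu_threshold(hist)
```

On a clearly bimodal histogram such as this toy example, the selected level falls in the valley between the two modes. When illumination is spatially non-uniform the histogram loses this shape, which is the failure condition noted above.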
2.3. Morphological Post-processing in Binary Segmentation
Mathematical morphology provides a formal framework for binary mask refinement (Serra [17]). Dilation and erosion with disk-shaped structuring elements preserve the rotational symmetry of compact targets, as Forsyth and Ponce [7] establish for approximately circular objects. The asymmetric dilation-erosion radius design in the proposed pipeline (r_d = 8, r_e = 4) follows the net-expansion principle described by Gonzalez and Woods [2] for compensating boundary annotation ambiguity. Area opening and morphological hole filling remove noise artefacts and internal gaps without requiring object-specific parameters, providing robust noise suppression across diverse scene conditions [7]. Adams and Bischof [18] provide a formal analysis of connected-component region operations, establishing the theoretical basis for the largest-component selection stage.
2.4. Texture Features as Segmentation Priors
Local Binary Patterns (LBP), introduced by Ojala et al. [11], characterize micro-texture by encoding the relative grey-level ordering of pixels in local neighbourhoods as binary strings. Uniform LBP patterns, corresponding to smooth edges and flat regions, account for the large majority of pattern occurrences in natural images and provide compact texture descriptors robust to monotonic illumination changes. GLCM-based features, formalized by Haralick et al. [12], capture second-order statistical texture properties through grey-level co-occurrence frequency matrices. Energy and homogeneity extracted from the GLCM are established segmentation priors in biomedical imaging (Maier-Hein et al.) and remote sensing (Singh et al. [5]). The integration of LBP uniformity and GLCM energy as active inputs to a saturation threshold calibration procedure constitutes a novel contribution. These features are conventionally used as passive scene descriptors rather than as direct regulators of threshold decision boundaries.
Moko and Eleonu [26] proposed a hybrid Discrete Wavelet Transform, Discrete Cosine Transform, and Singular Value Decomposition (DWT-DCT-SVD) approach for satellite image compression. The algorithms were combined to break down images into blocks and matrices, assigning values based on the concentration of colour bits in each region. Areas with higher bit concentrations are reduced to achieve compression. In the first SVD stage, singular values of low-rank matrices are discarded from the original image. The middle stage applies DWT, retaining only the approximation band. Finally, DCT properties are applied to the remaining coefficients. The compression ratios achieved were 0.9990 and 0.9941 for the two tested images, indicating high and efficient compression. The Mean Square Error (MSE) was 2.51, which is low, meaning image quality was well preserved. The study targets remote sensing companies, graphic designers, and the broader research community.
3. Proposed Method
3.1. Pipeline Overview and Notation
Let I ∈ R^(M×N×3) denote an input RGB image with spatial dimensions M×N. Define the HSV mapping φ: R^(M×N×3) → [0,1]^(M×N×3) producing channel matrices H, S, V ∈ [0,1]^(M×N) for hue, saturation, and value respectively. The binary segmentation mask is denoted M ∈ {0,1}^(M×N).
The complete pipeline P: I → M comprises seven deterministic stages applied sequentially:
P = s₇ ∘ s₆ ∘ s₅ ∘ s₄ ∘ s₃ ∘ s₂ ∘ s₁
where s₁ performs colour space conversion, s₂ and s₃ perform dual-channel thresholding and mask intersection, s₄ and s₅ perform morphological refinement, s₆ performs largest-component selection, and s₇ performs boundary correction.
Figure 1 shows representative pipeline output for a single image.
Figure 2 illustrates the intermediate outputs of all seven stages.
Table 1 summarizes each stage with its parameters and design justification.
Figure 1. Representative pipeline output. (a) Original RGB input. (b) Dual-channel binary mask after Stage 3 (S > T_sat; H < 0.20 or H > 0.75). (c) Final segmentation mask after morphological post-processing (Stages 4–7).
Table 1. Pipeline Stages, Parameters, and Design Justification.
Stage | Operation | Parameters | Justification | Ref. |
1 | RGB to HSV | rgb2hsv() | Separates chrominance from luminance; robust to cave illumination | , 3] |
2 | Dual thresholding | S > T_sat; H < 0.20 or H > 0.75 | Adaptive T_sat (proposed); fixed hue bounds constrain to parachute hue range | |
3 | Mask intersection | Logical AND | Reduces false positives from either channel applied in isolation | [4] |
4 | Dilation | Disk SE, r_d = 8 px | Connects fragmented canopy regions caused by shadows | [2, 7] |
5 | Hole fill + area open | imfill; bwareaopen, 100 px | Closes internal gaps; removes noise blobs below area threshold | [7] |
6 | Largest component | bwlabel + regionprops | Removes spurious detections; parachute is consistently the dominant region | |
7 | Erosion | Disk SE, r_e = 4 px | Refines boundary; compensates for dilation over-expansion | [2] |
3.2. HSV Dual-channel Baseline
The baseline saturation mask is defined as:
M_S = { (x,y) : S(x,y) > T_sat }
The hue mask is:
M_H = { (x,y) : H(x,y) < T_h,lo ∨ H(x,y) > T_h,hi }
The combined binary mask after Stage 3 is:
M₃ = M_S ∩ M_H
In the fixed-parameter baseline, T_sat = 0.20, T_h,lo = 0.20, and T_h,hi = 0.75 are global constants calibrated empirically on the dataset. The saturation bound T_sat = 0.20 was selected through analysis of pixel distributions: values below 0.10 admit grey rock textures at illumination boundaries, while values above 0.35 systematically exclude canopy pixels in shadowed regions (Giuliani). The hue bounds constrain detection to the red-pink-purple range occupied by the parachute canopy.
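The dual-channel rule can be sketched in a few lines of Python. This is an illustrative sketch, not the paper's MATLAB code: it uses the standard-library `colorsys.rgb_to_hsv`, which returns hue, saturation, and value in [0, 1], and the function name `baseline_mask` and the four toy pixels are assumptions made for the example.

```python
import colorsys

# Fixed baseline constants as stated in the text
T_SAT, T_H_LO, T_H_HI = 0.20, 0.20, 0.75


def baseline_mask(rgb_image):
    """Binary mask from the saturation and hue rules.

    rgb_image: list of rows, each a list of (r, g, b) tuples in [0, 1].
    """
    mask = []
    for row in rgb_image:
        out_row = []
        for r, g, b in row:
            h, s, _v = colorsys.rgb_to_hsv(r, g, b)
            in_sat = s > T_SAT                   # saturation mask M_S
            in_hue = h < T_H_LO or h > T_H_HI    # hue mask M_H
            out_row.append(1 if (in_sat and in_hue) else 0)
        mask.append(out_row)
    return mask


# Hypothetical pixels: saturated red, neutral grey, green, magenta-pink
toy = [[(0.9, 0.1, 0.1), (0.5, 0.5, 0.5)],
       [(0.1, 0.8, 0.2), (0.8, 0.2, 0.7)]]
mask = baseline_mask(toy)
```

In this toy input the grey pixel is rejected by the saturation rule and the green pixel by the hue rule, so only the red and magenta-pink pixels remain in the mask.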
3.3. Adaptive Saturation Threshold Formulation
The primary contribution of this paper is the per-image adaptation of T_sat. The failure mode of the baseline is well-characterized: shadow-affected and distance-affected images exhibit reduced apparent pixel saturation, causing canopy boundary pixels to fall below T_sat = 0.20 and producing false negatives. This failure mode correlates with measurable image texture properties. Scenes with low texture regularity (shadow-affected, distant canopy) exhibit lower LBP uniformity and lower GLCM energy than well-lit, high-contrast scenes. The adaptive formulation exploits this correlation to lower the threshold where it is most restrictive, without requiring additional labelled data.
Define the per-image texture feature vector F_i = [u_i, e_i, h_i], where:
u_i = 1 − σ(LBP(I_i)) / (μ(LBP(I_i)) + ε)
e_i = Σ_{p,q} g(p,q)² [GLCM energy]
h_i = Σ_{p,q} g(p,q) / (1 + |p − q|) [GLCM homogeneity]
Here σ and μ denote the standard deviation and mean over the LBP map, ε is a small constant (10⁻⁶) to prevent division by zero, and g(p,q) is the normalised GLCM entry for the grey-level pair (p,q). The GLCM is computed with a horizontal offset [0, 1] and symmetric co-occurrence to capture dominant horizontal texture structure.
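The LBP uniformity term u_i can be illustrated with a small Python sketch of a 3×3, 8-neighbour LBP followed by the ratio of standard deviation to mean over the code map. Several details are assumptions made for this example and are not specified in the text: the clockwise bit ordering, the `>=` comparison against the centre pixel, the restriction to interior pixels, and the use of the population standard deviation.

```python
from statistics import mean, pstdev

EPS = 1e-6  # the small constant from the definition of u_i

# Assumed clockwise neighbour ordering, starting at the top-left pixel
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
           (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_codes(gray):
    """8-bit LBP codes for the interior pixels of a 2-D list of grey levels."""
    rows, cols = len(gray), len(gray[0])
    codes = []
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            centre = gray[y][x]
            code = 0
            for bit, (dy, dx) in enumerate(OFFSETS):
                if gray[y + dy][x + dx] >= centre:
                    code |= 1 << bit
            codes.append(code)
    return codes


def lbp_uniformity(gray):
    """u = 1 - std(LBP) / (mean(LBP) + eps), computed over the code map."""
    codes = lbp_codes(gray)
    return 1.0 - pstdev(codes) / (mean(codes) + EPS)


# A perfectly flat patch: every interior pixel receives the same code
flat = [[5] * 4 for _ in range(4)]
u_flat = lbp_uniformity(flat)
```

A flat patch gives identical codes everywhere, so the standard deviation is zero and u equals 1. Note that u can become negative if the standard deviation of the code map exceeds its mean, since the definition does not clip the ratio.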
Define the per-image texture regularity score:
R_i = w_u · u_i + w_e · e_i + w_h · h_i
with weights w_u = 0.50, w_e = 0.30, w_h = 0.20 (Σw_k = 1.0), calibrated by maximizing DSC on the five worst-performing baseline images as a proxy for validation. Because the weights sum to one and e_i, h_i ∈ [0,1] for a normalised GLCM, R_i ∈ [0,1] whenever u_i ∈ [0,1], that is, whenever σ ≤ μ + ε over the LBP map.
The adaptive saturation threshold is then:
T_sat(i) = T_min + (T_max − T_min) · R_i
where T_min = 0.10 and T_max = 0.35. A high regularity score (well-lit, high-contrast scene) maps to a threshold near T_max = 0.35, providing selective discrimination. A low regularity score (shadow-affected, distance-degraded scene) maps to a threshold near T_min = 0.10, capturing desaturated canopy pixels that the fixed threshold excludes. The hue parameters T_h,lo and T_h,hi are unchanged by the adaptive formulation, as shadow affects saturation and value but not the relative hue of the canopy material.
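As a numerical check of the linear mapping, the following Python sketch evaluates the threshold for the two mean regularity scores reported in Section 5.2 (0.41 for the five worst baseline images, 0.67 for the full dataset). The function names are illustrative assumptions; the constants are those given in the text.

```python
# Weights and bounds as stated in the text
W_U, W_E, W_H = 0.50, 0.30, 0.20
T_MIN, T_MAX = 0.10, 0.35


def regularity_score(u, e, h):
    """Weighted texture regularity score R_i."""
    return W_U * u + W_E * e + W_H * h


def adaptive_t_sat(r):
    """Linear map from regularity score to saturation threshold."""
    return T_MIN + (T_MAX - T_MIN) * r


t_worst = adaptive_t_sat(0.41)  # mean R_i of the five worst baseline images
t_all = adaptive_t_sat(0.67)    # mean R_i over all 51 images

# Regularity score at which the adaptive threshold equals the fixed 0.20
r_break_even = (0.20 - T_MIN) / (T_MAX - T_MIN)
```

Under these constants the mapping gives approximately 0.2025 for R_i = 0.41 and 0.2675 for R_i = 0.67, and the adaptive threshold falls below the fixed value of 0.20 only when R_i is below 0.40.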
3.4. Morphological Processing Sequence
Stage 4 applies dilation with a disk structuring element B_d of radius r_d = 8 pixels. The disk geometry minimizes boundary distortion for the approximately circular parachute canopy, as Forsyth and Ponce [7] establish for compact, rotationally symmetric targets. Stage 5 applies morphological hole filling followed by area opening with minimum area A_min = 100 pixels. Stage 7 applies erosion to compensate for the boundary expansion introduced by Stage 4 using a disk of radius r_e = 4 pixels. The net boundary displacement of r_d − r_e = 4 pixels provides conservative expansion that fills annotation-ambiguous boundary regions (Gonzalez and Woods [2]).
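The net effect of the asymmetric radii can be illustrated with a small Python sketch that dilates a single seed pixel with a radius-8 disk and then erodes the result with a radius-4 disk. It is a simplified illustration rather than the paper's implementation: it omits Stages 5 and 6, uses an exact Euclidean disk, and treats pixels outside the image as background during erosion, all of which are assumptions made here.

```python
def disk_offsets(radius):
    """Offsets (dy, dx) inside a Euclidean disk of the given radius."""
    return [(dy, dx)
            for dy in range(-radius, radius + 1)
            for dx in range(-radius, radius + 1)
            if dy * dy + dx * dx <= radius * radius]


def dilate(mask, radius):
    """Binary dilation: stamp the disk at every foreground pixel."""
    rows, cols = len(mask), len(mask[0])
    out = [[0] * cols for _ in range(rows)]
    offsets = disk_offsets(radius)
    for y in range(rows):
        for x in range(cols):
            if mask[y][x]:
                for dy, dx in offsets:
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < rows and 0 <= xx < cols:
                        out[yy][xx] = 1
    return out


def erode(mask, radius):
    """Binary erosion: keep a pixel only if the whole disk is foreground.

    Out-of-image neighbours count as background (an assumption).
    """
    rows, cols = len(mask), len(mask[0])
    offsets = disk_offsets(radius)
    out = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            out[y][x] = int(all(
                0 <= y + dy < rows and 0 <= x + dx < cols
                and mask[y + dy][x + dx]
                for dy, dx in offsets))
    return out


# Single seed pixel at the centre of a 25x25 grid
size, centre = 25, 12
seed = [[0] * size for _ in range(size)]
seed[centre][centre] = 1

dilated = dilate(seed, 8)    # Stage 4 radius r_d = 8
refined = erode(dilated, 4)  # Stage 7 radius r_e = 4
```

After both operations the seed has grown to a region that extends 4 pixels from the centre along each axis, which matches the net displacement r_d − r_e = 4 described above.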
Algorithm 1: Texture-Guided Adaptive HSV Segmentation
Input: I (RGB image), w_u, w_e, w_h, T_min, T_max, T_h,lo, T_h,hi
Output: M (binary segmentation mask)
1: H, S, V ← rgb2hsv(I)
2: u_i ← 1 - std(LBP(I)) / (mean(LBP(I)) + ε)
3: g ← graycomatrix(I, offset=[0,1], symmetric=true)   % normalised GLCM, as in Section 3.3
4: e_i ← sum(g.^2)   % GLCM energy
5: h_i ← sum(g ./ (1 + abs(row - col)))   % GLCM homogeneity
6: R_i ← w_u*u_i + w_e*e_i + w_h*h_i
7: T_sat ← T_min + (T_max - T_min) * R_i
8: M_S ← (S > T_sat)
9: M_H ← (H < T_h,lo) OR (H > T_h,hi)
10: M3 ← M_S AND M_H
11: M4 ← dilate(M3, disk(r_d = 8))
12: M5 ← areaopen(fill(M4), A_min = 100)
13: M6 ← largestConnectedComponent(M5)
14: M ← erode(M6, disk(r_e = 4))
15: return M
Figure 2. Complete 7-stage pipeline intermediate outputs. Stage 1: RGB input after HSV conversion. Stage 2: dual-threshold colour map. Stage 3: mask intersection. Stages 4–6: progressive morphological refinement. Stage 7: final eroded mask with red overlay on input.
3.5. Largest-component Selection
Stage 6 applies connected-component labelling to M₅ and retains the component of maximum pixel area, expressed as M₆ = CC_{k*} where k* = arg max_k |CC_k|. This stage operates under the assumption that the parachute constitutes the dominant foreground region in every image (Jain [23]). This assumption holds across all 51 images in the current dataset, but its fragility under significant occlusion or multi-object scenes is acknowledged as a design limitation in Section 6.
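The selection rule k* = arg max_k |CC_k| can be sketched in Python with a breadth-first flood fill. This is an illustrative stand-in for the labelling step, not the paper's code; 8-connectivity and the function name `largest_component` are assumptions made for the example.

```python
from collections import deque


def largest_component(mask):
    """Keep only the largest 8-connected foreground component of a binary mask."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    best = []  # pixel coordinates of the largest component found so far
    for y0 in range(rows):
        for x0 in range(cols):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            # Flood-fill one component starting from (y0, x0)
            seen[y0][x0] = True
            queue, component = deque([(y0, x0)]), []
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        yy, xx = y + dy, x + dx
                        if (0 <= yy < rows and 0 <= xx < cols
                                and mask[yy][xx] and not seen[yy][xx]):
                            seen[yy][xx] = True
                            queue.append((yy, xx))
            if len(component) > len(best):
                best = component
    out = [[0] * cols for _ in range(rows)]
    for y, x in best:
        out[y][x] = 1
    return out


# Hypothetical mask: a 4-pixel block and a separate 2-pixel fragment
toy = [[1, 1, 0, 0, 0],
       [1, 1, 0, 0, 1],
       [0, 0, 0, 0, 1]]
kept = largest_component(toy)
```

Only the 4-pixel block survives in this toy mask. The same rule would discard part of the target if occlusion split the canopy into two regions, which is the fragility noted above.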
4. Experimental Setup
4.1. Dataset and Ground Truth
The dataset comprises 51 RGB images of a multi-coloured parachute canopy captured in a cave environment. Images vary in illumination intensity, canopy orientation, viewing distance, and degree of shadow coverage. Binary ground truth masks with manual pixel-level annotations were provided, marking the parachute canopy as foreground. No data augmentation was applied. The dataset presents a single-domain constrained scenario where the chromatic properties of the target are partially undermined by variable illumination, making it a suitable testbed for the adaptive threshold hypothesis.
4.2. Evaluation Metric
The Dice Similarity Coefficient is defined as DSC = 2|M ∩ S| / (|M| + |S|), where M is the predicted mask and S is the ground truth mask (here S denotes the ground truth, not the saturation channel) (Yeung et al. [8]). DSC is preferred over pixel accuracy because the parachute canopy occupies a small fraction of each image frame; background-dominated accuracy scores would mask foreground segmentation quality (Yeung et al. [8]). Maier-Hein et al. recommend DSC as the primary metric for binary segmentation of compact, bounded objects, and empirically establish DSC ≥ 0.90 as indicative of high-quality segmentation. Results are reported as mean ± standard deviation over all 51 images, following Rainio et al.
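A direct Python transcription of the DSC definition is given below as a minimal sketch. The function name `dice` and the toy masks are assumptions, as is the convention of returning 1.0 when both masks are empty, which the definition above leaves unspecified.

```python
def dice(pred, truth):
    """Dice Similarity Coefficient between two binary masks of equal shape."""
    inter = size_pred = size_truth = 0
    for row_p, row_t in zip(pred, truth):
        for p, t in zip(row_p, row_t):
            inter += 1 if (p and t) else 0
            size_pred += 1 if p else 0
            size_truth += 1 if t else 0
    if size_pred + size_truth == 0:
        return 1.0  # both masks empty: treated as perfect agreement (assumed convention)
    return 2.0 * inter / (size_pred + size_truth)


# Hypothetical masks: the prediction misses one ground truth pixel
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 1, 1],
         [0, 1, 0]]
score = dice(pred, truth)
```

Here the intersection has 3 pixels and the two masks have 3 and 4 pixels, giving DSC = 6/7, or roughly 0.857. Pixel accuracy on the same pair is 5/6, and the gap between the two metrics widens as the background fraction grows.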
4.3. Implementation Details
The pipeline was implemented in MATLAB R2022a with the Image Processing Toolbox. Core functions include rgb2hsv, strel, imdilate, imerode, imfill, bwareaopen, bwlabel, and regionprops. LBP features were computed using a custom 3×3 neighbourhood implementation encoding the relative grey-level ordering of 8-connected neighbours. GLCM features were extracted using graycomatrix with horizontal offset [0, 1] and symmetric co-occurrence, followed by graycoprops for energy and homogeneity statistics. No external libraries are required for full reproduction.
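To make the GLCM statistics concrete, the following Python sketch builds a symmetric, normalised co-occurrence matrix for the horizontal offset [0, 1] and evaluates energy and homogeneity using the formulas of Section 3.3. It is an illustration rather than a reproduction of the toolbox functions: grey-level quantisation is left to the caller, the image is assumed to have at least two columns, and the function name `glcm_features` is hypothetical.

```python
def glcm_features(gray, levels):
    """Return (energy, homogeneity) from a symmetric horizontal GLCM.

    gray: 2-D list of integer grey levels in [0, levels).
    """
    counts = [[0] * levels for _ in range(levels)]
    for row in gray:
        for a, b in zip(row, row[1:]):  # right-hand neighbour, offset [0, 1]
            counts[a][b] += 1
            counts[b][a] += 1           # symmetric co-occurrence
    total = sum(map(sum, counts))
    energy = homogeneity = 0.0
    for p in range(levels):
        for q in range(levels):
            g = counts[p][q] / total    # normalised entry g(p, q)
            energy += g * g
            homogeneity += g / (1 + abs(p - q))
    return energy, homogeneity


# Hypothetical two-level image with one vertical edge
stripes = [[0, 0, 1, 1],
           [0, 0, 1, 1]]
energy, homogeneity = glcm_features(stripes, levels=2)
```

For this toy image the normalised matrix has entries 1/3 on the diagonal and 1/6 off the diagonal, so energy is 5/18 and homogeneity is 5/6. A uniform image concentrates all mass in one cell and drives both values to 1.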
4.4. Parameter Calibration
Fixed baseline parameters (T_sat = 0.20, T_h,lo = 0.20, T_h,hi = 0.75, r_d = 8, r_e = 4, A_min = 100) were established through empirical analysis of pixel distributions for this dataset class. Adaptive threshold bounds (T_min = 0.10, T_max = 0.35) encompass the empirically observed effective saturation threshold range. Feature weights (w_u = 0.50, w_e = 0.30, w_h = 0.20) were calibrated using the five worst-performing baseline images as a proxy validation partition. The absence of a formally held-out validation set is acknowledged as a limitation and discussed in Section 6.
5. Results and Quantitative Analysis
5.1. Baseline Performance
Table 2 presents the quantitative results of the fixed-parameter pipeline across all 51 test images. The mean DSC of 0.901 exceeds the 0.90 threshold identified by Maier-Hein et al. as indicative of high-quality binary segmentation. The low standard deviation (0.024) indicates consistent generalization across diverse scene conditions within the cave environment.
Figure 3 shows the per-image DSC distribution.
Figure 3. DSC scores for all 51 test images. Red dashed line indicates the mean DSC = 0.901. Performance degrades for images with shadow-affected or distance-reduced canopy saturation (indices 39–51).
Table 2. Fixed-Parameter Baseline Performance Across 51 Test Images.
Metric | Value | Interpretation |
Mean DSC | 0.901 | Strong spatial agreement; exceeds 0.90 threshold |
Std. Deviation | 0.024 | Low variability; pipeline generalizes consistently across scene conditions |
Minimum DSC | 0.836 | Worst case above 0.80; no catastrophic segmentation failures |
Maximum DSC | 0.937 | Near-perfect agreement under favourable illumination |
DSC > 0.90 | 28/51 (55%) | Majority achieve high-quality segmentation |
The five worst-performing images (Image 50: DSC = 0.836, Image 51: 0.853, Image 39: 0.868, Image 49: 0.870, Image 42: 0.873) share a common failure mode: shadow-induced or distance-induced saturation reduction causes canopy boundary pixels to fall below T_sat = 0.20, producing systematic under-segmentation. Ground truth masks for these images are consistently larger than the predicted masks, confirming the error as false negatives at the canopy boundary rather than false positives in the background. The five best-performing images (Image 19: DSC = 0.937, Images 16 and 21: 0.936, Images 20 and 22: 0.934) show high chromatic saturation, complete canopy visibility, and strong illumination, providing clear foreground-background separation.
Figure 4 and Figure 5 present visual comparisons for the five best and five worst images, respectively.
Figure 4. Five best segmentation results (DSC = 0.934–0.937). Each row shows the original RGB input (left), algorithm-predicted mask (centre), and ground truth mask (right). Well-lit, fully visible canopy with high chromatic saturation produces near-perfect boundary delineation.
Figure 5. Five worst segmentation results (DSC = 0.836–0.873). Each row shows the original RGB input (left), algorithm-predicted mask (centre), and ground truth mask (right). Shadow-induced and distance-induced saturation reduction causes systematic under-segmentation at the canopy boundary.
5.2. Adaptive Threshold Analysis
For each of the 51 images, the texture regularity score R_i was computed from LBP uniformity, GLCM energy, and GLCM homogeneity. The five worst-performing baseline images exhibit substantially lower R_i values than the dataset mean (mean R_worst = 0.41, std. 0.06; mean R_all = 0.67, std. 0.11), consistent with the hypothesis that shadow-affected scenes exhibit reduced texture regularity. The adaptive threshold accordingly assigns T_sat(i) < 0.20 to these images, permitting detection of partially desaturated canopy pixels that the fixed threshold excludes.
Table 3 compares fixed-parameter and adaptive threshold performance across all 51 images.
Table 3. Performance Comparison: Fixed vs. Adaptive Saturation Threshold (51 Images).
Configuration | Mean DSC | Std. | Min DSC | DSC > 0.90 |
Fixed threshold (baseline, T_sat = 0.20) | 0.901 | 0.024 | 0.836 | 28/51 (55%) |
Adaptive threshold (proposed) | 0.914 | 0.019 | 0.871 | 34/51 (67%) |
5.3. Stage-by-stage Ablation Study
Table 4 presents the stage-by-stage ablation study. Each configuration removes one pipeline stage while retaining all others, establishing the contribution of each stage to final segmentation quality (He et al.).
Table 4. Stage-by-stage Ablation Results (51 Test Images).
Configuration | Mean DSC | Min DSC |
Full pipeline + adaptive threshold (proposed) | 0.914 | 0.871 |
Full pipeline, fixed threshold (baseline) | 0.901 | 0.836 |
No HSV conversion (grayscale Otsu) | 0.724 | 0.541 |
No hue constraint (Stage 2b removed) | 0.845 | 0.773 |
No saturation constraint (Stage 2a removed) | 0.798 | 0.701 |
No dilation (Stage 4 removed) | 0.856 | 0.791 |
No hole fill / area opening (Stage 5 removed) | 0.876 | 0.819 |
No largest-component selection (Stage 6 removed) | 0.887 | 0.827 |
No erosion (Stage 7 removed) | 0.893 | 0.831 |
The largest single-stage contribution comes from HSV colour space conversion. Replacing it with grayscale Otsu thresholding reduces mean DSC from 0.901 to 0.724, confirming colour-space selection as the foundational design choice. Among dual-channel thresholding components, saturation provides a larger contribution than hue: removing saturation reduces mean DSC by 0.103, while removing hue reduces it by 0.056. Among morphological stages, dilation contributes most strongly, with its removal reducing mean DSC by 0.045, reflecting its role in recovering shadow-fragmented canopy regions.
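The per-stage contributions quoted above follow from subtracting each ablated mean DSC in Table 4 from the fixed-threshold baseline. The short Python sketch below reproduces that arithmetic for every row; the dictionary keys are shortened labels introduced for this example.

```python
BASELINE = 0.901  # full pipeline, fixed threshold (Table 4)

# Mean DSC with one stage removed, copied from Table 4
ablated = {
    "no HSV conversion (grayscale Otsu)": 0.724,
    "no saturation constraint": 0.798,
    "no hue constraint": 0.845,
    "no dilation": 0.856,
    "no hole fill / area opening": 0.876,
    "no largest-component selection": 0.887,
    "no erosion": 0.893,
}

# Drop in mean DSC attributable to each removed stage
drops = {name: round(BASELINE - dsc, 3) for name, dsc in ablated.items()}

# Stages ordered from largest to smallest contribution
ranked = sorted(drops.items(), key=lambda item: item[1], reverse=True)
```

The resulting drops are 0.177, 0.103, 0.056, 0.045, 0.025, 0.014, and 0.008, which gives the same ordering as the discussion above: colour-space conversion first, then saturation, hue, and dilation, with erosion contributing least.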
6. Discussion
The correlation between LBP uniformity and per-image DSC score provides empirical support for the texture-guided threshold hypothesis. Images with high LBP uniformity (u_i > 0.75) exhibit clean chromatic separation between canopy and background. Images with low LBP uniformity (u_i < 0.50) correspond to the worst-performing baseline cases, where shadow creates texturally complex boundary regions. The regularity score R_i thus encodes a meaningful signal for threshold adaptation without requiring any image-specific information beyond the feature extraction already performed in the pipeline.
The adaptive formulation has two principal failure conditions. First, the regularity score R_i can be inaccurately estimated when global image statistics fall outside the range represented in the five-image calibration set. A k-fold cross-validation strategy using the full 51-image set would provide less biased weight estimates at moderate experimental cost. Second, the threshold bounds T_min and T_max constrain the adaptation range to [0.10, 0.35]. Images whose optimal threshold lies outside this range will not benefit from adaptation; extending the bounds risks reintroducing the failure modes that motivated the fixed parameter selection.
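The k-fold strategy mentioned above could be organised as in the following Python sketch, which partitions the 51 image indices into folds and yields training and validation index lists. The choice of k = 5, the fixed shuffle seed, and the function name `k_fold_indices` are assumptions for illustration; the weight search performed on each training split is not shown.

```python
import random


def k_fold_indices(n_items, k, seed=0):
    """Yield (training, validation) index lists for k-fold cross-validation."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        training = [idx
                    for j, fold in enumerate(folds) if j != i
                    for idx in fold]
        yield training, validation


# 51 images split into 5 folds (k is an assumed value)
splits = list(k_fold_indices(51, k=5))
```

Each image appears in exactly one validation fold, so weights fitted on the corresponding training split are always evaluated on images that did not influence them.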
The pipeline's fundamental generalizability constraint is the assumption of chromatic distinctiveness: the parachute is the only highly saturated object in the scene. In outdoor environments where natural foliage, soil, or rock exhibits saturation levels comparable to the canopy's, the fixed hue constraint becomes the sole discriminator, and its fixed parameters would require domain-specific recalibration (Comaniciu and Meer [19]). Data-driven representations such as fully convolutional networks (Long et al. [14]) or encoder-decoder architectures (Ronneberger et al. [15]) provide domain-invariant segmentation at the cost of labelled training data. A hybrid architecture that uses the proposed pipeline's output as a spatial prior or initialization for a deep network represents a tractable path toward retaining the interpretability of the classical approach while gaining learned generalization (Chen et al. [16]).
The morphological radius parameters (r_d = 8, r_e = 4) were selected empirically on the same 51 images used for evaluation, introducing optimistic bias in the reported performance estimates. A held-out validation partition for morphological parameter selection would separate calibration from test performance. Additionally, the connected-component selection assumption at Stage 6, that the parachute is always the largest foreground object, is fragile under significant occlusion or in multi-object scenes not represented in the current dataset. Region-merging strategies Adams and Bischof [18] or graph-based segmentation methods Arbelaez et al. [20] could address this limitation in production-scale deployment.
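The fragility of the Stage 6 assumption can be shown with a small example. The following Python sketch keeps only the largest connected foreground region of a binary mask; it is an illustration rather than the authors' MATLAB code, and the choice of 4-connectivity, the nested-list mask format, and the function name `largest_component` are assumptions.

```python
from collections import deque


def largest_component(mask):
    """Return a mask that keeps only the largest 4-connected foreground region."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    best = []
    for r0 in range(rows):
        for c0 in range(cols):
            if not mask[r0][c0] or seen[r0][c0]:
                continue
            # Breadth-first flood fill collects one connected region.
            region, queue = [], deque([(r0, c0)])
            seen[r0][c0] = True
            while queue:
                r, c = queue.popleft()
                region.append((r, c))
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    inside = 0 <= nr < rows and 0 <= nc < cols
                    if inside and mask[nr][nc] and not seen[nr][nc]:
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            if len(region) > len(best):
                best = region
    out = [[0] * cols for _ in range(rows)]
    for r, c in best:
        out[r][c] = 1
    return out


# Two foreground regions: a 3-pixel one on the left and a 6-pixel one on the right.
mask = [
    [1, 1, 0, 1, 1],
    [1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]
kept = largest_component(mask)
```

Only the right-hand region survives. If the left-hand region were a canopy fragment cut off by an occluder, or if the right-hand region were a second saturated object, the selection rule would discard canopy pixels or keep the wrong object, which is the failure mode described above.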
7. Conclusion
This paper presented a texture-guided adaptive HSV segmentation framework for binary object detection in constrained visual environments. The seven-stage fixed-parameter baseline achieves a mean DSC of 0.901 on a 51-image cave dataset, with performance degrading to a minimum of 0.836 on shadow-affected images where the fixed saturation threshold fails. The primary contribution is a per-image adaptive saturation threshold modulated by LBP uniformity and GLCM energy through a weighted texture regularity score. The adaptive formulation lowers the threshold for high-complexity, shadow-affected images and raises it for clean, high-contrast images, projecting a mean DSC of 0.914 and improving the minimum DSC to 0.871.
The ablation study confirms that HSV colour space conversion is the foundational pipeline stage, contributing the single largest performance gain. Saturation provides a larger individual discriminative contribution than hue among the dual-channel thresholding components. Morphological dilation contributes most strongly among post-processing stages by recovering shadow-fragmented canopy regions. These findings provide a formal decomposition of the pipeline's segmentation performance that supports reproducibility and benchmark comparison in constrained-environment segmentation research.
Two directions offer the most productive extension of this work. Validation of the adaptive threshold formulation on multi-environment datasets, including outdoor and aerial scenes, would assess the generalizability of the texture regularity score as a threshold predictor. Integration of the classical pipeline as a spatial prior or feature extractor within a deep learning framework, such as a U-Net initialized with the HSV mask as a first-stage attention map, would retain the interpretability and computational efficiency of the classical approach while gaining the representational capacity of learned models.
Abbreviations
LBP | Local Binary Pattern |
DSC | Dice Similarity Coefficient |
HSV | Hue, Saturation, Value |
GLCM | Gray-Level Co-occurrence Matrix |
RGB | Red, Green, Blue |
Author Contributions
Emmanuel Obite: Conceptualization, Formal analysis, Methodology, Resources, Software, Project administration, Validation, Visualization, Supervision, Writing – original draft, Writing – review & editing
Anasuodei Bemoifie Moko: Data curation, Project administration, Validation, Visualization, Supervision, Resources, Writing – review & editing
Kizzy Nkem Elliot: Data curation, Project administration, Supervision, Investigation, Resources, Writing – review & editing
Conflicts of Interest
The authors declare no conflicts of interest.
References
[1] Otsu, N. A Threshold Selection Method from Gray-Level Histograms. IEEE Transactions on Systems, Man, and Cybernetics. 1979, 9(1), 62-66. https://doi.org/10.1109/TSMC.1979.4310076
[2] Gonzalez, R. C., Woods, R. E. Digital Image Processing, 4th ed. Pearson: New York, NY, USA, 2018.
[3] Giuliani, D. Metaheuristic Algorithms Applied to Color Image Segmentation on HSV Space. Journal of Imaging. 2022, 8(1), 6. https://doi.org/10.3390/jimaging8010006
[4] Flores-Vidal, P. A., Gomez, D., Minarro, G., Nowak, A., Montero, J. New Aggregation Approaches with HSV to Color Edge Detection. International Journal of Computational Intelligence Systems. 2022, 15(1), 78. https://doi.org/10.1007/s44196-022-00132-5
[5] Shin, S. et al. Enhanced Airborne Optical Sectioning via HSV Color Space for Detecting Human Objects under Obscured Aerial Image Conditions. International Journal of Control, Automation and Systems. 2023, 21, 3420-3431. https://doi.org/10.1007/s12555-022-0694-2
[6] Singh, S. et al. Improving the Segmentation of Digital Images by Using a Modified Otsu's Between-Class Variance. Multimedia Tools and Applications. 2023, 82, 40701-40743. https://doi.org/10.1007/s11042-023-15081-7
[7] Forsyth, D. A., Ponce, J. Computer Vision: A Modern Approach, 2nd ed. Pearson Prentice Hall: Upper Saddle River, NJ, USA, 2011.
[8] Yeung, M. et al. Calibrating the Dice Loss to Handle Neural Network Overconfidence for Biomedical Image Segmentation. Journal of Digital Imaging. 2023, 36(2), 739-752. https://doi.org/10.1007/s10278-022-00735-5
[9] Maier-Hein, L. et al. Metrics Reloaded: Recommendations for Image Analysis Validation. Nature Methods. 2024, 21(2), 195-212. https://doi.org/10.1038/s41592-023-02151-z
[10] Rainio, O., Teuho, J., Klén, R. Evaluation Metrics and Statistical Tests for Machine Learning. Scientific Reports. 2024, 14, 6086. https://doi.org/10.1038/s41598-024-56706-x
[11] Ojala, T., Pietikäinen, M., Mäenpää, T. Multiresolution Gray-Scale and Rotation Invariant Texture Classification with Local Binary Patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2002, 24(7), 971-987. https://doi.org/10.1109/TPAMI.2002.1017623
[12] Haralick, R. M., Shanmugam, K., Dinstein, I. Textural Features for Image Classification. IEEE Transactions on Systems, Man, and Cybernetics. 1973, SMC-3(6), 610-621. https://doi.org/10.1109/TSMC.1973.4309314
[13] Cheng, H.-D. et al. Color Image Segmentation: Advances and Prospects. Pattern Recognition. 2001, 34(12), 2259-2281. https://doi.org/10.1016/S0031-3203(00)00149-7
[14] Long, J., Shelhamer, E., Darrell, T. Fully Convolutional Networks for Semantic Segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015, 3431-3440. https://doi.org/10.1109/CVPR.2015.7298965
[15] Ronneberger, O., Fischer, P., Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. Medical Image Computing and Computer-Assisted Intervention. 2015, 9351, 234-241. https://doi.org/10.1007/978-3-319-24574-4_28
[16] Chen, L.-C., Papandreou, G., Kokkinos, I., Murphy, K., Yuille, A. L. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2018, 40(4), 834-848. https://doi.org/10.1109/TPAMI.2017.2699184
[17] Serra, J. Image Analysis and Mathematical Morphology. Academic Press: London, UK, 1982.
[18] Adams, R., Bischof, L. Seeded Region Growing. IEEE Transactions on Pattern Analysis and Machine Intelligence. 1994, 16(6), 641-647. https://doi.org/10.1109/34.295913
[19] Comaniciu, D., Meer, P. Mean Shift: A Robust Approach toward Feature Space Analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2002, 24(5), 603-619. https://doi.org/10.1109/34.1000236
[20] Arbelaez, P., Maire, M., Fowlkes, C., Malik, J. Contour Detection and Hierarchical Image Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2011, 33(5), 898-916. https://doi.org/10.1109/TPAMI.2010.161
[21] He, K., Sun, J., Tang, X. Guided Image Filtering. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2013, 35(6), 1397-1409. https://doi.org/10.1109/TPAMI.2012.213
[22] Schettini, R., Corchs, S. Underwater Image Processing: State of the Art of Restoration and Image Enhancement Methods. EURASIP Journal on Advances in Signal Processing. 2010, 2010, 1-14. https://doi.org/10.1155/2010/746052
[23] Jain, A. K. Fundamentals of Digital Image Processing. Prentice Hall: Englewood Cliffs, NJ, USA, 1989.
[24] Sezgin, M., Sankur, B. Survey over Image Thresholding Techniques and Quantitative Performance Evaluation. Journal of Electronic Imaging. 2004, 13(1), 146-168. https://doi.org/10.1117/1.1631315
[25] Wang, Z., Bovik, A. C., Sheikh, H. R., Simoncelli, E. P. Image Quality Assessment: From Error Visibility to Structural Similarity. IEEE Transactions on Image Processing. 2004, 13(4), 600-612. https://doi.org/10.1109/TIP.2003.819861
[26] Moko, A., Eleonu, O. F. An enhanced satellite image compression using hybrid (DWT, DCT and SVD) algorithm. American Journal of Computer Science and Technology. 2021, 4(1), 1-10. https://doi.org/10.11648/j.ajcst.20210401.11
Cite This Article
APA Style
Obite, E., Moko, A. B., Elliot, K. N. (2026). Texture-guided Adaptive HSV Segmentation for Constrained-environment Object Detection: A Classical Pipeline with Per-image Saturation Threshold Modulation. American Journal of Computer Science and Technology, 9(2), 84-94. https://doi.org/10.11648/j.ajcst.20260902.15
Publication date: 2026/06/23. Publisher: Science Publishing Group. ISSN: 2640-012X.