In recent years, polarimetric imaging has been developed for various biological applications, including tissue morphological characterization and cancer-stage detection. However, training classification models on the characteristics of polarization states requires a consistent and standardized dataset of polarimetric images.
This study presents a dataset of colorectal cancer polarimetric images designated as ColoPola, which is intended to facilitate research efforts in the field. The dataset consists of 572 sample slices (288 healthy and 284 malignant). For each slice, 36 polarimetric images corresponding to different polarization states are provided. Thus, ColoPola contains 20,592 polarimetric images, of which 10,368 correspond to healthy samples and 10,224 to malignant samples.
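The image counts quoted above follow directly from the number of slices per class and the 36 polarization states per slice. A minimal sketch that checks this arithmetic is given below; the names `SLICES`, `IMAGES_PER_SLICE`, and `image_counts` are illustrative and are not part of the dataset or any accompanying code.

```python
# Figures taken from the abstract; all identifiers here are illustrative.
IMAGES_PER_SLICE = 36
SLICES = {"healthy": 288, "malignant": 284}


def image_counts(slices, images_per_slice):
    """Return per-class image counts and the overall image total."""
    per_class = {label: n * images_per_slice for label, n in slices.items()}
    return per_class, sum(per_class.values())


per_class, total = image_counts(SLICES, IMAGES_PER_SLICE)
print(per_class, total)
```

The two classes differ by only four slices, so the dataset is close to balanced.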
Five deep learning models (CNN, CNN_2, EfficientFormerV2, DenseNet, and EfficientNetV2) are evaluated on the dataset and obtain F1 scores of 0.870, 0.862, 0.908, 0.903, and 0.965, respectively, on the testing set. EfficientNetV2 achieves the best performance, with all performance metrics exceeding 0.95 on both the validation set and the testing set. Overall, the results suggest that ColoPola has significant potential to support polarimetric optical imaging-based diagnosis of colorectal cancer in clinical practice.
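For reference, the F1 score reported above is the harmonic mean of precision and recall. The sketch below computes it from confusion-matrix counts using the standard definition. The counts in the example are hypothetical and are not results from this study, and treating the malignant class as the positive class is an assumption.

```python
def f1_score(tp, fp, fn):
    """F1 score from true-positive, false-positive, and false-negative counts."""
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, with malignant assumed to be the positive class.
example = f1_score(tp=8, fp=2, fn=2)
print(round(example, 3))
```

Because F1 ignores true negatives, it is commonly reported alongside other metrics such as accuracy, precision, and recall.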