A comprehensive study on automatic non-informative frame detection in colonoscopy videos
Abstract
Despite advances in healthcare technology, conventional colonoscopy remains the gold-standard method for detecting colon abnormalities. Because of the folded structure of the intestine and visual disturbances caused by artifacts, it can be hard for specialists to detect abnormalities during the procedure. Frames that contain artifacts such as specular reflection, improper contrast from insufficient or excessive illumination, gastric juice, bubbles, or residuals should be detected to increase the rate of accurate diagnosis. In this work, both conventional machine learning and transfer learning methods are used to detect non-informative frames in colonoscopy videos. The conventional machine learning part uses five types of texture features: gray level co-occurrence matrix (GLCM), gray level run length matrix (GLRLM), neighborhood gray-tone difference matrix (NGTDM), focus measure operators (FMOs), and first-order statistics. In addition, we utilized eight transfer learning models: AlexNet, SqueezeNet, GoogleNet, ShuffleNet, ResNet50, ResNet18, NasNetMobile, and MobileNet. For the conventional machine learning part, the combination of FMOs and a decision tree gave the best accuracy and F-measure, at almost 89% and 0.79, respectively. Among the transfer learning models, AlexNet (99.85%) and SqueezeNet (98.80%) achieved the highest performance metrics. This study shows the potential of both transfer learning and conventional machine learning algorithms to provide fast and accurate non-informative frame detection during colonoscopy, which may serve as the initial step in automatically identifying and classifying colon-related diseases to help guide physicians.
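To illustrate what a focus measure operator computes, the following is a minimal sketch of one widely used operator, the variance of the Laplacian. The abstract does not state which FMOs were used, so this particular operator, the function names `variance_of_laplacian` and `box_blur`, and the synthetic test image are all assumptions made for illustration, not the authors' pipeline.

```python
import numpy as np

def variance_of_laplacian(gray):
    """Illustrative focus measure: variance of the 4-neighbour discrete Laplacian.

    Sharp frames have strong local intensity changes and score high;
    blurred frames score low. (Assumed operator, not taken from the study.)
    """
    g = np.asarray(gray, dtype=np.float64)
    # Discrete Laplacian evaluated on interior pixels only.
    lap = (g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
           - 4.0 * g[1:-1, 1:-1])
    return float(lap.var())

def box_blur(gray, k=5):
    """k x k mean filter, used here only to simulate a defocused frame."""
    g = np.asarray(gray, dtype=np.float64)
    pad = k // 2
    padded = np.pad(g, pad, mode="edge")
    out = np.zeros_like(g)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + g.shape[0], dx:dx + g.shape[1]]
    return out / (k * k)

# Synthetic stand-in for a grayscale frame (hypothetical data).
rng = np.random.default_rng(0)
sharp = rng.integers(0, 256, size=(64, 64)).astype(np.float64)
blurred = box_blur(sharp)

sharp_score = variance_of_laplacian(sharp)
blurred_score = variance_of_laplacian(blurred)
```

In a setup like the one described, a scalar score of this kind would be one entry in a per-frame feature vector passed to a classifier such as a decision tree; the blurred frame scores lower than the sharp one.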