Automated cancer diagnosis based on histopathological images: a systematic survey

Çigdem Demir

Bülent Yener

Clinical decision making

Automated cancer diagnosis

Biomedical image analysis

In traditional cancer diagnosis, pathologists examine biopsies and make diagnostic assessments based largely on cell morphology and tissue distribution. This assessment is subjective, however, and often leads to considerable variability. Computational diagnostic tools, in contrast, enable objective judgments by making use of quantitative measures. This paper presents a systematic survey of the computational steps in automated cancer diagnosis based on histopathology. These steps are (1) image preprocessing to determine the focal areas, (2) feature extraction to quantify the properties of these focal areas, and (3) classification of the focal areas as malignant or not, or identification of their malignancy levels. In Step 1, focal area determination is usually preceded by noise reduction to improve its success; for cellular-level diagnosis, this step also includes nucleus/cell segmentation. Step 2 defines representations of the focal areas that provide distinctive, objective measures. Step 3 designs automated diagnostic systems that operate on these quantitative measures and then estimates their accuracy. In this paper, we detail these computational steps, address their challenges, and discuss remedies for these challenges, emphasizing the importance of constructing benchmark data sets. Such benchmark data sets make it possible to compare different features and system designs and help prevent misleading estimates of system accuracy. This, in turn, makes it possible to determine subsets of distinguishing features, devise new features, and improve the success of automated cancer diagnosis.
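The three steps can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the survey: the 3x3 median filter, the fixed intensity threshold, the two toy features, the nearest-centroid classifier, the leave-one-out accuracy estimate, the synthetic images, and all function names. It shows only how the steps connect, not how a real diagnostic system would implement them.

```python
from statistics import mean, median

def denoise(img):
    """Step 1a (assumed technique): 3x3 median filter; borders use available neighbours."""
    h, w = len(img), len(img[0])
    return [[median(img[rr][cc]
                    for rr in range(max(0, r - 1), min(h, r + 2))
                    for cc in range(max(0, c - 1), min(w, c + 2)))
             for c in range(w)] for r in range(h)]

def focal_mask(img, threshold=100):
    """Step 1b (assumption): dark pixels are treated as the focal area."""
    return [[px < threshold for px in row] for row in img]

def extract_features(img, mask):
    """Step 2: two toy features, focal-area fraction and mean focal intensity."""
    focal = [px for row, mrow in zip(img, mask) for px, m in zip(row, mrow) if m]
    area_fraction = len(focal) / (len(img) * len(img[0]))
    mean_intensity = mean(focal) / 255 if focal else 1.0
    return (area_fraction, mean_intensity)

def featurize(img):
    clean = denoise(img)
    return extract_features(clean, focal_mask(clean))

def train_centroids(features, labels):
    """Step 3a (assumed classifier): one mean feature vector per class."""
    centroids = {}
    for label in set(labels):
        rows = [f for f, l in zip(features, labels) if l == label]
        centroids[label] = tuple(mean(col) for col in zip(*rows))
    return centroids

def predict(centroids, feature):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: dist2(centroids[label], feature))

def leave_one_out_accuracy(features, labels):
    """Step 3b: estimate accuracy only on samples held out from training."""
    hits = 0
    for i in range(len(features)):
        train_f = features[:i] + features[i + 1:]
        train_l = labels[:i] + labels[i + 1:]
        hits += predict(train_centroids(train_f, train_l), features[i]) == labels[i]
    return hits / len(features)

def synthetic_image(dark_block, size=8):
    """Hypothetical toy image: bright background (200) with a dark (40) square block."""
    return [[40 if r < dark_block and c < dark_block else 200
             for c in range(size)] for r in range(size)]

# Toy labels (assumption): larger dark regions stand in for "malignant" samples.
images = [synthetic_image(k) for k in (2, 3, 5, 6)]
labels = ["benign", "benign", "malignant", "malignant"]
features = [featurize(img) for img in images]
accuracy = leave_one_out_accuracy(features, labels)
print(accuracy)
```

Holding out each sample before scoring it reflects the abstract's concern about misleading accuracy estimates: a system evaluated on the same data it was designed on would look better than it is.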

Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY

cs-05-09