Breast Cancer Prediction from Fine Needle Aspiration Data using ML
Abstract
Breast cancer is cancer that develops in the cells of the breast. It is the second most commonly diagnosed cancer in women after skin cancer. About 1 in 8 women in the US will develop invasive breast cancer in their lifetime (https://www.cdc.gov/cancer/breast/statistics/index.htm). About 85% of breast cancers occur in women who have no family history of breast cancer. These cases arise from genetic mutations acquired through aging and life in general, rather than from inherited mutations.
Breast cancer diagnosis involves a biopsy of a sample from the suspected area. Fine needle biopsy is the standard biopsy procedure; it removes tissue from the suspected area using a fine needle. In many cases, however, the result of the fine needle biopsy alone is not sufficient to confirm whether the tumor is benign or malignant. The next level of diagnosis involves a core needle biopsy or a surgical biopsy, both of which have drawbacks. They are more invasive and often carry a risk of infection and bruising. Surgical biopsy has a longer and more uncomfortable recovery time, and the amount of tissue removed can change the look and feel of the breast.
If we can predict whether a tumor is malignant or benign from fine needle aspiration (FNA) data and identify the key attributes, then further, more complex diagnostic methods such as surgical biopsy can be avoided. We used the Wisconsin breast cancer data set available on Kaggle for the analysis. We apply standard machine learning methods such as logistic regression, support vector machines, and decision trees to the data to gain insights and predict the tumor type.
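The approach described above can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes scikit-learn is installed and uses the copy of the Wisconsin diagnostic data set bundled with scikit-learn (`load_breast_cancer`) instead of the Kaggle file. The train/test split, feature scaling, and hyperparameters (`max_iter`, `kernel`, `max_depth`, `random_state`) are illustrative assumptions, not values taken from the paper.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: scikit-learn's bundled Wisconsin diagnostic data set stands in
# for the Kaggle CSV. In this copy, target 0 = malignant and 1 = benign.
data = load_breast_cancer()
X, y = data.data, data.target

# Hold out a stratified test set (split ratio and seed are illustrative).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The three model families named in the abstract. Scaling is added for the
# two models that are sensitive to feature magnitudes.
models = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "svm": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "decision_tree": DecisionTreeClassifier(max_depth=4, random_state=0),
}

# Fit each model and record its accuracy on the held-out test set.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)
    print(f"{name}: test accuracy = {scores[name]:.3f}")

# One simple way to look for key attributes: rank features by the
# impurity-based importances of the fitted decision tree.
tree = models["decision_tree"]
ranked = sorted(
    zip(data.feature_names, tree.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
top_features = [str(name) for name, _ in ranked[:5]]
print("Top decision-tree features:", top_features)
```

Accuracy on a single split is only a rough indicator. For a diagnostic setting, cross-validation and metrics that separate false negatives from false positives (for example recall on the malignant class) would likely be more informative.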
Presented and Published in: Proceedings of the Sixth International Conference on Business Analytics and Intelligence, December 2018, IISc, India.