Alignment and Clustering of Breast Cancer Patients by Longitudinal Treatment History


Longitudinal treatment histories may offer clinical researchers valuable information about clinical practice patterns, supporting data exploration, cohort identification, and the discovery of potentially beneficial or harmful practices in the health care community. We present a novel approach to temporal clustering of patient treatment information based on the semantic similarity of longitudinal histories. Using combined breast cancer registry data from two neighboring health care institutions, we constructed a database of longitudinal treatment histories that included surgical procedures, radiation therapy, chemotherapy, and hormone replacement therapy. We then performed pairwise similarity comparisons of treatment histories and used the resulting similarity measures to cluster patients with machine learning methods. An evaluation of our results found that patients clustered by stage of breast cancer and type of treatment provided. We propose that this approach can be applied to the identification of similar cohorts and to the discovery of novel or anomalous clinical practice patterns.
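To make the pipeline of pairwise comparison followed by clustering concrete, the following is a minimal sketch and not the method used in this work. It assumes that each history is an ordered list of treatment labels, that similarity is a normalized Needleman-Wunsch global alignment score with exact label matching (standing in for the semantic similarity measure, which is not specified here), and that clustering is average-linkage agglomeration stopped at a hypothetical threshold. The patient identifiers, treatment labels, scoring parameters, and threshold are all invented for illustration.

```python
from itertools import combinations


def alignment_score(a, b, match=1.0, mismatch=-1.0, gap=-1.0):
    """Needleman-Wunsch global alignment score of two event sequences.

    The scoring parameters are illustrative assumptions, not values
    taken from the paper.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                            prev[j] + gap,       # gap in b
                            curr[j - 1] + gap))  # gap in a
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Rescale the alignment score to [0, 1]; identical histories give 1.0."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    # With the default scores, the alignment score lies in [-longest, longest].
    return (alignment_score(a, b) + longest) / (2 * longest)


def cluster(histories, threshold=0.6):
    """Average-linkage agglomerative clustering on pairwise similarities.

    Merging stops once no pair of clusters has an average similarity of at
    least `threshold` (a hypothetical cut-off chosen for this example).
    """
    ids = list(histories)
    sim = {frozenset(pair): similarity(histories[pair[0]], histories[pair[1]])
           for pair in combinations(ids, 2)}

    def average_link(c1, c2):
        total = sum(sim[frozenset((x, y))] for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    clusters = [[i] for i in ids]
    while len(clusters) > 1:
        (i, j), best = max(
            (((i, j), average_link(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best < threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j
    return clusters


# Invented toy histories; not drawn from the registry data.
histories = {
    "p1": ["lumpectomy", "radiation", "hormone"],
    "p2": ["lumpectomy", "radiation"],
    "p3": ["chemotherapy", "mastectomy", "radiation"],
    "p4": ["chemotherapy", "mastectomy"],
}

print(cluster(histories))  # [['p1', 'p2'], ['p3', 'p4']]
```

In this toy example the two lumpectomy-first histories and the two chemotherapy-first histories form separate groups. A semantic measure would additionally award partial credit to related but non-identical treatments, which exact matching cannot do.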

Proceedings of the 2011 AMIA Annual Symposium