Unsupervised machine learning of integrated health and social care data from the Macmillan Improving the Cancer Journey service in Glasgow

Kean Lee Kang, Margaret Greer, James L. Bown, Janice Preston, Judith Mabelis, Leigh-Anne Hepburn, Miriam Fisher, Ruth E. Falconer, Sandra McDermott, Stuart Deed

Research output: Contribution to journal › Meeting Abstract

Abstract

Background: Improving the Cancer Journey (ICJ) was launched in 2014 by Glasgow City Council and Macmillan Cancer Support. As part of routine service, data are collected on ICJ users, including demographic and health information, results from holistic needs assessments, and quality-of-life scores as measured by EQ-5D health status. There are also data on the number and type of referrals made and feedback from users on the overall service. By applying artificial intelligence and interactive visualization technologies to these data, we seek to improve service provision and optimize resource allocation.

Method: An unsupervised machine-learning algorithm was deployed to cluster the data. The classical k-means algorithm was extended with the k-modes technique for categorical data, and the gap heuristic automatically identified the number of clusters. The resulting clusters are used to summarize complex data sets and produce three-dimensional visualizations of the data landscape. Furthermore, the traits of new ICJ clients are predicted by approximately matching their details to the nearest existing cluster center.
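
The clustering and nearest-centre matching described in the Method can be illustrated with a minimal k-modes sketch. This is not the authors' implementation: the toy records, the column meanings (deprivation band, employment status, housing type), the fixed k = 2 and the deterministic initialisation are all assumptions made for illustration. The gap heuristic for choosing the number of clusters and the handling of numeric attributes are not reproduced here.

```python
from collections import Counter

def nearest(record, centres, cols=None):
    """Index of the closest centre by simple matching distance.

    Only the columns listed in `cols` are compared, so a record with
    unknown attributes can still be matched on the known ones.
    """
    cols = range(len(record)) if cols is None else cols
    dists = [sum(record[c] != centre[c] for c in cols) for centre in centres]
    return dists.index(min(dists))

def k_modes(records, k, n_iter=20):
    """Cluster categorical tuples; each centre is the per-column mode."""
    # Assumption: deterministic init from the first k distinct records
    # (random initialisation is more common in practice).
    centres = []
    for r in records:
        if r not in centres:
            centres.append(r)
        if len(centres) == k:
            break
    for _ in range(n_iter):
        labels = [nearest(r, centres) for r in records]
        new_centres = []
        for j, old in enumerate(centres):
            members = [r for r, lab in zip(records, labels) if lab == j]
            if not members:
                new_centres.append(old)  # keep an empty cluster's centre
                continue
            # zip(*members) iterates column by column; take each column's mode.
            new_centres.append(
                tuple(Counter(col).most_common(1)[0][0] for col in zip(*members))
            )
        if new_centres == centres:
            break
        centres = new_centres
    labels = [nearest(r, centres) for r in records]
    return centres, labels

# Hypothetical toy data: (deprivation band, employment status, housing type).
records = [
    ("high", "unemployed", "rented"),
    ("low", "employed", "owned"),
    ("high", "unemployed", "rented"),
    ("low", "employed", "owned"),
    ("high", "retired", "rented"),
    ("low", "retired", "owned"),
]
centres, labels = k_modes(records, k=2)

# Predict an unknown trait (employment, column 1) for a new client by
# matching the known columns (0 and 2) to the nearest cluster centre.
new_client = ("high", None, "rented")
cluster = nearest(new_client, centres, cols=[0, 2])
predicted_employment = centres[cluster][1]
```

Matching on a subset of columns is one plausible reading of "approximately matching their details": the missing trait is simply read off the centre of the cluster that best fits the known traits.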

Results: Cross-validation showed the model’s effectiveness over a wide range of traits. For example, the model can predict marital status, employment status and housing type with an accuracy between 2.4 and 4.8 times greater than random selection. One of the most interesting preliminary findings is that area deprivation (measured through the Scottish Index of Multiple Deprivation, SIMD) is a better predictor of an ICJ client’s needs than primary diagnosis (cancer type).
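
To make the "times greater than random selection" figure concrete, the sketch below computes such a ratio on made-up labels. The abstract does not define its random baseline; the code assumes uniform guessing among the observed categories, and the function name and example data are hypothetical.

```python
def lift_over_random(y_true, y_pred):
    """Accuracy divided by the accuracy of uniform random guessing.

    Assumption: "random selection" means picking one of the observed
    categories uniformly at random, so its expected accuracy is
    1 / (number of categories).
    """
    n_categories = len(set(y_true))
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return accuracy / (1 / n_categories)

# Hypothetical held-out marital-status labels and model predictions.
y_true = ["single", "married", "widowed", "divorced",
          "married", "single", "married", "widowed"]
y_pred = ["single", "married", "widowed", "married",
          "married", "single", "married", "single"]

lift = lift_over_random(y_true, y_pred)  # 6 of 8 correct across 4 categories
```

Under this reading, a trait with four categories and 75% accuracy gives a ratio of 3. A baseline weighted by category frequency would give a different ratio, so the definition matters when interpreting the reported range.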

Conclusion: A key strength of this system is its ability to rapidly ingest new data on its own and derive new predictions from those data. This means the model can guide service provision by forecasting demand based on actual or hypothesized data. The aim is to provide intelligent person-centered recommendations. The machine-learning model described here is part of a prototype software tool currently under development for use by the cancer support community.

Disclosure: Funded by Macmillan Cancer Support

Original language: English
Article number: 4
Number of pages: 1
Journal: British Journal of Cancer
Volume: 119
DOI: 10.1038/s41416-018-0299-z
Publication status: Published - 8 Nov 2018
Event: 2018 NCRI Cancer Conference - SEC Centre, Glasgow, United Kingdom
Duration: 4 Nov 2018 to 6 Nov 2018
http://conference.ncri.org.uk/wp-content/uploads/2018/11/2018-Programme-PDF-version.pdf

Fingerprint

Delivery of Health Care
Neoplasms
Aptitude
Needs Assessment
Resource Allocation
Unsupervised Machine Learning
Artificial Intelligence
Disclosure
Marital Status
Health Status
Referral and Consultation
Software
Quality of Life
Demography
Technology
Health

Cite this

Kang, Kean Lee ; Greer, Margaret ; Bown, James L. ; Preston, Janice ; Mabelis, Judith ; Hepburn, Leigh-Anne ; Fisher, Miriam ; Falconer, Ruth E. ; McDermott, Sandra ; Deed, Stuart. / Unsupervised machine learning of integrated health and social care data from the Macmillan Improving the Cancer Journey service in Glasgow. In: British Journal of Cancer. 2018 ; Vol. 119.
@article{67717d42b6fc4221a53e432c8085b5f8,
title = "Unsupervised machine learning of integrated health and social care data from the Macmillan Improving the Cancer Journey service in Glasgow",
author = "Kang, {Kean Lee} and Margaret Greer and Bown, {James L.} and Janice Preston and Judith Mabelis and Leigh-Anne Hepburn and Miriam Fisher and Falconer, {Ruth E.} and Sandra McDermott and Stuart Deed",
year = "2018",
month = "11",
day = "8",
doi = "10.1038/s41416-018-0299-z",
language = "English",
volume = "119",
journal = "British Journal of Cancer",
issn = "0007-0920",
publisher = "Nature Publishing Group",

}

TY - JOUR

T1 - Unsupervised machine learning of integrated health and social care data from the Macmillan Improving the Cancer Journey service in Glasgow

AU - Kang, Kean Lee

AU - Greer, Margaret

AU - Bown, James L.

AU - Preston, Janice

AU - Mabelis, Judith

AU - Hepburn, Leigh-Anne

AU - Fisher, Miriam

AU - Falconer, Ruth E.

AU - McDermott, Sandra

AU - Deed, Stuart

PY - 2018/11/8

Y1 - 2018/11/8

U2 - 10.1038/s41416-018-0299-z

DO - 10.1038/s41416-018-0299-z

M3 - Meeting Abstract

VL - 119

JO - British Journal of Cancer

JF - British Journal of Cancer

SN - 0007-0920

M1 - 4

ER -