Implementation of Cluster Detection Mechanism of Syndromic Surveillance System in EDMON
Permanent link
https://hdl.handle.net/10037/16925
Date
2019-09-17
Type
Master thesis
Author
Yeng, Prosper Kandabongee
Abstract
Background:
Early detection of disease outbreaks has become a global challenge because existing disease surveillance systems do not appear to be efficient enough. As a result, disease outbreaks such as Ebola, heatwaves, malaria and flu still occur with high case fatality rates in some parts of the world. New disease surveillance methods are therefore being explored to enhance outbreak detection capabilities and enable timely interventions. For this reason, the Electronic Disease Monitoring Network (EDMON) was initiated. EDMON is an ongoing research project in syndromic surveillance at the University of Tromsø, The Arctic University of Norway. The broad goal of the project is to detect the spread of contagious diseases at the earliest possible moment, potentially before people know that they have been infected, that is, as early as the incubation stage of infection.
The results are to be visualized on real-time maps and presented through digital communication. The project uses self-recorded health-related data from people with type 1 diabetes as input. The problem is that most syndromic surveillance systems do not detect disease outbreaks early enough. They detect outbreaks during or after the visible-symptoms stage of the infection, which results in a longer time lag. Health management is therefore unable to respond to outbreaks early enough, and this often leads to a high disease burden.
Appropriate algorithms were explored through a systematic review towards the implementation of a cluster detection mechanism in EDMON. In this study, a hybrid of K-nearest neighbour (KNN) and cumulative summation (CUSUM), known as EDMON-Cluster, was proposed and explored to assess whether the combination can compensate for the loss of power to detect outbreaks in geographically disaggregated data.
Objective:
The main aim of EDMON-Cluster was to implement and assess clustering methods for detecting infectious disease outbreaks in EDMON. Specifically, spatial and temporal algorithms were hybridized in the implementation, and their detection performance, in terms of sensitivity, specificity and timeliness, was evaluated. Challenges such as privacy and security, geographical location estimation and visualization were also considered.
Materials and Methods:
Synthetic (simulated) data was generated to contain the required parameters: detections of infected individuals, their geolocations and the respective timestamps of occurrence. A synthetic dataset of the geolocations of postcode centroids was also generated. A K-nearest neighbour spatial classifier was used to cluster the detected infected individuals into postcode areas, based on the distance between the geolocation of each detected individual and the centroids of neighbouring postcodes. Cumulative summation (CUSUM) was then used to implement the temporal aspect of the clustering. A vertical baseline of one week's average was compared against a one-week scanning window. A Z-score was used for thresholding, and a prototyping approach was adopted throughout the study. The performance of the KNN algorithm was assessed by determining the proportion of infections that were accurately classified. The sensitivity and specificity of the CUSUM method were also evaluated by varying the input data through injection of outbreak spikes at various times.
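The two steps described above can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names, the haversine distance, the use of a single nearest neighbour, the reset after each alarm, and the CUSUM parameters `k` (slack) and `h` (alarm threshold) are all assumptions chosen for illustration, and the daily counts are made up.

```python
import math
from statistics import mean, stdev

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def nearest_centroid(point, centroids):
    """Spatial step (assumed k = 1): return the postcode whose centroid is closest."""
    return min(centroids, key=lambda code: haversine_km(point, centroids[code]))

def cusum_alarms(baseline, window, k=0.5, h=3.0):
    """Temporal step: one-sided CUSUM over z-scores of the scanning window.

    baseline: daily counts from the reference week (needs some variation).
    window:   daily counts from the week being scanned.
    k, h:     hypothetical slack and alarm threshold, in standard deviations.
    """
    mu, sigma = mean(baseline), stdev(baseline)
    if sigma == 0:
        raise ValueError("baseline has zero variance; z-score is undefined")
    s, alarms = 0.0, []
    for count in window:
        z = (count - mu) / sigma
        s = max(0.0, s + z - k)   # accumulate only upward deviations
        alarms.append(s > h)
        if s > h:
            s = 0.0               # restart accumulation after an alarm (assumed policy)
    return alarms

# Hypothetical postcode centroids and one detected individual.
centroids = {"9008": (69.65, 18.96), "0150": (59.91, 10.75)}
area = nearest_centroid((69.68, 18.94), centroids)

# Hypothetical daily counts for that area, with a spike injected on days 4 and 5.
alarms = cusum_alarms([2, 3, 2, 3, 2, 3, 2], [2, 3, 2, 9, 10, 3, 2])
```

In this sketch each detection is first assigned to a postcode area, and the per-area daily counts are then scanned for sustained upward shifts relative to the baseline week.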
Results:
The KNN algorithm implemented in EDMON-Cluster recorded 99.52% accuracy when evaluated with a simulated dataset containing geolocation coordinates among other features, whereas the scikit-learn KNN algorithm achieved an accuracy of 93.81% when tested with the same dataset. After injection of spikes of known outbreaks into the simulated data, the CUSUM module was fully sensitive and specific, correctly identifying all outbreak and non-outbreak clusters. Outbreaks were successfully indicated on visual maps and through alarms and SMS alerts. The entire process was estimated to take 12.5 minutes with the simulated data. One-way hashing and de-identification were among the data anonymization techniques adopted in the study to protect privacy, as recommended by the General Data Protection Regulation (GDPR).
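As an illustration of one-way hashing for de-identification, the following Python sketch replaces a user identifier with a keyed digest. The abstract does not state which hash function was used or whether it was keyed, so HMAC-SHA256, the function name and the example values are assumptions.

```python
import hashlib
import hmac

def pseudonymize(user_id: str, secret_key: bytes) -> str:
    """Return a one-way pseudonym for user_id (assumed scheme: HMAC-SHA256).

    The same user_id and key always give the same pseudonym, so records can
    still be linked, but the original identifier cannot be read back from it.
    """
    return hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()

# Hypothetical key and identifiers, for illustration only.
key = b"example-secret-key"
token_a = pseudonymize("patient-001", key)
token_b = pseudonymize("patient-002", key)
```

A keyed hash is used here because plain hashes of short identifiers can be reversed by enumerating candidate inputs; the key would have to be stored separately from the data.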
Conclusion:
KNN and CUSUM algorithms were combined into a spatiotemporal measure known as EDMON-Cluster. A prototyping approach was adopted with synthetic data. Given the strong performance of EDMON-Cluster on that data, there is considerable motivation to evaluate the paired algorithms further with real datasets towards practical implementation in EDMON. EDMON-Cluster appears to be a potentially useful method in comparison with other surveillance methods. Finding suitable methods for anonymizing geolocation attributes that balance protecting the privacy and confidentiality of diabetes subjects against the data requirements of disease surveillance for the public good remains a challenge.
Publisher
UiT The Arctic University of Norway
Copyright 2019 The Author(s)