A Probabilistic Approach to Extract Qualitative Knowledge for Early Prediction of Gestational Diabetes

Athresh Karanam, Alexander L. Hayes, Harsha Kokel, David M. Haas, Predrag Radivojac, Sriraam Natarajan

Accepted at the 2021 International Artificial Intelligence in Medicine (AIME) Conference.

Paper

Abstract

Qualitative influence statements are often provided a priori to guide learning; we answer a challenging reverse task and automatically extract them from a learned probabilistic model. We apply our Qualitative Knowledge Extraction method toward early prediction of gestational diabetes on clinical study data. Our empirical results demonstrate that the extracted rules are both interpretable and valid.

Overview

“Qualitative Influence Statements” are a concise way to express how variables interact under all possible values that the variables can take. Previous work incorporated these as inductive bias to guide learning, but we wanted to explore whether these statements could be learned directly from data.

Figure: A Bayesian Network of eight variables learned for this publication. The conditional independence structure of the graph shows that GDM directly depends on Age, BMI, and Race.

We first learned a Bayesian Network for the domain using the PC causal learning algorithm. The variables represent what was known in the first month of pregnancy, and the dependent variable Gestational Diabetes Mellitus (GDM) is diagnosed near the conclusion of a pregnancy at the nine-month mark.
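
The PC algorithm starts from a fully connected graph and drops an edge whenever a statistical test finds the two variables (conditionally) independent. As a rough illustration of that building block only, the sketch below computes a Pearson chi-square statistic on a 2x2 contingency table. The counts, the variable pairing, and the choice of test are assumptions made for illustration; this page does not state which independence test or settings were used for the published network.

```python
# Minimal sketch of a marginal independence test of the kind the PC
# algorithm relies on. All counts below are hypothetical.

def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table (list of rows).

    Assumes every row total and column total is positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

# Critical value of the chi-square distribution with 1 degree of freedom
# at the 0.05 significance level.
CRITICAL_VALUE_DF1 = 3.841

# Hypothetical counts: rows = high BMI (no, yes), columns = GDM (no, yes).
counts = [[30, 10],
          [10, 30]]

stat = chi_square_statistic(counts)
keep_edge = stat > CRITICAL_VALUE_DF1  # dependence detected, so keep the edge
```

In the full algorithm this test is repeated for growing conditioning sets, and the remaining edges are then oriented; neither step is shown here.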

Here we incorporated a probabilistic model in the form of a Bayesian Network and developed a technique we named QuaKE (for Qualitative Knowledge Extraction) to extract rules like the following:

\[ \text{BMI}_{\prec}^{M+} \text{GDM} \]

This notation is read as “Risk of GDM monotonically increases with BMI,” meaning that higher BMI directly translates into a higher risk of gestational diabetes.
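
For a binary outcome such as GDM, one common way to make this statement concrete is to require that P(GDM | BMI) never decreases as BMI moves to a higher level, in every context of the other parent variables. The sketch below checks that condition on a small conditional probability table. It is not the QuaKE procedure itself; the function name, the BMI levels, the age contexts, and all probabilities are hypothetical values chosen for illustration.

```python
# Minimal sketch: test whether a binary target's probability is
# non-decreasing in an ordered parent variable, for every context.
# All names and numbers are hypothetical.

def is_monotonic_increasing(cpt, levels, tol=0.0):
    """Return True if P(target=1) never drops as the parent level rises.

    cpt    -- dict mapping (level, context) -> P(target=1 | level, context)
    levels -- parent levels ordered from lowest to highest
    tol    -- slack allowed before a decrease counts as a violation
    """
    contexts = {context for (_, context) in cpt}
    for context in contexts:
        probs = [cpt[(level, context)] for level in levels]
        for lower, higher in zip(probs, probs[1:]):
            if higher < lower - tol:
                return False
    return True

bmi_levels = ["normal", "overweight", "obese"]

# Hypothetical P(GDM = 1 | BMI level, age group).
cpt = {
    ("normal", "age<35"): 0.02,
    ("overweight", "age<35"): 0.04,
    ("obese", "age<35"): 0.08,
    ("normal", "age>=35"): 0.05,
    ("overweight", "age>=35"): 0.09,
    ("obese", "age>=35"): 0.15,
}

bmi_increases_gdm = is_monotonic_increasing(cpt, bmi_levels)
```

A single context in which the probability drops is enough to reject the statement, which is why the check loops over every context rather than averaging over them.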

After extracting these statements over all of our variables, we asked how well they aligned with previous knowledge, and found the precision of our method was around 91% across five validation folds.
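
Precision here is the fraction of extracted statements that agree with prior knowledge. A minimal sketch of that calculation, averaged over folds, is shown below. The statement encoding, the exact-match criterion, and every statement listed are assumptions for illustration; they are not the study's data and do not reproduce its reported figure.

```python
# Minimal sketch: precision of extracted qualitative statements against a
# reference set, averaged over folds. All statements are hypothetical.

def precision(extracted, reference):
    """Fraction of extracted statements that also appear in the reference."""
    if not extracted:
        return 0.0
    return len(extracted & reference) / len(extracted)

# Each statement is (parent, influence type, target).
reference = {
    ("BMI", "M+", "GDM"),
    ("Age", "M+", "GDM"),
}

# Hypothetical statements extracted in two validation folds.
folds = [
    {("BMI", "M+", "GDM"), ("Age", "M+", "GDM")},
    {("BMI", "M+", "GDM"), ("Age", "M-", "GDM")},
]

per_fold = [precision(extracted, reference) for extracted in folds]
mean_precision = sum(per_fold) / len(per_fold)
```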

A longer write-up is available on the Starling Lab QuaKE project webpage.

Spotlight Presentation

The spotlight was part of the AIME 2021 conference.

Slides

The slides may be viewed online or downloaded as a PDF.

Citation

Please use the following citation when building on the ideas in this work:

Karanam A., Hayes A.L., Kokel H., Haas D.M., Radivojac P., Natarajan S. (2021) A Probabilistic Approach to Extract Qualitative Knowledge for Early Prediction of Gestational Diabetes. In: Tucker A., Henriques Abreu P., Cardoso J., Pereira Rodrigues P., Riaño D. (eds) Artificial Intelligence in Medicine. AIME 2021. Lecture Notes in Computer Science, vol 12721. Springer, Cham. https://doi.org/10.1007/978-3-030-77211-6_59

@inproceedings{karanam2021probabilistic,
  author="Karanam, Athresh and Hayes, Alexander L. and Kokel, Harsha and Haas, David M. and Radivojac, Predrag and Natarajan, Sriraam",
  editor="Tucker, Allan and Henriques Abreu, Pedro and Cardoso, Jaime and Pereira Rodrigues, Pedro and Ria{\~{n}}o, David",
  title="A Probabilistic Approach to Extract Qualitative Knowledge for Early Prediction of Gestational Diabetes",
  booktitle="Artificial Intelligence in Medicine",
  year="2021",
  publisher="Springer International Publishing",
  address="Cham",
  pages="497--502",
  isbn="978-3-030-77211-6"
}