The Cholera Risk Profile Demonstration

By: Alta de Waal and Anneline Adams, CSIR South Africa, May, 2003


Bayesian network for cholera risk profile

Cholera is an acute intestinal infection caused by the bacterium Vibrio cholerae. One of its symptoms is watery diarrhea, which can quickly lead to severe dehydration and death if treatment is not given promptly. The cholera epidemic of 2000/2001 gripped several rural parts of South Africa and developed into the most serious cholera epidemic the country had yet experienced. With the movement of people between provinces in South Africa and between southern African countries, the cholera bacterium spread to seven of the nine provinces. Such a rapid spread calls for more effective planning and management of resources.

One way to support resource planning and management is to predict outbreaks of cholera. Such a predictive model would be very complex, because the causes of cholera are either unknown or subject to high levels of uncertainty. Our approach instead models the vulnerability of a village, in order to learn why one village can survive a cholera outbreak in a region while a neighboring village suffers severely. This understanding of vulnerability can help decision-makers allocate funds to the resources that will most improve the vulnerability index of a village. The model will also evaluate the relevance and impact of awareness programmes.

For this approach we need to extract the knowledge that is implicit in the minds of Subject Matter Experts (SMEs). The SMEs comprise fieldworkers and cholera experts, and their knowledge is complemented by sources such as historical data. This knowledge is integrated in a Bayesian network. Finally, the integrated model must be implemented so that it is easy for fieldworkers with little computer experience to use on previous-generation computers.

Building the model

The first task in building the model was to identify measurable criteria for the vulnerability profile of a village that are meaningful to decision-makers. The intent was to establish the attributes that decision-makers and fieldworkers would use to evaluate the vulnerability of a village. While this may seem like an obvious starting point, inadequate attention to this step may lead to an incomplete analysis, or to an analysis of the wrong problem with respect to cholera outbreaks. The facilitation part of our process was used to elicit expert knowledge. The experts reached consensus on five criteria, the so-called model endpoints (Borsuk et al., 2002), to characterise the vulnerability profile of a village: Hygiene, Status of water source, Use of medical facilities, Sanitation and Community awareness. It was decided that each model endpoint would be scored on a scale from 0 (zero) to 10 (ten). Zero represents the highest risk (lowest coping index) and ten represents the lowest risk (highest coping index).

Each of the model endpoints is influenced by a set of variables. These variables are habits of the villagers that they have control over (for example: storage of drinking water, sanitation habits) as well as external influences that they have no control over (for example: geographical position of the village, availability of medical facilities).


In general, remote areas are hit hardest by cholera outbreaks. Little or no infrastructure exists in these areas, which makes access to computers almost impossible. In some cases, fieldworkers with little or no computer background must be able to understand and use the model. This implies that the model must be easy to use on an inexpensive computer.

We wrote a front-end for the model that enables the decision-maker to design an optimal risk profile by choosing appropriate risk indexes for the five criteria. This evidence is propagated throughout the network to update the probabilities of all the other nodes. The decision-maker and fieldworkers can then see what is necessary in communities in order to obtain an ideal risk profile. Reasoning in the opposite direction is also possible: given information gathered by fieldworkers, what is the real risk profile of a community?
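The two directions of reasoning can be illustrated with a minimal sketch of a two-node network, in which the habit "Cover drinking water" influences the endpoint "Status of water source". This is not the HUGIN model itself: the probabilities, the three coarse index bands and the function names are hypothetical placeholders chosen for illustration only.

```python
# Toy network: CoverWater -> WaterStatus.
# All numbers are made-up placeholders, NOT values from the actual model.

# Prior belief over the habit "Cover drinking water".
p_cover = {"yes": 0.6, "no": 0.4}

# Conditional probability table P(status band | cover).
# Only three index bands are used here to keep the sketch short (assumption).
p_status_given_cover = {
    "yes": {"0-2": 0.1, "4-6": 0.3, "8-10": 0.6},
    "no": {"0-2": 0.5, "4-6": 0.4, "8-10": 0.1},
}


def predict_status(cover_belief):
    """From observed habits to the risk profile: P(status band)."""
    bands = ["0-2", "4-6", "8-10"]
    return {
        band: sum(
            cover_belief[c] * p_status_given_cover[c][band] for c in cover_belief
        )
        for band in bands
    }


def infer_cover(status_band):
    """From a chosen endpoint index to the habit: P(cover | status band)."""
    # Bayes' rule: joint probability, then normalise over the habit states.
    joint = {c: p_cover[c] * p_status_given_cover[c][status_band] for c in p_cover}
    total = sum(joint.values())
    return {c: value / total for c, value in joint.items()}


# What does a desired "8-10" water-source index imply about the habit?
needed = infer_cover("8-10")

# What risk profile follows if fieldworkers report that water is not covered?
observed = predict_status({"yes": 0.0, "no": 1.0})
```

With these placeholder numbers, selecting the "8-10" band raises the probability that drinking water is covered from 0.6 to about 0.9. A real network applies the same idea across many connected nodes, which is what the inference engine handles.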

Model documentation

Below is a set of HUGIN widgets for interacting with the model (click on the probability bar to instantiate a node or remove evidence):

The risk indexes are discretized in order to simplify the implementation of the Bayesian network. An index of "0-2" indicates a poor score and an index of "8-10" indicates a high or good score. A convenient way to display the results is a spider (polar) plot. A polar plot has the useful property that the area within the curve is indicative of the total score: the larger the area, the better the risk profile of a community or village. Linear plots carry the same information less effectively. The implementation on this website cannot propagate information in the network back to the polar plot, although in principle this is possible.
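As an illustration of the discretization, the sketch below maps a score on the 0 to 10 scale to a bin label. Only the labels "0-2" and "8-10" appear in the text; the three middle bins, the handling of boundary values and the function name are assumptions.

```python
import bisect

# Boundaries between bins. "0-2" and "8-10" are named in the text;
# the middle bins "2-4", "4-6" and "6-8" are assumed for illustration.
BIN_EDGES = [2, 4, 6, 8]
BIN_LABELS = ["0-2", "2-4", "4-6", "6-8", "8-10"]


def discretize(index):
    """Return the bin label for a risk index between 0 and 10.

    A value that falls exactly on a boundary is placed in the higher bin
    (an arbitrary choice made for this sketch).
    """
    if not 0 <= index <= 10:
        raise ValueError("risk index must lie between 0 and 10")
    return BIN_LABELS[bisect.bisect_right(BIN_EDGES, index)]
```

For example, `discretize(1.5)` returns "0-2" and `discretize(9)` returns "8-10".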

Status of water source: Cover drinking water; Water sources

Hygiene: Wash hands before handling food; Method of drying hands; Rubbish disposal; Infant faecal disposal

Community awareness: Perception of awareness programmes; Understanding of correlation between bad hygiene and diseases

Use of medical facilities: Spread of mis-information; Availability of medical facilities

Sanitation: Toilet in household; Child defecation

Below is a polar plot of the variables Use of medical facilities, Community awareness, Hygiene, Sanitation and Status of water source. Clicking on one of the ticks on a dimension instantiates the corresponding variable to reflect the selection.
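The area enclosed by such a polar plot can be computed directly from the five scores. The following is a minimal sketch, assuming the axes are equally spaced and each score is a single number on the 0 to 10 scale (for instance the midpoint of its bin); the function name is hypothetical.

```python
import math


def polar_area(scores):
    """Area of the polygon traced by the scores on equally spaced axes.

    Neighbouring axes are separated by an angle of 2*pi/n, so each pair of
    adjacent scores r1, r2 spans a triangle of area 0.5 * r1 * r2 * sin(2*pi/n).
    """
    n = len(scores)
    if n < 3:
        raise ValueError("a polar plot needs at least three axes")
    wedge = math.sin(2 * math.pi / n)
    return 0.5 * wedge * sum(scores[i] * scores[(i + 1) % n] for i in range(n))


# Endpoint order as listed for the plot: Use of medical facilities,
# Community awareness, Hygiene, Sanitation, Status of water source.
best_case = polar_area([10, 10, 10, 10, 10])
mixed_case = polar_area([9, 5, 3, 1, 7])
```

Because the area is built from products of neighbouring scores, it depends on the order of the axes and falls sharply when adjacent endpoints both score low. It is therefore indicative of the total score rather than equal to it.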


References

Fenton, N. (1999). Bayesian Probability. The Centre for Software Reliability.

Jensen, F.V. (1996). An Introduction to Bayesian Networks. Springer, New York.

Pearl, J. (1988). Probabilistic Reasoning in Intelligent Systems. Morgan Kaufmann Publishers, Inc., San Francisco, California.

Borsuk, M.E., Stow, C.A. and Reckhow, K.H. (2002). Ecological Prediction Using Causal Bayesian Networks: A Case Study of Eutrophication Management in the Neuse River Estuary. Draft manuscript.

Weekly report on the cholera situation and emergency water supply and sanitation interventions, 29 January 2001 - 2 February 2001. Head Office Unit, National DWAF Office, Environmentek, CSIR.

World Health Organization. Fact Sheet on Cholera. www.who.int/health-topics/cholera.htm

Useful references for those interested in Bayesian networks include:

Kjærulff, U. B. and Madsen, A. L. (2013) Bayesian Networks and Influence Diagrams: A Guide to Construction and Analysis. Springer, Second Edition.

Contact information

Alta de Waal, CSIR South Africa, alta.dewaal (a)