Agreement analysis investigates to what extent different measurements agree with each other. Typical situations are:

- Comparing two different measurements, as when you evaluate a diagnostic test against a gold standard.
- Estimating inter-rater agreement (whether different users arrive at the same estimate).
- Estimating test-retest agreement (whether the same user arrives at the same estimate when the testing is repeated).

The most suitable statistical approach depends on the level of measurement (or scale of measurement) that is appropriate for the investigated variable:

- Agreement between variables measured with the nominal scale
  - The outcome is dichotomous (only two possible outcomes)
    - Kappa coefficient
    - Sensitivity / Specificity
    - Likelihood ratio
    - Predictive value of tests
    - Etiologic predictive value (when there is no gold standard)
  - The outcome can have more than two possible outcomes
    - Kappa coefficient
- Agreement between variables measured with the ordinal scale
  - Kappa coefficient
  - Weighted Kappa coefficient
- Agreement between variables measured with an interval scale or a ratio scale
  - Limits of agreement and/or Bland-Altman plot
  - Intraclass correlation (ICC)
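
As an illustration of the Kappa coefficient listed above, the following is a minimal sketch of the unweighted (Cohen's) kappa for two raters, written in plain Python. The function name `cohen_kappa` and the example labels are hypothetical and chosen only for this illustration; a statistics package would normally be used in practice.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: share of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance: for each category, multiply the two
    # raters' marginal proportions, then sum over categories.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    p_expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if p_expected == 1:
        raise ValueError("kappa is undefined when both raters use a single category")
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical ratings of ten patients by two raters (illustrative data only).
rater_a = ["pos", "pos", "neg", "neg", "pos", "neg", "neg", "neg", "pos", "neg"]
rater_b = ["pos", "neg", "neg", "neg", "pos", "neg", "pos", "neg", "pos", "neg"]
kappa = cohen_kappa(rater_a, rater_b)
print(f"kappa = {kappa:.2f}")
```

In this made-up example the raters agree on 8 of 10 patients, but about half of that agreement would be expected by chance, so kappa ends up at roughly 0.58 rather than 0.80.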

Always require a 95% confidence interval for estimates of sensitivity, specificity, likelihood ratios and predictive values! Point estimates without a confidence interval are useless.
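
A confidence interval for a proportion such as sensitivity or specificity can be computed in several ways. The sketch below uses the Wilson score interval, which is an assumption made for this example and not a method prescribed by the text; the function name `wilson_interval` and the counts are likewise hypothetical.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    if total <= 0 or not 0 <= successes <= total:
        raise ValueError("need total > 0 and 0 <= successes <= total")
    p = successes / total
    z2 = z * z
    denominator = 1 + z2 / total
    centre = (p + z2 / (2 * total)) / denominator
    half_width = z * math.sqrt(p * (1 - p) / total + z2 / (4 * total * total)) / denominator
    return centre - half_width, centre + half_width


# Hypothetical data: the test is positive in 45 of 50 patients who have the
# disease, so the point estimate of sensitivity is 0.90.
low, high = wilson_interval(45, 50)
print(f"sensitivity 0.90, 95% CI {low:.2f} to {high:.2f}")
```

With only 50 diseased patients the interval runs from about 0.79 to about 0.96, which shows how much uncertainty a bare point estimate of 0.90 would hide.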

## Estimating the clinical value of a test

Sensitivity and specificity inform us about the health of the diagnostic test being evaluated. This is great if you are a manufacturer of a diagnostic test but of limited value if you are a doctor. Likelihood ratios tell us how much information a test adds, and predictive values inform us about the health of our patient (they give the probability that the individual has what we are looking for). The tables below aim to show the relation between likelihood ratios and predictive values.
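
All of these measures can be derived from the same 2×2 table of test result against gold standard. The following sketch uses the standard definitions; the function name `diagnostic_metrics` and the counts are hypothetical, and it assumes that no denominator is zero.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard test metrics from a 2x2 table.

    tp: test positive, disease present    fp: test positive, disease absent
    fn: test negative, disease present    tn: test negative, disease absent
    Assumes every denominator below is non-zero.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        # Likelihood ratios depend only on sensitivity and specificity.
        "plr": sensitivity / (1 - specificity),
        "nlr": (1 - sensitivity) / specificity,
        # Predictive values also depend on how common the disease is
        # in the group that was tested.
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical study: 100 patients with the disease and 200 without it.
metrics = diagnostic_metrics(tp=90, fp=30, fn=10, tn=170)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity, specificity and the likelihood ratios describe the test itself, while PPV and NPV change if the same test is applied in a population with a different prevalence.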

| Positive predictive value of test (PPV) | Positive likelihood ratio (PLR) | Interpretation |
|---|---|---|
| >60% | >1.5 | The test supplies useful information. |
| >60% | <1.5 | Prior to testing it may be assumed that the patient probably has the disease. The test only increases knowledge marginally. |
| <60% | >1.5 | The test only provides information of limited clinical value. |
| <60% | <1.5 | The test is not useful in this situation. |

| Negative predictive value of test (NPV) | Negative likelihood ratio (NLR) | Interpretation |
|---|---|---|
| >90% | >0.67 | Prior to testing it may be assumed that the patient probably doesn’t have the disease. The test only increases knowledge marginally. |
| >90% | <0.67 | The test supplies useful information. |
| <90% | >0.67 | The test is not useful in this situation. |
| <90% | <0.67 | The test only provides information of limited clinical value. |

(The limits of 60%, 90%, 1.5 and 0.67 in the tables above are arbitrarily chosen to enhance understanding.)
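
The link between likelihood ratios and predictive values runs through the pre-test probability: the post-test odds equal the pre-test odds multiplied by the likelihood ratio. The sketch below applies that relation; the function name `post_test_probability` and the example prevalences are hypothetical and chosen only to show why the same likelihood ratio can land in different rows of the tables.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    if not 0 < pre_test_probability < 1:
        raise ValueError("pre-test probability must be strictly between 0 and 1")
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# The same positive likelihood ratio of 6 applied in two hypothetical settings.
plr = 6.0
ppv_rare = post_test_probability(0.05, plr)    # disease present in 5% before testing
ppv_common = post_test_probability(0.50, plr)  # disease present in 50% before testing
print(f"PPV at 5% prevalence:  {ppv_rare:.2f}")
print(f"PPV at 50% prevalence: {ppv_common:.2f}")

# With a negative likelihood ratio the result is the probability of disease
# after a negative test, so NPV is one minus that value.
npv_rare = 1 - post_test_probability(0.05, 0.12)
print(f"NPV at 5% prevalence:  {npv_rare:.3f}")
```

With a PLR of 6 the positive predictive value is roughly 24% when the disease is rare (5%) and roughly 86% when it is common (50%), which is why a test with a good likelihood ratio can still fall in the "limited clinical value" row.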