# Agreements

Is your gold standard too good to be true?

Agreement analysis investigates to what extent different measurements agree with each other. Typical situations are:

• Comparing two different measurements, as when you evaluate a diagnostic test against a gold standard.
• Estimating inter-rater agreement (whether different users arrive at the same estimate).
• Estimating test-retest agreement (whether the same user arrives at the same estimate when the testing is repeated).

The most suitable statistical approach depends on which level of measurement (or scale of measurement) is most appropriate for the investigated variable.

Always require a 95% confidence interval for estimates of sensitivity, specificity, likelihood ratios and predictive values! Point estimates without a confidence interval are useless.
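As an illustration of what such an interval looks like, the sketch below computes sensitivity and specificity with 95% confidence intervals from a 2x2 table. The Wilson score interval is used here as one common choice; it is an assumption of this example, not a method prescribed by the text, and the counts `tp`, `fn`, `fp` and `tn` are made-up values.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half_width, centre + half_width


# Hypothetical counts from comparing a test against a gold standard
tp, fn = 45, 5    # diseased individuals: test positive / test negative
fp, tn = 10, 90   # healthy individuals: test positive / test negative

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
sens_ci = wilson_interval(tp, tp + fn)
spec_ci = wilson_interval(tn, tn + fp)

print(f"Sensitivity {sensitivity:.2f} (95% CI {sens_ci[0]:.2f} to {sens_ci[1]:.2f})")
print(f"Specificity {specificity:.2f} (95% CI {spec_ci[0]:.2f} to {spec_ci[1]:.2f})")
```

With only 50 diseased individuals the interval around the sensitivity is wide, which is exactly the information a bare point estimate hides.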

# Gold standard

(This section is under construction)

# Estimating the clinical value of a test

Sensitivity and specificity inform us about the health of the diagnostic test being evaluated. This is great if you are a manufacturer of a diagnostic test but of limited value if you are a doctor. Likelihood ratios tell us how much information a test adds, and predictive values inform us about the health of our patient (they give the probability that the individual has, or does not have, what we are looking for). The tables below aim to show the relation between likelihood ratios and predictive values.

| Positive predictive value of test (PPV) | Positive likelihood ratio (PLR) | Interpretation |
|---|---|---|
| >60% | >1.5 | The test supplies useful information. |
| >60% | <1.5 | Prior to testing it may be assumed that the patient probably has the disease. The test only increases knowledge marginally. |
| <60% | >1.5 | The test only provides information of limited clinical value. |
| <60% | <1.5 | The test is not useful in this situation. |

| Negative predictive value of test (NPV) | Negative likelihood ratio (NLR) | Interpretation |
|---|---|---|
| >90% | >0.67 | Prior to testing it may be assumed that the patient probably doesn't have the disease. The test only increases knowledge marginally. |
| >90% | <0.67 | The test supplies useful information. |
| <90% | >0.67 | The test is not useful in this situation. |
| <90% | <0.67 | The test only provides information of limited clinical value. |

(The limits of 60%, 90%, 1.5 and 0.67 in the tables above are arbitrarily chosen to enhance understanding.)
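The relation between likelihood ratios and predictive values can be sketched numerically. The example below uses the standard definitions PLR = sensitivity / (1 - specificity) and NLR = (1 - sensitivity) / specificity, and converts the pre-test probability (prevalence) to post-test probabilities through odds. The function names and the input values are illustrative assumptions, and the inputs are assumed to lie strictly between 0 and 1.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (PLR, NLR); assumes 0 < sensitivity < 1 and 0 < specificity < 1."""
    plr = sensitivity / (1 - specificity)
    nlr = (1 - sensitivity) / specificity
    return plr, nlr


def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) via pre-test odds multiplied by the likelihood ratios."""
    plr, nlr = likelihood_ratios(sensitivity, specificity)
    pre_test_odds = prevalence / (1 - prevalence)
    odds_after_positive = pre_test_odds * plr
    odds_after_negative = pre_test_odds * nlr
    ppv = odds_after_positive / (1 + odds_after_positive)
    # NPV is the probability of NOT having the disease after a negative test
    npv = 1 - odds_after_negative / (1 + odds_after_negative)
    return ppv, npv


# Hypothetical test with 90% sensitivity and 90% specificity
plr, nlr = likelihood_ratios(0.90, 0.90)
ppv_low, npv_low = predictive_values(0.90, 0.90, prevalence=0.10)
ppv_high, npv_high = predictive_values(0.90, 0.90, prevalence=0.50)

print(f"PLR {plr:.1f}, NLR {nlr:.2f}")
print(f"Prevalence 10%: PPV {ppv_low:.0%}, NPV {npv_low:.0%}")
print(f"Prevalence 50%: PPV {ppv_high:.0%}, NPV {npv_high:.0%}")
```

The likelihood ratios are the same in both scenarios, yet the positive predictive value differs considerably with prevalence. This corresponds to the row where the PLR is above the limit but the PPV is not: the test adds information, but a positive result still leaves the diagnosis uncertain.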