Agreement analysis investigates to what extent different measurements agree with each other. Typical situations are:
 Comparing two different measurements, as when you evaluate a diagnostic test against a gold standard.
 Estimating inter-rater agreement (whether different users arrive at the same estimate).
 Estimating test-retest agreement (whether the same user arrives at the same estimate when the testing is repeated).
The most suitable statistical approach depends on which level of measurement (or scale of measurement) is most appropriate for the investigated variable:
 Agreement between variables measured with the nominal scale
   The outcome is dichotomous (only two possible outcomes)
     Kappa coefficient
     Sensitivity / Specificity
     Likelihood ratio
     Predictive value of tests
     Etiologic predictive value (when there is no gold standard)
   The outcome can have more than two possible outcomes
     Kappa coefficient
 Agreement between variables measured with the ordinal scale
   Kappa coefficient
   Weighted Kappa coefficient
 Agreement between variables measured with an interval scale or a ratio scale
   Limits of agreement and/or Bland-Altman plot
   Intraclass correlation (ICC)
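
As a rough illustration of two of the methods listed above, the sketch below computes an unweighted Cohen's kappa from a square agreement table and 95% limits of agreement from paired continuous measurements. The function names and all numbers are invented for this example. The limits of agreement assume that the differences are approximately normally distributed.

```python
import statistics


def cohen_kappa(table):
    """Unweighted Cohen's kappa for a square table (rows: rater A, columns: rater B)."""
    k = len(table)
    n = sum(sum(row) for row in table)
    # Observed agreement: share of cases on the diagonal.
    observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n ** 2
    return (observed - expected) / (1 - expected)


def limits_of_agreement(a, b):
    """95% limits of agreement: mean difference +/- 1.96 SD of the differences."""
    diffs = [x - y for x, y in zip(a, b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)
    return mean_diff - 1.96 * sd_diff, mean_diff + 1.96 * sd_diff


# Made-up example: two raters classify 100 patients as positive/negative.
ratings = [[40, 10],
           [5, 45]]
kappa = cohen_kappa(ratings)
print(round(kappa, 2))  # 0.7

# Made-up example: the same four subjects measured with two methods.
method_a = [10, 12, 14, 16]
method_b = [9, 13, 13, 17]
lower, upper = limits_of_agreement(method_a, method_b)
print(round(lower, 2), round(upper, 2))
```

With only four pairs the limits are of course very imprecise. The data are kept small so the arithmetic can be followed by hand.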

Always require a 95% confidence interval for estimates of sensitivity, specificity, likelihood ratios and predictive values! Point estimates without a confidence interval are useless.
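
One possible way to obtain such an interval for a proportion (for example a sensitivity or a specificity) is the Wilson score interval. The sketch below is an illustration only: the function name and the counts are invented, and other interval methods (such as exact Clopper-Pearson intervals) are equally legitimate choices.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion; z=1.96 gives roughly 95% coverage."""
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


# Made-up example: the test is positive in 45 of 50 patients who have the disease,
# so the point estimate of sensitivity is 0.90.
low, high = wilson_interval(45, 50)
print(round(low, 2), round(high, 2))  # roughly 0.79 and 0.96
```

The point estimate of 90% looks convincing in isolation, but with only 50 diseased patients the data are compatible with a sensitivity anywhere from just under 80% to about 96%.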
Gold standard
(This section is under construction)
Estimating the clinical value of a test
Sensitivity and specificity inform us about the health of the diagnostic test being evaluated. This is great if you are a manufacturer of a diagnostic test but of limited value if you are a doctor. Likelihood ratios tell us how much information a test adds, and predictive values inform us about the health of our patient (they give the probability that the individual has, or does not have, what we are looking for). The tables below aim to show the relation between likelihood ratios and predictive values.
Positive predictive value of test (PPV) | Positive likelihood ratio (PLR) | Interpretation
>60% | >1.5 | The test supplies useful information.
>60% | <1.5 | Prior to testing it may be assumed that the patient probably has the disease. The test only increases knowledge marginally.
<60% | >1.5 | The test only provides information of limited clinical value.
<60% | <1.5 | The test is not useful in this situation.
Negative predictive value of test (NPV) | Negative likelihood ratio (NLR) | Interpretation
>90% | >0.67 | Prior to testing it may be assumed that the patient probably doesn’t have the disease. The test only increases knowledge marginally.
>90% | <0.67 | The test supplies useful information.
<90% | >0.67 | The test is not useful in this situation.
<90% | <0.67 | The test only provides information of limited clinical value.
(The limits of 60%, 90%, 1.5 and 0.67 in the tables above are arbitrarily chosen to enhance understanding.)
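
The relation between likelihood ratios and predictive values can be made concrete with a small calculation. The sketch below derives the measures from a 2x2 table and then applies the odds form of Bayes' theorem (post-test odds = pre-test odds x likelihood ratio). The function names and all counts are invented for illustration, and the predictive values are only meaningful if the prevalence in the table resembles the prevalence among the patients being tested.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard measures from a 2x2 table of test result against gold standard."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "PLR": sensitivity / (1 - specificity),
        "NLR": (1 - sensitivity) / specificity,
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
    }


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert probability to odds, multiply by the likelihood ratio, convert back."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Made-up 2x2 table: 100 diseased and 200 healthy individuals.
m = diagnostic_metrics(tp=80, fp=30, fn=20, tn=170)
prevalence = (80 + 20) / 300

# With the prevalence as pre-test probability, the PLR reproduces the PPV
# and the NLR reproduces 1 - NPV.
ppv_from_lr = post_test_probability(prevalence, m["PLR"])
npv_from_lr = 1 - post_test_probability(prevalence, m["NLR"])
print(round(m["PLR"], 2), round(m["PPV"], 3), round(ppv_from_lr, 3))

# A weak test applied where the disease is already likely: pre-test probability 60%,
# PLR 1.2. The post-test probability exceeds 60% although the test adds very little.
weak = post_test_probability(0.60, 1.2)
print(round(weak, 3))
```

The last calculation corresponds to the second row of the PPV table: a high predictive value combined with a low likelihood ratio mostly reflects what was known before testing.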