Purposes and limitations of IVD tests in the context of a pandemic
Summary
 Since the start of the SARS-CoV-2 pandemic (the virus causing Covid-19), RT-PCR, IgM, antigen tests, sensitivity, false positives and other in-vitro diagnostic jargon have been front and center in the news.
 However, the key performance metrics of in-vitro diagnostics remain misunderstood by the general public and by the mainstream media.
 More concerning, the limitations of the different categories of tests often seem to be overlooked when Public Health authorities procure tests for mass-testing purposes.
Commentary:
Without adequate understanding of the science behind testing, and of why false positives and false negatives happen, we risk driving the public to further mistrust, and even question the usefulness of, public health and testing.
Knowledge is key to curbing this pandemic!
1  Purposes of testing
In the context of infectious diseases, testing serves four different purposes, actioned at different stages of the infection progression, which require different test characteristics.



Screening
“Screening interventions are designed to identify conditions which could at some future point turn into disease, thus enabling earlier intervention and management in the hope to reduce mortality and suffering from a disease”^{1}.
Based on this definition, screening tests are typically carried out in a context of low prevalence, e.g. at airports.

Public Health surveillance
“Public Health surveillance describes the continuous, systematic collection, analysis and interpretation of health-related data needed for the planning, implementation, and evaluation of public health practice."^{2}
In the context of infectious diseases, the purpose of surveillance is to give early warning of outbreaks of contagion. This is done via sampling of an at-risk segment of the population, reporting and bioinformatics analysis.
Based on this definition, Public Health surveillance tests are also carried out in a context of low prevalence.

“Immunity Passport”
“Immunity certificates are a legal document granted by a testing authority following a serology test demonstrating that the bearer has antibodies making them immune to a disease.”
Based on this definition, tests for the purpose of delivering “Immunity Passports” may happen in variable prevalence situations, from low to high.
Note: in the context of the SARS-CoV-2 pandemic, the WHO stated that: “There is currently no evidence that people who have recovered from COVID-19 and have antibodies are protected from a second infection.”^{3}

Medical diagnosis of symptomatic patients
The purpose of medical diagnosis is to explain a patient’s symptoms and signs and/or confirm their underlying cause, e.g. triage of patients with respiratory symptoms in A&Es (ERs).
Based on this definition, diagnostic tests are typically carried out in a context of high prevalence.
2  Definitions of a test’s “accuracy”

Lab-based (intrinsic) statistics:
A test’s Analytical Performance is primarily defined by its Accuracy, which results from the combination of its Sensitivity, or PPA (Percent Positive Agreement), and its Specificity, or NPA (Percent Negative Agreement).

|                        | Condition Positive (per "Gold standard") | Condition Negative (per "Gold standard") |                             |
| Test outcome: Positive | True Positive                            | False Positive (Type I error)            | → Positive predictive value |
| Test outcome: Negative | False Negative (Type II error)           | True Negative                            | → Negative predictive value |
|                        | ↓ Sensitivity                            | ↓ Specificity                            |                             |
Table 1: Consolidation of False Positive, True Positive, False Negative, True Negative, PPV and NPV (Wikipedia)
See Appendix 1 for detailed definitions and interpretations or, if you prefer to hear a short tutorial, here is a nice one from Dr. David Felson, a Professor of Medicine at the Boston University School of Medicine.


False Negative: a false negative is an error in which a test result improperly indicates no presence of a condition (the result is negative), when in reality it is present.

False Positive: a false positive is an error in data reporting in which a test result improperly indicates presence of a condition, such as a disease (the result is positive), when in reality it is not present.

Sensitivity (PPA) is the true positive rate, expressed as a percentage:
Out of 100 truly positive patients, how many are detected as positive by the test? The higher the sensitivity, the fewer the false negatives.

Specificity (NPA) is the true negative rate, expressed as a percentage:
Out of 100 truly negative patients, how many are reported as negative by the test?
The higher the specificity, the fewer the false positives. An example would be a test that detects
an antibody response to a different but related coronavirus: the individual would be told that they have antibodies to Covid-19 when in fact the result reflects infection with a different but similar coronavirus.

Accuracy = (True Positives + True Negatives)/ (All test results)

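These lab-based metrics follow directly from the four confusion-matrix counts. A minimal Python sketch (the validation-panel counts are hypothetical, chosen only for illustration):

```python
def confusion_metrics(tp, fp, fn, tn):
    """Lab-based (intrinsic) statistics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                # PPA: share of truly positive samples detected
    specificity = tn / (tn + fp)                # NPA: share of truly negative samples cleared
    accuracy = (tp + tn) / (tp + fp + fn + tn)  # overall agreement with the gold standard
    return sensitivity, specificity, accuracy

# Hypothetical validation panel: 90 of 95 positives detected, 190 of 200 negatives cleared
sens, spec, acc = confusion_metrics(tp=90, fp=10, fn=5, tn=190)
print(f"Sensitivity {sens:.1%}, Specificity {spec:.1%}, Accuracy {acc:.1%}")
```

Note how the single Accuracy figure blends the two error types together, which is one reason the two rates are reported separately.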
Commentary:
 False positive results on a COVID-19 test can cause undue stress on individuals as they try to navigate their diagnosis, take days off work and isolate from family
 False negative test results are even more dangerous, as people may think it is safe and appropriate for them to engage in social activities, hence contributing to the spread of the virus
Sensitivity and Specificity are therefore Lab-based metrics that presuppose knowing whether the patient/sample is positive or negative. These statistics, although often reported by the media (see Appendix 2 for a review of 3 mainstream media articles), cannot really tell us how good a test is at predicting the disease state of a person: Sensitivity and Specificity describe what happens when we already know what the right answer should have been.
To drive the point home, consider a population in which no one has the disease: even a test that is terrible at detecting the disease will appear to work great, because it will “correctly” identify most people as not having it.
Commentary:
 It is erroneous and misleading to use the above Lab-based intrinsic statistics to estimate the “accuracy” of a test as a diagnostic tool, i.e. to determine whether or not a person carries a disease.
This common mistake is illustrated by the following simple numerical example:
Disease prevalence: 1/1,000
Accuracy: 95% (Sensitivity: 100%, Specificity: 95%)
Question:
A person tests positive. What is the probability that this person has the disease?
a) 100%
b) 95%
c) 5%
d) 2%
(answer in Appendix 3)

“Real-world” Predictive Statistics:
In real life, by definition, the True Positive or True Negative status of the patient is unknown, so the above Lab-based statistics of Sensitivity and Specificity are, on their own, of no use.

The probability that a Positive result actually corresponds to a Positive patient is measured by the Positive Predictive Value (PPV).
Conversely,

The probability that a Negative result actually corresponds to a Negative patient is measured by the Negative Predictive Value (NPV).

Positive Predictive Value (PPV):
The Positive Predictive Value is the probability that a subject/sample that returns a positive result is really positive
Or, in other words, the PPV is the proportion of positive results of the test that are true positive results.

Negative Predictive Value (NPV):
The Negative Predictive Value is the probability that a subject/sample that returns a negative result is really negative
Or, in other words, the NPV is the proportion of negative results of the test that are true negative results.
The Role of Prevalence:
A critical concept is that the PPV and NPV depend on the prevalence of the disease, i.e. the proportion of people in a population who have the condition at any one time. The rarer the condition, the more likely a negative test result is truly negative, and the less likely a positive test result is truly positive: higher prevalence increases the PPV but decreases the NPV.
Appendix 1 explains why the PPV provides a meaningful assessment of a test’s ability to estimate the probability that a positively-tested person has the targeted disease.
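This prevalence dependence can be made concrete with a short sweep using the PPV formula from Appendix 1 (the 95%/95% test characteristics here are hypothetical, chosen only for illustration):

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from the prevalence-weighted formula (Appendix 1)."""
    true_pos = sensitivity * prevalence              # expected share of true positives
    false_pos = (1 - specificity) * (1 - prevalence)  # expected share of false positives
    return true_pos / (true_pos + false_pos)

# Hypothetical test (95% sensitive, 95% specific) applied at rising prevalence
for prev in (0.001, 0.01, 0.1, 0.5):
    print(f"prevalence {prev:>5.1%} -> PPV {ppv(0.95, 0.95, prev):.1%}")
```

At 0.1% prevalence, fewer than one in fifty positive results is a true positive; at 50% prevalence, the very same test reaches a 95% PPV.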
The relevance of NPV and PPV in the above-described diagnostic programs is summarised in Table 2.
| Diagnostic program | Purpose of the test | Typical prevalence | Important Predictive Value | Important analytical characteristic |
| Screening (e.g. for infection) | Identify, on the spot, as many positive people (TP) as possible, while not loading the health care system with healthy people (FP) | Low (0%-10%) | PPV | Specificity |
| Public Health Surveillance (e.g. for cancer) | Identify as many positive people (TP) as possible; (FP) are secondary concerns as positive people can be retested with an orthogonal test^{4} | Low (0%-20%) | NPV | Sensitivity |
| Immunity Passport (e.g. post-vaccination campaign) | Issue the passport to the persons who are effectively immunised (TP); (FN) are secondary concerns (low Health Economics cost) | Medium to High (30%-70%) | PPV | Specificity |
| Medical diagnosis and management of symptomatic patients | Point the Practitioner toward the relevant disease (TP); (FN) are secondary concerns as the Practitioner will act on the symptoms; (FP) are a concern as an unjustified intervention might be decided | High (50%-100%) | PPV | Specificity |
Table 2: Key predictive value for various diagnostic programs
Commentary:
As the true prevalence is not initially known, and can be dynamic, diagnostic tests always play catch-up with reality and should be deployed in phases (different tests at different prevalence levels)
3  Clinical relevance:
Appendix 1: PPV & NPV definitions and formulas

Negative Predictive Value (NPV):
The negative predictive value is the proportion of negative results of the test that are true negative results.
NPV = (number of true negatives) / (number of true negatives + number of false negatives)

NPV = [specificity x (1 - prevalence)] / [(1 - sensitivity) x prevalence + specificity x (1 - prevalence)]

Positive Predictive Value (PPV):
The positive predictive value is the proportion of positive results of the test that are true positive results.
PPV = (number of true positives) / (number of true positives + number of false positives)

PPV = [sensitivity x prevalence] / [sensitivity x prevalence + (1 - specificity) x (1 - prevalence)]
As expressed in the formulas, neither the NPV nor the PPV is an intrinsic characteristic of the test’s analytical performance; both also depend on the prevalence of the disease.
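The two formulas can be checked with a few lines of Python (a minimal sketch; the function name is ours):

```python
def predictive_values(sensitivity, specificity, prevalence):
    """PPV and NPV from the two prevalence-weighted formulas above."""
    ppv = (sensitivity * prevalence) / (
        sensitivity * prevalence + (1 - specificity) * (1 - prevalence))
    npv = (specificity * (1 - prevalence)) / (
        (1 - sensitivity) * prevalence + specificity * (1 - prevalence))
    return ppv, npv

# A 100%-sensitive test never produces a false negative, so its NPV is always 100%,
# yet at 0.1% prevalence its PPV collapses to about 2% (cf. the quiz in Appendix 3)
ppv, npv = predictive_values(sensitivity=1.0, specificity=0.95, prevalence=0.001)
print(f"PPV {ppv:.1%}, NPV {npv:.1%}")
```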
 the positive post-test probability (the probability of having the target condition if the test comes back positive) is numerically equal to the PPV;
 the negative post-test probability (the probability of not having the target condition if the test comes back negative) is numerically complementary to the negative predictive value ([negative post-test probability] = 1 - [NPV]).
The respective curves of negative post-test probability and positive post-test probability (PPV) as a function of the prevalence are shown in Graph 2.
Graph #2: Positive and negative post-test probability as a function of Prevalence (Pre-test probability)
Interpretation:
For a given expected prevalence of the disease (pre-test probability on the horizontal axis), the vertical intersection with the lower elliptic curve gives, horizontally, the negative post-test probability.
Respectively, the vertical intersection with the upper elliptic curve gives, horizontally, the PPV.
It results from the above curves that the PPV provides a more meaningful assessment of a test’s ability to estimate the probability that a person has the targeted disease.
Orthogonal tests:
Two complementary tests that react to two different biomarkers of the same disease. Consecutively using orthogonal tests results in a significant improvement of the specificity and therefore of the PPV.
https://en.wikipedia.org/wiki/False_positives_and_false_negatives#true_positive
https://en.wikipedia.org/wiki/Pre-_and_post-test_probability
Appendix 2: Review of media article by www.healthnewsreview.org
Appendix 3: Quiz
Numerical example:
Disease prevalence: 1/1,000
Accuracy: 95%
Sensitivity: 100%
Specificity: 95%
Question:
A person tests positive. What is the probability that this person has the disease?
(a) 100%
(b) 95%
(c) 5%
(d) 2%
Answer: 2%
Let’s work out the implied counts, based on a population of 1,000 persons:
| Population = 1,000 | Disease + | Disease - |
| Test +             | TP: 1     | FP: 50    |
| Test -             | FN: 0     | TN: 949   |
NPV = (number of true negatives) / (number of true negatives + number of false negatives) = 949 / (949 + 0) = 100%

PPV = (number of true positives) / (number of true positives + number of false positives) = 1 / (1 + 50) = 1.96% ~ 2%
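The contingency table above can also be reproduced programmatically; a short Python check of the arithmetic:

```python
# Reproduce the quiz's 1,000-person contingency table
population = 1_000
prevalence = 1 / 1_000
sensitivity, specificity = 1.00, 0.95

diseased = round(population * prevalence)  # 1 person truly has the disease
healthy = population - diseased            # 999 people do not
tp = round(diseased * sensitivity)         # TP: 1 (100% sensitivity, no misses)
fn = diseased - tp                         # FN: 0
tn = round(healthy * specificity)          # TN: 949 (949.05 rounded)
fp = healthy - tn                          # FP: 50

ppv = tp / (tp + fp)
print(f"PPV = {tp}/{tp + fp} = {ppv:.2%}")  # -> PPV = 1/51 = 1.96%
```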