Transcript of: Clinical Epidemiology and the Path to Better Medicine

Presented by Dr. Miller on 29 November 2020

Now, today's webinar will be presented by Dr. Dave Miller. Dr. Miller is a veterinarian with a PhD in
clinical science. His PhD studies emphasized the epidemiology of agent transmission, diagnostic testing,
and one health. He has advanced clinical specializations in animal welfare and zoological medicine as
well as experience in a variety of laboratory research, regulatory, exhibition and other clinical settings.
In addition to his broad research and publication background, he has served as a researcher and
consultant. He has served on and led a number of committees responsible for policy and guidelines that
are used nationally and internationally. Thank you so much for being here today, Dr. Miller, and I'll turn
the presentation over to you now.
Okay, can you hear me?
Okay, great. Well, thank you, for everybody who's joining us. Hope everybody's doing well. And that
there's not anything exciting in your life like fires or COVID or anything like that. So, what we'll be doing
today is talking about clinical epidemiology, and why we should care about it, what the outcomes are.
So, we'll be talking about two main areas: diagnostic tests, and how this applies in a clinical medicine setting.
Now, as I was preparing the talk, I was all excited because for those of you that are familiar with the
Promed listserv, that covers all the epidemics and diseases going on around the world. I thought there
were a lot of cool things related to COVID-19. But the more I got into it, the more I realized we really
didn't have the time for wallowing a lot in this. But some of the themes that come up basically fall into
the category of where was COVID-19 occurring?
How many people are infected? What's the accuracy of the test? What were the treatments? Things of that
nature. And those are all things that are covered under clinical epidemiology. So, why are we talking
about diagnostic tests? The reason why would be if you're developing a test, you might want to use it
for clinical diagnosis. You might want to use it for screening people who might have a given
condition. It might be used for population surveillance, epidemiologic studies, or risk analysis such as, is there a risk
of COVID being introduced, or something of that nature. So, those are all things that you want to keep in
mind when you're developing a test. So, I'd like to go back to the beginning, to make sure that we're all
on the same wavelength, and point out that there are different types of diagnostic tests.
So, even a physical exam, whether you're listening to the heart or listening to lungs, or feeling whether
there's warmth or swelling, is a type of diagnostic test. Tests that we're a little more familiar with are
those that look at a chemical or cellular level for any abnormality, say in the blood or the
cerebrospinal fluid or things of that nature. There can be physiological tests such as an
electrocardiogram or lung function sorts of tests.
I have a particular interest in infectious diseases. There, two general categories are direct tests that
identify the agent and indirect tests that identify the antibody response to the agent. There are genetic
tests that are being developed and used for various conditions. And
imaging is another type of test, meaning an X-ray or ultrasound or that sort of thing.
So, an important thing that we want to talk about and consider is how accurate that diagnostic test is.
And there are a couple of clinical epidemiology concepts that I'd like to make sure we're all familiar with.
One of them is sensitivity. That's the proportion of affected or infected individuals that are correctly
classified as affected or infected. Then, there's specificity: of those that are not affected or
infected, the proportion that are correctly classified as not affected.
So, this two by two table helps a little bit in terms of, hopefully, clarifying that concept. So, if we have an
individual, they can be either positive or negative for a given condition. And then, we've got a test result,
This transcript was exported on Nov 29, 2020 – view latest version here.
18 – Miller – Clinical Epidemiology and the Path… (Completed 11/29/20)
which is either positive or negative for a given condition. Recognizing that there is potential for false
positives, false negatives. But in terms of sensitivity, a perfect test means that all the individuals that are
infected also test positive, whereas for specificity, all the individuals that are not infected also test negative.
So obviously, we'd like 100% in both of these categories. Some tests may come close to that, but they
usually fall short of 100%. Now, these two by two tables can be a lot of fun for all kinds of
mind-numbing calculations. But I've been informed that mind-numbing is not allowed for these sorts of
webinars. So, we'll just leave you slightly excited about what could be done with these sorts of things
and move on a little bit here.
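As a rough illustration of the two by two table described above (this sketch is an editorial addition, not part of the presentation), sensitivity and specificity can be computed directly from the four cells. The class name TwoByTwo and all of the counts are hypothetical, chosen only to make the arithmetic easy to follow.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts from comparing a test result against true status (hypothetical helper)."""
    true_pos: int   # infected, test positive
    false_neg: int  # infected, test negative
    false_pos: int  # not infected, test positive
    true_neg: int   # not infected, test negative

    def sensitivity(self) -> float:
        # Proportion of truly infected individuals the test calls positive.
        return self.true_pos / (self.true_pos + self.false_neg)

    def specificity(self) -> float:
        # Proportion of truly uninfected individuals the test calls negative.
        return self.true_neg / (self.true_neg + self.false_pos)


# Hypothetical counts, invented for illustration only.
table = TwoByTwo(true_pos=90, false_neg=10, false_pos=20, true_neg=180)
print(f"sensitivity = {table.sensitivity():.2f}")  # 0.90
print(f"specificity = {table.specificity():.2f}")  # 0.90
```

Note that sensitivity only uses the infected row and specificity only uses the uninfected row, which is why neither depends on how common the condition is in the sample.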
So, one thing I want to make sure of, because many of you are in the business of developing diagnostic
tests, is that we're clear on the difference between clinical epidemiology concepts and the analytical
terminology that would be used in the lab. So in the lab, when you're talking about sensitivity, what
you're usually talking about is the smallest detectable unit of change, whether it be a biochemical
parameter or physiological parameter, things of that nature.
So that differs from what we were just talking about, which is how accurately the test classifies individuals. The
other term is specificity, which is a little bit more similar to the specificity that we just talked about,
but it's a little bit more concerned with misclassifications for reasons such as cross-reactivity, or other
things that can happen within the test. So, we run a test. And again, in terms of general categories,
you're going to have two different types of tests, or types of outcomes.
One is dichotomous, which means either test positive or test
negative, things of that nature. Or we can have quantitative tests, which is something that a lot of labs
are going to run. A dichotomous test might be, say, something used in the clinic that changes color, which some of you
may be developing. But let's talk a little bit more about the quantitative test. Now, the concern that
we've got with a quantitative test, meaning you've got a series of numbers, is that the cutoffs are somewhat
arbitrary. And that's something that burst my bubble the first time I got into one of these studies. When I asked
one of my mentors how you decide what the cutoff was, he basically said, well, he established it
based on what he thought was right. Well, that wasn't entirely satisfying to me. And since I've learned
more, basically, what it boils down to is there always is going to be an arbitrary component. But what's
important is to understand and be transparent about what it is that you value or prioritize, and what the
tradeoffs are that you're willing to accept.
So, let's explore this in a little bit more detail here. So here, we have our standard bell-shaped
curve that everybody's probably seen before. And so, let's say we are running a biochemical test
for sodium or potassium or liver enzymes or anything like that. This is our standard bell-shaped curve.
We're going to call everybody in this population normal.
But because we are usually saying a 95% reference range is what we're looking at, by definition, 2.5% fall
in the lower tail of the curve. And I'm sorry, I have trouble keeping the cursor up here, 2.5% in the lower tail, and
2.5% in the upper tail. By definition, if we have this normal population, 5% total are going to be
classified as abnormal right off the bat. So, what does that mean? We have 100 normal patients.
And let's say each one of those 100 normal patients receives a 20-test panel, like a lot of the
biochemical panels that doctors may run. And each test has that 95% reference range. The question
that I'm throwing out to you is how many of these 100 patients, on average, are going to be classified as normal for
every single test. So, your options are, and this is going to be a poll, 100%, 95%, or 36%. So,
it's time to go ahead and vote.
And your lives don't depend on this, so make a choice. And we can then move on, and talk about a little
bit more. Okay. So, looks like everybody's voting for 95% here. And nobody voted for the others. 95%
makes sense. But if you do the math, let's say you look at the first test for a given person: 95% of the people
are correctly classified. Then, of that 95%, another 95% for the next test, and so on. So, the
way to deal with this is a power calculation.
If you do 95% to the 20th power, then it works out to about 36% of those individuals actually are going
to be classified as normal within that reference range for every single analyte. Now in the real world, an
experienced clinician is going to say, "Well, gee," they know that. And particularly for values that are just
outside the reference range, they're probably going to ignore it, and look at some of the other factors
that are involved with that.
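The power calculation behind the poll answer can be checked in a couple of lines. This is a minimal sketch, and it assumes the 20 tests behave independently of one another, which the talk's multiplication implies but which real analytes on a panel may not satisfy. The function name is hypothetical.

```python
def prob_all_within_range(n_tests: int, coverage: float = 0.95) -> float:
    """Chance that a healthy patient falls inside the reference range on
    every one of n_tests tests, assuming the tests are independent."""
    return coverage ** n_tests


for n in (1, 5, 10, 20):
    print(f"{n:>2} tests: {prob_all_within_range(n):.1%} normal on every test")
```

For a 20-test panel this gives roughly 36%, matching the figure quoted in the talk: about 64 of the 100 healthy patients would be expected to have at least one value flagged.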
So, as an example, if it's just a single calcium that's slightly out of whack, they're going to probably
ignore it. Whereas, if it were a bunch of liver values that were all just barely out of normal, they might
want to look a little bit closer and think about it a little bit more. And I'm not able to progress this.
Samuel, I'm hitting my arrow here, let's try this.
You click on that.
There we go, I'm sorry. And then, what I was just talking about works for continuous values. But if you
had a histogram of discrete values, you'd still have a similar challenge in terms of where are you going to
call this cutoff. So, that's basically a real-world challenge. And something to think about when you're
developing your lab assays. So, the reality is that there's always going to be a tradeoff between test
sensitivity and specificity.
And that's going to be basically a function of that cutoff. And so, there are a number of different ways of
dealing with this. One that I'm going to raise is receiver operating characteristic (ROC) curves, which is actually interesting.
It's borrowed from military operators back in World War II. And basically, what we're looking at with this
graph is, let's say, the false positives here graphed against the true positives.
And with this straight line here, what that basically is is 50% for each of them, which is basically
equivalent to tossing a coin, which means we have a test that really isn't of much value. Now, a better
test is if we're a little bit closer to being in this upper left-hand corner, where all our true positives are
right in line, and the best is right up here. So, even though we still have to make
these tradeoffs, the value here is that at least we know we can do, there we go, a calculation.
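To make the ROC idea concrete, here is a small sketch (an editorial addition) that sweeps a cutoff across a quantitative test and records the false positive rate and true positive rate at each step, then computes the area under the curve with the trapezoid rule. The assay readings are invented, the function names are hypothetical, and the sketch assumes that higher readings indicate infection.

```python
def roc_points(scores_infected, scores_healthy):
    """Return (false positive rate, true positive rate) pairs, one per cutoff.

    A result is called positive when its score is >= the cutoff.
    """
    cutoffs = sorted(set(scores_infected) | set(scores_healthy), reverse=True)
    points = [(0.0, 0.0)]  # cutoff above every score: nothing is called positive
    for cutoff in cutoffs:
        tpr = sum(s >= cutoff for s in scores_infected) / len(scores_infected)
        fpr = sum(s >= cutoff for s in scores_healthy) / len(scores_healthy)
        points.append((fpr, tpr))
    return points


def area_under_curve(points):
    """Trapezoidal area under the ROC points (0.5 is a coin toss, 1.0 is perfect)."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical assay readings, invented for illustration only.
infected = [8.1, 7.4, 6.9, 6.2, 5.5, 4.8]
healthy = [5.9, 5.1, 4.4, 3.9, 3.2, 2.7]

points = roc_points(infected, healthy)
print(f"AUC = {area_under_curve(points):.3f}")
for fpr, tpr in points:
    print(f"false positive rate {fpr:.2f} -> true positive rate {tpr:.2f}")
```

Each printed row corresponds to one candidate cutoff, so reading down the list shows the tradeoff directly: lowering the cutoff picks up more true positives at the price of more false positives.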
So, let's say if we want to maximize the number of individuals that are correctly classified, we can do
area under the curve calculations, which would be right around here. But for other reasons, we might
want to pick a cutoff here or here. And the value of that is that it helps us decide; at least we
know what the impact of our decisions is. Now, in the real world, things are not as clean. Sensitivity is
affected by the stage of infection or the immune status.
So, as an example, let's say somebody with COVID or the common flu were to cough on me 10 minutes
ago, and we take the test now. The test is probably going to come up negative because I'm not yet infected.
So, the sensitivity for a true positive is not going to be there until I'm further along in my
infection. Also, immune status. If somebody's immunosuppressed, you may not see a response for an
indirect test such as an antibody test.
Then, if you're looking at specificity, we talked a little bit about cross-reacting agents under the
analytical concept. And that crosses over to clinical epidemiology. Vaccination can also be a challenge.
So, one of the questions about COVID has been, if we develop a vaccine, can we tell who's been truly
infected from who's been vaccinated? And that makes an impact in terms of monitoring how the population
is dealing with the infection.
But now, the good news about COVID is it appears that we can distinguish, but that isn't always the case
for some infectious agents. So, as part of this picture, what we want to do is also validate
these tests to see how accurate they are. What we always would like to have is a gold standard to
compare against. It doesn't always work out that way. There are some nice mathematical techniques for
estimating test accuracy in the absence of a gold standard.
But just like with some of the things we've talked about, we don't have enough time right now. Now,
one of the things that's important when you're validating that test is what population are you sampling?
And how are you sampling? And the sample size, which obviously, I'm sure most of you think about
when you're developing a test. Also, an important part of the picture is repeatability and reproducibility.
If you were to repeat the test the next day or the next month, or if you compare test results
between different labs, are they going to be pretty comparable results? Or do we have a problem in
terms of consistency? Now, again, because this is the real world, things can get even more complicated.
So, what we were talking about was test sensitivity and specificity. That's a test characteristic that isn't
necessarily static; it can vary by population.
And so, the concept that also comes into play is predictive value. First, the predictive value of a
positive: when you're testing an individual and it comes up positive, how confident can we be that
that truly is positive? And then, there's the predictive value of a negative: how many people, if they test
negative, are going to be truly negative? And population infection rate is involved with this as well.
So, that may not entirely make sense yet.
The term here is population prevalence, which means, at a given state and time, how many individuals
are truly infected. So, the way this works is, let's say we were to graph disease
prevalence against the predictive value of that positive or negative test. So, if we think about it, if the
disease prevalence is zero or close to it, and you get a positive, then the positive predictive value is
going to be basically zero.
Because the odds that it is correctly classifying somebody as a positive are pretty low. On the flip side, if
everybody's infected in a population, and you get a positive test result, yeah, you're pretty much sure
that you are right on track. So, we don't want a straight line for this graph. Just like with the graph that
we showed a little while ago for the ROC, the further this central area is up to the top, the better.
Now, the same thing is relevant to the negative predictive value: if nobody's infected, and
everybody gets a negative test value, yeah, you're pretty confident in that value. And then, vice versa if
disease prevalence is 100%: you're not going to expect many to be correctly classified if the test
says negative. So again, we're going to want the curve for this to be up closer to the corner here if we
have a better performing test.
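The relationship between prevalence and predictive value described above follows from Bayes' rule, and a short sketch may help. The test characteristics here (95% sensitivity, 95% specificity) and the prevalence values are assumptions picked for illustration, not figures from the talk, and the function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (positive predictive value, negative predictive value)."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)  # of all positives, how many are real
    npv = true_neg / (true_neg + false_neg)  # of all negatives, how many are real
    return ppv, npv


# Hypothetical test: 95% sensitive and 95% specific, applied at several prevalences.
for prevalence in (0.001, 0.01, 0.1, 0.5, 0.9):
    ppv, npv = predictive_values(0.95, 0.95, prevalence)
    print(f"prevalence {prevalence:>5.1%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

With these assumed numbers, a positive result at 0.1% prevalence is right less than 2% of the time, even though the same test gives a 95% positive predictive value at 50% prevalence. The sensitivity and specificity never changed; only the population did.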
So, here's another poll for everybody. Given the graph that I just showed, is it practical to have an in-
house office test for both influenza and anthrax? And your options are yes, no, or for those of you that
don't want to commit, we've got a third option too, and you don't necessarily have to commit. And so,
we will see how we're doing on this. Okay, so we have a few people that want to make a commitment.
And many of you say no, which is the correct answer, because the reality is that anthrax is an
uncommon disease in most settings. So, if we get a positive, it's going to probably be a false positive.
Given what a serious disease it is, there's going to be a lot of excitement, a lot of concern. So, routinely
testing for anthrax is probably not a great idea. On the other hand, testing for influenza, it's fairly
common and may have some value for us.
So, let's make things even more complicated. Not to blow your minds and not to imply that tests are
never accurate, but in terms of developing a test and interpreting the test, there's some additional
things that we need to think about in terms of populations. So, these are some of the questions that
might be of importance to a population. First off, is the disease present? Back in the spring,
was COVID present in the United States, or any other country? If it was present,
how many people were affected? And what are our confidence levels? Or how accurate is that test? Or if
nobody was infected here, what's the risk of introducing disease into that population? So, we could potentially
use the same test for each one of those questions. But the interpretations and how we use them may be
a little bit different.
So, for picking out a test, if we have a range of different tests, the things that we need to think about are: what's
the goal of the testing, and what are the test characteristics in terms of accuracy, as we talked about first.
And then, we talked about how the prevalence of infection or disease in the population can affect that.
And what you end up with is predictive value. And the new concept that we're going to talk about briefly is the
cost of the test.
Because depending on the impact of disease, the cost of the test, that can make a big difference. So, if
we have a disease that's very common and the test is very expensive, we might not run it. If it's a very
uncommon disease such as maybe a birth defect, it might be worth finding out early on and paying that
extra money to find out what we need to be doing to address this. And then of course, related to that, as a
segue, is the impact of that testing.
So, like I implied before with the anthrax, if you've got a positive, and it's a false positive, that has a
pretty dramatic effect on the patient. And what you really want to do is before you report something,
have a pretty good idea where you're at and where things may lie in terms of that impact, and be
selective about how you use a given test. Now, one of the questions that undoubtedly comes up, as
you're probably questioning a lot, is the accuracy of these tests and the applications, all different
And one of the ways we can increase our test accuracy is by using multiple tests. So, one way of doing
things is to have more than one test. And the challenge there is that we're assuming tests are
independent, and that would be valid in a case such as if we were comparing a classic histopathological
lesion with maybe a serological test. In that case, they're probably independent. In many cases, there's a
concept of conditional dependence, because the tests are not completely independent.
So, what do I mean by that? Well, let's look at our friendly neighborhood germ. And let's say we're
looking at an antibody test. One test is looking for an antibody that's on this part of our friendly germ
here. Another test is looking at this. So, these tests will very likely perform differently. But are they
completely independent? Probably not. If these are integral structures that are part of every germ that
we're testing for, they really aren't completely independent.
So, that's one concept to keep in mind when you're evaluating your test. Another thing is how are you
going to use different tests in the field? So, one way is in parallel, meaning that you use both tests, or
more than two tests, at the same time. And if one of them is positive, then you
classify that individual as being positive for that condition. And because only one of two or many needs to be positive
for that condition, what we're doing is we're basically increasing the sensitivity of those tests or the
positive predictive value.
Actually, excuse me, just the sensitivity. Simply because if there's somebody that's positive, we want to
find them by one of the tests. So, by using multiple tests, we're increasing that sensitivity and are more
likely identifying those that are truly infected by the condition. Another way is to test in series, meaning,
if one test is positive, then you do another test and verify that it's positive, or maybe even a third test if
that is the case.
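The effect of parallel and series testing on accuracy can be sketched with simple probability, under the strong assumption that the two tests are fully independent given true disease status. As the talk notes, many pairs of tests are conditionally dependent, in which case the real gains would be smaller than this sketch suggests. The function names and the test characteristics are hypothetical.

```python
def parallel(test_a, test_b):
    """Call positive if EITHER test is positive.

    Each test is a (sensitivity, specificity) pair; independence is assumed.
    """
    se_a, sp_a = test_a
    se_b, sp_b = test_b
    sensitivity = 1 - (1 - se_a) * (1 - se_b)  # missed only if both tests miss
    specificity = sp_a * sp_b                  # negative only if both are negative
    return sensitivity, specificity


def series(test_a, test_b):
    """Call positive only if BOTH tests are positive (second confirms the first)."""
    se_a, sp_a = test_a
    se_b, sp_b = test_b
    sensitivity = se_a * se_b                  # both tests must detect the case
    specificity = 1 - (1 - sp_a) * (1 - sp_b)  # false positive only if both err
    return sensitivity, specificity


# Hypothetical tests, given as (sensitivity, specificity).
screening_test = (0.80, 0.90)
confirmatory_test = (0.85, 0.95)

for name, combine in (("parallel", parallel), ("series", series)):
    se, sp = combine(screening_test, confirmatory_test)
    print(f"{name:>8}: sensitivity {se:.3f}, specificity {sp:.3f}")
```

Under these assumptions, parallel testing raises sensitivity above either single test while lowering specificity, and series testing does the reverse, which is the tradeoff to weigh against the consequences of a false positive or a missed case.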
And what that does is that increases the specificity. And that's a nice thing. So, let's say if whatever
you're testing for is a really bad disease, and something that you're really concerned about, whether it
be anthrax or anything else. What that means is that if you get a positive, by increasing the specificity of
that test, it increases the likelihood that you're correctly classifying that animal, or that individual, that
And you're less likely to have a false positive, which increases the specificity there. So, I promised you
that we would talk a little bit about clinical medicine and how this would be used out in the real world,
for something to think about as you are developing and marketing your test. And so, as a plug, you may
want to go back to our evidence-based medicine webinar. A big thing is clarifying our questions: what is it that
we're looking at? And that could be very fine.
So, one of the recent things that came out on Promed, one of the discussions over the last week we had
is when do you use different drugs for COVID-19. And it turns out that some are better early in disease.
And later in disease, they've canceled those clinical trials because they don't seem to be helping the
patients. So, here's some of the questions that you're going to ask. And I'm bringing up this reference for
a couple reasons.
One, I really like it. I think Sackett has a lot of great ideas that are explained very well. But also, the date
is deliberately old because what I'm throwing out, for the most part, really aren't new concepts; they're things
that have been around for a while. But particularly with people that maybe were trained before this time, or
maybe weren't familiar with this, sometimes I've had a little bit of a challenge getting people up to speed
on some of these more recent concepts.
But anyway, here are some of the questions that can be asked from a clinical epidemiologist's perspective when
you are evaluating published results. One, was an inception cohort assembled? Meaning, did you
identify people early in disease with a consistent definition of disease and follow them through for a
study? Which is much more accurate than, say, a retrospective study, where you're usually going to
have a little bit more general case definition.
So, another question is the referral pattern. How did those patients get into that study? Were the
patients at a tertiary care facility, primary care? Was it from a wealthy community, a poor community,
things of that nature. Again, this may not be entirely intuitive. So, let's look at this graph here. I'd like for
you to look at the X axis, where we've got clinic-based studies, compared to population-based studies,
and looking at seizure incidence after an initial event of a seizure.
And you'll see that in the clinic-based studies, they're reporting that seizure recurrence is going to be a
lot higher than in the population-based studies. So, is this because these folks are being looked at more
closely? Or is it that they were referred to the clinic, maybe a tertiary care facility, simply because they were
more susceptible to having another seizure incident? Or is it some other explanation?
So, that's an example of how the referral pattern can be of relevance to interpreting a given study. So,
we can also look at, was complete follow-up achieved? If there were dropouts along the way, then that
can potentially influence the results, particularly if you have a small study population. And that can
obviously influence the results either way. But you really want to be able to follow people all the way
through a study.
If not, then that's going to be a little bit of a question mark, and you're going to be a little less confident
in the results of that study. And then, another thing would be, was there an objective outcome that was
being examined and reported for that study? So, how explicit was that? So, if you have something
general like GI disease, well, that could encompass a number of different clinical signs.
Whereas, let's say you are looking at something more specific like vomiting or diarrhea, well, that's
probably going to narrow the range of disease a little bit further. And in a case like that, again, you can
be more confident in the results. So, as I think most people are aware, studies that are
conducted blinded are going to be better because we have minimized the odds of biases. And then also,
the extraneous prognostic factors.
So, let's say we're talking about a variety of signs like we did a few minutes ago, could it be that
somebody has multiple conditions such as heart disease and diabetes, or maybe COVID, and asthma,
things of that nature. Did the study account for those extra things? And so hopefully, as a part of the
clinical medicine picture, what we also want to do is use clinical epidemiology to pick the best therapy.
So, questions that we're going to have are, what is the objective of treatment? Are we looking for a
cure? Or is it something that's not curable? And what we really want is symptomatic relief, or basically, a
long-term preventative would be something that we'd be looking at with that question. In terms of
selecting the best treatment, a lot of times, that would seem to be common sense. But let's think about
it, let's say it's something that's really not causing a problem.
So, maybe somebody has a mild case of acne, and they can live with it. Do they want to spend the
money? Is it worth the benefit there? But also, compliance is an issue. If you've got something like
tuberculosis, where patients need to be highly compliant over a long period of time, that creates a
challenge in terms of whether that's the ideal treatment. And maybe you want to pick something that's
maybe not quite as effective but gets around the compliance issue.
And then, what is the treatment target? An obvious question is, when do you stop? In an ideal situation,
you're cured and you know when to stop, or you have some other sign that it's time to change course,
and pick a different therapy. So, for our conclusion, it's a lot like our evidence-based medicine talk that
we gave a few weeks ago, where we were talking about how, yes, there's science involved, but there's also
clinical judgment and patient values, and all those roll in together. What is the evidence, the science part?
What are the objectives in terms of the patient values and the clinical assessment? What are the
tradeoffs, which, unfortunately, are always there? And then, we look for that sweet spot
where the science and clinical judgment overlap with the patient values
in terms of meeting our objectives.
So, with that, I hope I didn't totally confuse you and that at least we gave you an idea that there's some
questions that you might want to think about in terms of developing or evaluating a test. And if there
are questions, that'd be great. And anyway, wishing the best for everybody during these challenging times.
Thank you, Dr. Miller. So, if you have any questions, please feel free to put them in the Q&A section or
feel free to put them in the chat and I'll pitch them to Dr. Miller. I guess just to start off, how do
analytical test sensitivity and test specificity fit in with clinical epidemiology?
So, that's a good question. I probably should have gone into a little bit more in terms of comparing the
analytical values versus the clinical epidemiology. What I've seen with the tests that I've evaluated is
that the test characteristics reported by the lab are often much more encouraging. Meaning, close
to 100% accuracy for both positives and negatives. But then, when we get out in the field, the values are not
quite as good.
And in fact, in some cases, they're a lot closer to 50%, which as we talked about before, is a lot like
flipping a coin. So, why would some of these things happen? Well, one of the things I've
looked at has been mycobacterial diseases. In the lab, what they've compared have been the
extremes, the end cases: individuals that were clearly infected, compared to individuals that clearly
were not infected.
And so, the tests may perform well for classifying the extremes. But in the real world, what we might be
really interested in are those that are in between, those that are in the early stages of infection, where we're not
really clear from the clinical signs whether they're infected. And the important questions are, can we
address the infection before it progresses, and before it affects other people? So, that's one instance.
Another example that we talked about during the webinar is cross-reactivity. Again, a mycobacterial
example would be for tuberculosis. There are also a lot of mycobacteria that live in the soil that usually
don't cause a lot of problems. But maybe somebody is a gardener, and they're exposed to them, and that might
cross-react. Or in the news with COVID, there are a lot of other coronaviruses that cause childhood
disease or gastrointestinal disease.
And the question would be, if we're testing for COVID-19 and somebody was exposed to one of these other
coronaviruses in the past, could that cross-react and interfere with the accuracy of our test? So, those
are a few things that hopefully will clarify your thoughts on those questions.
Awesome. I guess one last question. What are your recommendations for laboratories establishing
cutoff values for their tests?
This overlaps a little bit with a previous question in terms of what we want to look at in testing, which is
a range of clinical disease, and hopefully other conditions as well. And so, what you really
want to do is decide how you're going to use things in the field. And one of the things to think about
when you're developing a test, there's also the option of developing different cutoffs for different
uses. So, as an example, a toxoplasmosis test that I ran once upon a time. When you look at the insert for that
test, there is one cutoff for pregnant women, and then several other cutoffs for other conditions, or
other uses. Again, getting back to the questions that I posed earlier, what's important to you for a given
individual or a given population in terms of the questions that you're trying to answer?
I'm not seeing any new questions coming in from our attendees. So if there aren't any further questions, I
guess we can end pretty early today. But I just want to thank you again, Dr. Miller.
Well, thanks to everybody, and thanks to Samuel for keeping me out of trouble here. And I hope
everybody has a great day. And we are happy to chat with you further. If you have any other questions
in the future, you can forward them to Samuel and he'll forward them on to me. So, thanks and hope you
have a great day.