Transcript of: Statistical Analysis for IRB Protocols
Okay, just a short organizational note. I’m going to start with an introduction to say a little bit about myself, then talk about Institutional Review Board basics for those of you who are unfamiliar with them. Then I’m going to talk about research protocol and ethics, which most people don’t think belong in a statistics talk, but I think that’s the basis of how I learned a lot of this topic. Then I’ll go over study design, methodology, statistical methods, sample size considerations, and data analysis, just the regular, typical statistics information, and I’ll conclude.
A little bit about my background: I’ve been an applied mathematician for over 15 years, but I truly got to learn about statistical methods while I worked for the Air Force Research Lab in the Sensors Directorate, where I worked for over six years in the RF imaging and detection division. I tested a UHF/VHF radar system on participants ages four through 75 to develop a non-Doppler biometric radar system. That was a groundbreaking thing, nobody had ever done this before. I developed the statistical protocol myself and designed the analysis.
In recent years, I’ve been teaching math and statistics coursework for scientists and engineers, and I just wanted to share some of those lessons learned with you.
A little bit about the IRB, the Institutional Review Board. They basically try to minimize the risk to the study participants, and make sure that the risks are reasonable in relation to the anticipated benefits. So you always have to think about the risk versus the reward, and the IRB helps you to do that. They make sure that the selection of the study participants is equitable, and they use the risk versus reward analysis to do that. They also ensure that informed consent is obtained before any study can be performed, that it is documented, and that the information is kept safe for each participant.
There are also many monitoring rules in place, mostly set by the FDA and overseen by the Department of Health and Human Services. The IRB is also tasked with ensuring that the principal investigators can monitor this data. Most important, I think, is that they ensure the safety of all of the study participants, and that their confidentiality and information are protected. All this information can be found on the FDA site in the link that I provided below.
Every accredited research organization usually maintains an IRB. A principal investigator needs to submit a research plan, which is called the research protocol, and an informed consent document as well. This is what’s submitted to the IRB prior to a particular study. Some of the things that you need to consider: you have to protect vulnerable populations if they are part of your study, which in my study they were, because I tested participants who were children, ages four and above. I needed to ensure that the information that I collected was kept private and confidential. I needed to assess the potential risk to each participant, especially since I was going to be using a UHF/VHF radar. It was pretty wide band, so that means that the wavelength of the radar was pretty long. I don’t know how many of you know a lot about radar, but if you have a cell phone or if you’re using Wi-Fi, then you might know a little bit about radar.
Some of the frequencies that I was using, this UHF and VHF, used to be used in a lot of cell phone technology, although now they’re a little bit higher. My frequencies of operation were between 800 MHz and about 2 GHz, so that’s around the cell phone ranges for frequencies. I tried to keep the entire bandwidth, and I’ll talk about why I made that choice in a little bit.
The IRB also monitors the study design and they ensure that the researcher performs a risk to benefit analysis that’s, again, really important.
Some of the questions that are commonly asked when you’re in front of the IRB are: is the science that you’re going to be learning from your study going to be significant enough to be worth the risk? Because you’re going to be testing on all of these participants, so is this research going to be worth the risk? How is the risk measured? Do you need to use vulnerable populations? In my study, we were able to use vulnerable populations, but it took about two years just to figure out if we could. Is your study minimal risk? That’s something that’s really important if you’re going to be engaging in a short study. And what are the most important practices?
These are some of the common questions that I get asked as an educator and as a researcher about the IRB. Some of the lessons learned are just to remember that you’re an expert in your research. But it is your job to communicate your research goals and the benefits of your research to the IRB. The only person that can do that is you, and so you need to ensure that the IRB understands everything that you’re trying to communicate. Whenever you’re talking to the IRB, you can’t see them as the enemy. You need to take good notes and you need to revisit these notes often. You also need to remember that not every study is the same, so you need to find similarities between your work and other studies, because sometimes they exist, and the IRB will definitely be looking for that. Organization and timelines are really important. You can tell when a researcher’s not organized or hasn’t considered their timelines. When you’re being interviewed by the IRB, they can tell pretty quickly if you haven’t done your research. I think organization and timelines are key.
I have here a few images to depict some of the reasons why the IRB is important. Here we have the Nuremberg Trials. They’re known for bringing to light a lot of experiments that happened in Nazi Germany, which definitely did not follow the rules of ethics that we’re following now. That’s an extreme example, but there are more recent ones, like this triplet study, and you can click on the link, I have linked it to the article. In our own history, this study was done in the 80s, so even fairly recently we’ve engaged in some questionable research ourselves. We want to remember that ethics is really important and we need to consider ethics in everything that we do, and that includes considering ethics in even something mathematical like statistics. In my opinion, especially so. And I’m a mathematician, so … I’m probably biased.
I created this little slide basically because there are a few things that we forget about our respective fields. For example, whenever we think of ourselves, we automatically assume professional competence. But we have to remember that we’re not experts at everything, and research isn’t done in a vacuum, it’s done in something that requires an interdisciplinary team. For me it was really easy because I knew that my research would be interdisciplinary. I had been studying the work for a really long time, so I understood that I was not an expert in mostly everything that I was going to be studying. I was an expert in only one tiny little piece of the work. But I was interested in the research enough that I could understand that I wasn’t an expert in everything and that I needed help.
Professional competence includes that: understanding that you may be good at one particular thing, but you will need help for other things. Objectivity goes hand in hand with human rights, especially in an IRB protocol. We’re going to talk a little bit more about that later, but what really connects those two things is the question, will my work advance science?
We also have facts versus theory, we have to consider those. We have research aims and we have process transparency. Those are other things that the IRB looks at. But all of these little bullet points here, I actually got from the American Statistical Association. This was something that was prepared for statisticians. Even though statistics is definitely a mathematical field, we always have to remember that it ties us to all of these bullet points that I’ve named here, and whenever you’re engaging in any study, whether it’s for the IRB or not, you always have to remember all of these particular points. The underlying question for all of these points is, will my work advance science?
Because of that, one of the lessons I learned is to know the IRB members and their roles, because each IRB member has a particular role to play. There could be a statistician on the actual IRB, but one is not required by the FDA. If you do have a statistician on the board, that’s great. But if you don’t, then it’s going to take a lot for you to convince some of the IRB members that your study is going to pass the test.
Again, I wrote this bullet point again, you need to remember that the IRB is not there to, they don’t want you to fail, but they want you to consider every single aspect of the study. Take the IRB member input in the best light and do not be defensive. If anything starts out by being defensive, then that’s going to be a delay, and that’s not really going to solve anything. IRB members are there to help you, so you really should listen to their advice.
Also, do your research about past protocols, especially when you’re dealing with interdisciplinary work. But even when you’re not, because there might be other research scientists who have tried to work through protocols that maybe didn’t meet basic qualifications like sample size. By contacting them, you never know, you might be able to work out, together, a protocol where sample size determination is allowed.
You want to make sure that you follow all the instructions that the IRB suggests, otherwise you’re going to risk a lot of delays and nobody wants that.
Again, data security is a very important job of a protocol, and statistical design can help with that. You can randomize subject IDs, and you can design a coding system that guarantees personally identifiable information is safe. And even the collection methodology can protect every participant and ensure the data transfer is going to be safe and confidential. Always remember that, even if you’re not a statistician, if you are going to be engaging in an IRB protocol, somebody who has statistics experience can help you randomize and protect your data.
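As a minimal sketch of the randomized subject ID idea, assuming Python and entirely hypothetical names (`assign_subject_ids`, `deidentify`, the `S` prefix, and the toy roster are illustrations, not the protocol used in the study): each participant gets a random, non-sequential code, the name-to-code key is kept apart from the data, and the analysis records carry only the code.

```python
import secrets

def assign_subject_ids(names, prefix="S", digits=6):
    """Map each participant name to a random, non-sequential study ID."""
    used = set()
    key = {}
    for name in names:
        # Draw random codes until one is unused, so IDs never collide
        # and never reveal enrollment order.
        while True:
            candidate = f"{prefix}{secrets.randbelow(10 ** digits):0{digits}d}"
            if candidate not in used:
                break
        used.add(candidate)
        key[name] = candidate
    return key

def deidentify(records, key):
    """Replace the 'name' field of each record with its study ID."""
    return [
        {"subject_id": key[r["name"]],
         **{k: v for k, v in r.items() if k != "name"}}
        for r in records
    ]

# Hypothetical roster and records, for illustration only.
roster = ["Participant A", "Participant B", "Participant C"]
linking_key = assign_subject_ids(roster)  # keep this separate and access-controlled
records = [{"name": n, "age": a} for n, a in zip(roster, [4, 31, 75])]
coded = deidentify(records, linking_key)
```

Only `coded` would go to the analysis team; the linking key would stay with whoever the protocol designates to hold it.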
You want to also remember that, for every participant, you need to educate them about your study. That’s part of the informed consent. And we want to ensure that everything is understandable for them. I know that the rule of thumb at the IRB at AFRL was to write at the level of an eighth grade education. But for my study, that wouldn’t be okay. Some of the rules of thumb may not work for you, so if there are suggestions, particularly about informing participants, that might not work for your study, make sure that you voice that clearly, but not in a negative way.
Again, I’ve talked about confidentiality, working with statisticians to randomize the data collection. Don’t collect information just in case you need it later. Never do that. That is probably something that I learned right away, because statistical sample size determination was one of the first issues that I had. I had a student from my study that was working on her thesis, and I wanted to be able to collect as much data as I could so that we could use data for her study for her master’s thesis, but then also data for my work. I thought that I could combine everything and do it all at once, and I felt that that would be more efficient.
But according to the IRB, that was probably not the best use of the study. You want to work in interdisciplinary teams to ensure that you’re using the data correctly. That was one of my first lessons learned, is to not try to do everything at once because that may also automatically disqualify your study, number one, but also you need to remember the whole aspect of a timeline. If you’re trying to be too rushed, you want to try to figure out why that is.
We ended up making a very small study for my student first. That became our feasibility study, and then later on we were able to add more participants. Again, sometimes when you’re new, you want to do everything all at once just because that’s how you’ve been taught to do things in a different field. But with a study involving human participants, you really don’t want to do that.
Now this leads me to teams. Who can really help you with your methodology? Do you have the right equipment, do you have the right staff, do you have the right timeline? You know what you need, you know that you need members with secure qualification, you know that they need to be individuals that can maintain confidentiality, so integrity is going to be an issue. If you’re going to be working with agencies like the DOD, you’re going to need things like clearance. If you have sensitive information or areas where you’re going to be working in that maybe requires a particular clearance, then you’re going to need to think about these things.
This should always be done before even starting your protocol, because assembling the right team can help you create that protocol. It shouldn’t be done the other way around. One of the other lessons that I learned in this very long study was that I didn’t have the medical experience that I needed. I was working with the NIH so I thought that was sufficient, I was working with a lot of medical researchers. But because of the vulnerable population that I had, I needed to have a medical monitor. That meant that somebody needed to be there who had medical experience with anything having to do with dosimetric measurements, anything having to do with radiometric measurements. I was able to find a medical doctor right away, but I really didn’t think that I needed that. It was because I was working with so many individuals who were basically medical researchers, but they didn’t have the MDs next to their name. Sometimes, you really do need those superior qualifications, those will always be best. That’s another one of the lessons learned that I had.
As you can tell, there’s no one size fits all. Even if you try to overestimate everything and try to cover everything all at once, there might be something you’re doing wrong there. Or if you’re too cautious, there might be something that you’re doing wrong there. You have to be able to, again, go back and figure out if your goals have been reached by others the right way. Figure out if others have made mistakes beforehand.
That’s where the statistics come in: what is your population? Who needs your study? Who’s going to be helped by your study? Why can it be done better with you as the principal investigator and not with somebody else? Why can’t it be done better by just doing a simulation, for example, and not risking any study subjects? What are going to be the outcomes? How much time will this take? Again, timeline seems to be a really important thing, and it is.
If I can get this to work … there it is.
I drew out here a data map, but I didn’t really draw, sorry, any of the arrows. You have descriptive data. If you have descriptive data then you can think about things like individual case studies, cross sectional data, longitudinal data, ecological data, retrospective or prospective data. If you’re thinking about analytical data, then you can break this apart into observational or experimental. With observational, again, you can have cross sectional data, case-control data, cohort data, longitudinal data.
If you’re not too familiar with some of these terms, cross sectional data means that you’re not considering time, you’re considering everything in non temporal terms. You’re not measuring the time when you’re thinking about the data. Not considering time. I’m not the best hand writer, cause I’m using my computer, unfortunately my iPad won’t let me do this. Not based on time. Not measuring time. Okay.
Longitudinal data means that you’re observing a particular subject over time. Ecological data, of course … this is over time. Ecological data, oops sorry. Not as fast writing this with my computer as I normally am. Ecological data is basically, I don’t want to say that they think that they’re probably the best for descriptive data, but I have heard that they have thought, ecological research studies probably have designed some of the best descriptive data study designs. I would say if you have a descriptive study to look at some ecological studies.
There are also retrospective or prospective studies. These are all the types of studies you might look at if you’re going to have some sort of descriptive study. If you’re going to have an observational study, then you would look at cross sectional studies, case-control studies, and cohort studies, which are more along the lines of medical studies. Again, we have the longitudinal study designs, which I was also able to engage in with my study, which we’ll talk about in a little bit. Then we have experimental, which has randomized and non-randomized. You can find a lot of information out there.
These are the different types of data that you might have, so these are the different types of studies that you can research in the literature. Remember, you never know where you might get your inspiration, especially with interdisciplinary research. It’s always good to always look through every type of study. Because again, there is no one study design that will fit all for human subject studies.
Another very popular study design to look through in the literature would be something that has a clever acronym, I don’t know how you say that acronym unfortunately: PICOT, which is very medical based. The P means patient population or problem, the I is issue or intervention, the C is comparison, the O is outcome, and the T is time frame. Again, there goes that time variable again, or that timeline, sorry, not time variable, timeline issue again.
With everything having to do with study design, and with thinking about the type of study design that you’re going to choose, if it’s going to be descriptive, or if it’s going to be observational or experimental, you want to ensure the IRB knows that your primary goal is protecting the participant. That goal is hand in hand with the science. You want to remember to think about the participant first. They are not data, and never, never call them that. Don’t see your study participants as data. Any time you even let that slip, that might make you seem like a Frankenstein doctor, unfortunately. Always remember the participant: how will this affect the participant? Even when you’re designing your statistical study, and that’s something that I always teach my students whenever they’re designing projects themselves, the participant comes first.
Think about how many scenarios, what ifs, that are important, but this can be kind of difficult for new PIs in particular because they might not know the what ifs. Those are the confounding variables. That becomes the issue or the intervention part. I would add in something like confounding variable or unexpected variables here. That’s how I would modify this.
You always want to think about new scenarios and what can go wrong. Ask for help, again, and that goes with the comparison part. And brainstorm with team members, that goes with the outcome part. Time frame: organize, organize, organize. Ensure that everybody in your team has the same goals that you have to make sure that you’re going to be ensuring the participant safety, because that’s the top thing that you need to ensure.
Just a little bit of the federal regulations, just because working with the IRB you need to know the federal regulations, under Title 45: study approval is subject to variation in the interpretation of federal regulations at each site, in addition to many other potential factors including investigator related characteristics, institutional standards, IRB structure and function, clinical expertise and reviewing community, and individual ethical and methodological standards.
Maybe some people with IRBs or with experience with IRBs are scratching their heads and saying “That’s never come up in any of my protocols.” When I had my protocol, of course it was testing with vulnerable populations, so it was scrutinized to the extreme, in my opinion. It took two years just to get some of the basics approved. On top of that, it was the perception. Something that I learned is, perception is everything with the IRB, particularly if you’re dealing with an IRB in the Department of Defense. Perception is everything. If you appear to be nonchalant about your study subjects, if you call them “data”, then that might, to them, seem as if there’s something wrong with you as an investigator. Then that biases them, that biases the IRB.
The DOD also has to consider things like security clearances. My particular study was done in an IRB range. Sorry, in an RF anechoic range. It was a pretty large study site, but it was in a secure building. We had to make sure that everybody had easy access to the range, but wasn’t going to be subject to seeing anything that they shouldn’t see. There are going to be some institutional standards that you might not normally have to deal with if you’re, let’s say, in an educational institution and you’re dealing with an educational institution IRB.
Of course, there are the medical aspects. Even at the time when my research was done, which started in 2010, I knew that there were children who were using cell phones, and I knew that cell phones were radars in the frequencies that I was testing. I also knew that the anechoic chamber that I was testing in was rated for those frequencies. Everybody inside the anechoic chamber had to shut off their devices, so the only electromagnetic radiation that my study subjects would be exposed to came from my radar, the only thing that was active in my study, and it was at much, much less power than a cell phone.
I wanted to write in my study informed consent that the study participants would actually be getting less radiation than they would normally get. Because in a common household, you had cell phones operating at all times, you had Wi-Fi, and even a wireless phone, a landline phone, could emit more power than what we were emitting with our radars. I thought, well, why can’t we say that? Well because, they said, it hasn’t been proven by numerous studies, which was true. I was able to dig up one study, but they said that that was only one study, approved by one institutional IRB that they felt didn’t have standards matching what the DOD IRB upheld.
Sometimes, even if there is evidence that you’re helping participants, you have to convince your IRB that your standards are up to par. Unfortunately, any IRB has that right. Especially if you’re dealing with vulnerable populations, and there’s nothing you can do about it. The best thing to do is to work with them. And it ended up that, just in that work that I was able to do with them, not only did I learn a lot, but I was able to inform a lot of individuals about the importance of the study. All of that was through communication, and that’s how I was able to finally get it approved.
Now comes the part that was the most difficult. Convincing the IRB members who had, I think, somebody that knew statistics but he was a physicist, so he saw statistics in a different way. This is where I think that I didn’t have enough experience at the time to be able to show the IRB that my particular methods were adequate, and that I had enough experience to be able to understand the data. One of the things that I actually was able to use was this Anscombe’s quartet. Does your study involve just one variable, or are there multiple variables, and mine had 1,053 variables. How do you visualize all those variables, everybody thought, and why is it that you need so many? If you have all these variables, why do you need so many study participants? Why isn’t it better to actually work with something that is a longitudinal study instead, first, and then go backwards?
It’s true, those were good questions. Different methods can apply in every single case. There are test statistics for single variables to see if a particular probability model fits the data, or whether a particular population parameter equals a specified value, and there are tests for two or three or four variables. You have to think about whether the variables are categorical, numerical, or mixed. What is it about your data that is unique, what is the homogeneity of it? When you have so many variables, how can you visualize it to the point where you can convince somebody else that you need the additional participants? Because everybody thinks, well maybe you haven’t thought it through, maybe you just need five or six.
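To make the Anscombe’s quartet point concrete, here is a small sketch in Python using the standard published quartet values. It is only an illustration of why summary statistics alone can mislead; the helper names (`mean`, `pearson`, `summary`) are made up for this example and are not from the talk.

```python
from math import sqrt

# Anscombe's quartet: sets I-III share the same x values; set IV has its own.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

def mean(values):
    return sum(values) / len(values)

def pearson(x, y):
    """Pearson correlation coefficient, computed from its definition."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# (mean of x, mean of y, correlation) for each of the four sets.
summary = {
    name: (mean(x), mean(y), pearson(x, y))
    for name, (x, y) in quartet.items()
}

for name, (mx, my, r) in summary.items():
    print(f"{name}: mean x = {mx:.2f}, mean y = {my:.2f}, r = {r:.2f}")
```

All four sets come out with a mean x of 9, a mean y of about 7.50, and a correlation of about 0.82, even though a scatter plot of each looks completely different, which is the argument for plotting the data before choosing a method.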
This is a little bit more about my research. This is where a little bit of my mathematical experience comes in. Because what I had to make them understand is that my data was paired. There are different methods for paired data. Normally, with paired data, once you have three pairs, that’s difficult enough, so now you have so many more, are you just using PCA and what does that really say, are you just using SVM, all of these methods that are commonly used for dimensionality reduction, are you using too many dimensions? What does it matter?
I had to convince the IRB that this was what I needed. I was able to design a simplistic model, and that’s where the dosimetry came in. Back in the 80s, it turned out, for my study in particular, somebody had cut up a cadaver, this was, I think, in the Army. They cut up a cadaver and they dosed it with electromagnetic radiation, and they figured out the electric permittivity values of all of the different tissues in the body. When you have all of this information, I was able to explain to them, you want to use it effectively. In order to be able to recover some of that information, I would need to solve a system of equations, and that’s why I needed so many study participants: I needed enough equations for the unknowns that I had. That’s how I was able to finally get them on board, that I needed more than five participants.
There are many powerful tests out there, but how can you teach 10 years of mathematics, statistics, and physics training to somebody who went a different direction? The best way was to meet people halfway. I was able to find that dosimetry study from the 80s by knowing and working with a medical doctor who later on turned into a physicist, an RF radar physicist. He was able to help me turn this into a simple elevator pitch so that I could turn this study around and get the subjects that I needed.
A lot of time, even with these statistical methods that you may know inside and out, how do you deliver that information to other people? Again, it’s communication and working with the right team.
That leads me to sample size considerations. We have a lot of trade offs. A large sample size is needed to achieve high precision, say a particular width of a confidence interval, but you might still be missing the mark. Are there type I errors and type II errors to consider, bias, accuracy considerations? Are there confounding variables that you haven’t considered? Have you planned for data loss? Have you planned for the time constraints to collect the data? Remember that, aside from the timeline, sample size seems to be one of the biggest things that we’re trading off here. At least that’s what I found.
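As a rough sketch of that trade off, assuming a simple textbook setting that is not from the talk (estimating one mean with a known standard deviation and a normal approximation), the sample size needed for a given confidence interval half-width can be computed as below. The function names and the example numbers are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def n_for_margin(sigma, margin, confidence=0.95):
    """Smallest n so a normal-theory CI for a mean has half-width <= margin.

    Assumes a known standard deviation `sigma` (a simplifying assumption).
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / margin) ** 2)

def inflate_for_loss(n, expected_loss):
    """Recruit extra participants to allow for an expected fraction of data loss."""
    return ceil(n / (1 - expected_loss))

# Hypothetical numbers: sigma = 15, target half-width of 5, then of 2.5.
n_wide = n_for_margin(sigma=15, margin=5)      # 35
n_narrow = n_for_margin(sigma=15, margin=2.5)  # 139
n_recruit = inflate_for_loss(n_wide, 0.20)     # 44
```

Halving the interval width roughly quadruples the required sample size, and planning for 20 percent data loss raises the recruitment target further, which is the kind of simple, explainable calculation an IRB member without statistics training can follow.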
Some of the lessons that I learned then was to make sure that you have a simple method for the sample size determination. To boil it down into something like I was able to do, there’s most likely somebody that can help you in the IRB board, but if not you’re just going to have to find somebody outside of the IRB. Remember that there is a difference between qualitative data and quantitative data research. Different IRB members are going to have different research experiences that you may not understand, so you need to figure out a way to be on the same page. Always want to think about that trade off, but then also underlying that, you always want to make sure that the study goals that you’re thinking about are a benefit to the society, and to always say that to the IRB board.
I think I’m probably going over time here. Some common data analysis methods, here we have, I made lists, make sure that you’re not trying any other methods, make sure to map the analysis methods that can be useful. Don’t oversell the analysis because it could change right in the middle of the project. Always have a backup plan and know your backup methods.
I listed some common methods, some parametric and nonparametric methods for quantitative data and some for categorical data, and I listed on the side there some of the research fields that you might find them in, if you need something to read to get a sense of how to use them. But always remember that there are going to be unexpected circumstances: recruitment issues, time delays, unexpected cancellations, research funding shortfalls. Understand that you do need to establish a minimum and a maximum for your sample size, and you need to know that if you have a small sample size, you’re going to use a different test than you would if you have a large sample size. Remember to plan for those and show the IRB that you’re planning for those.
For qualitative research, it’s fundamentally different than quantitative research, but also know that mixed data methods and mixed methods research is becoming more sophisticated. If you’re going to be venturing in that route, you’re going to need active researchers in that field for support, because it’s changing. I myself was engaged in an educational study just recently where everything was changing almost daily. I was finding new papers. It’s a very active field. You need to know your community, I think, in those fields, whenever you’re engaging in those type of studies.
My overall lessons learned: before engaging in any human subject testing, know if your research is worth the risk. Does your study need vulnerable populations? This should only be considered for minimal risk studies or high impact studies. In mine, I was able to convince my IRB that it was a high impact study, and it was able to be successful. We were just given the patent recently. How many subjects are really needed? Don’t forget to think about bias, precision, and risk. Data analysis requires planning, organization, communication, and constant research. Don’t forget about finding the right team members and performing effective training. That’s basically it. I think I probably went a little bit over, but hopefully we still have some time to answer questions.
Thank you so much Dr. Miranda for the informative talk. I just had a few questions I wanted to pitch to you.
How long was the IRB approval process for your project, and is this a common time frame for protocols with vulnerable populations?
The total time, I mentioned different times, the total time was actually five years. That’s a really long time, that’s not normal. But I did have vulnerable populations. Initially, the initial first study did only take two years, and I think that that was because I didn’t have enough experience, and I was trying to cover too much all at once, and the IRB didn’t like that.
I think that this isn’t typical for normal situations. Again, this was in the DOD IRB, so because of that, the labs had … every month they had a whole big old long list of people that they had to cover. I think that was another issue that our time got delayed. It wasn’t all me, but it was just a whole bunch of different factors. But I think you should plan on, if it’s something that involves vulnerable population, you should plan on about a year or two.
For our next question: since you said you didn’t have enough experience, how many years of experience did you have before developing the statistical methods for your study?
That probably took about three years to develop the particular methods. That was because I had to get so much input. I didn’t have the right experience, I was so young, getting my PhD, and I was awarded all this funding right away, so I started running with it, so it did take me about a good three years to develop the methods. Not necessarily to develop the methods myself, because I was using a lot of methods that are already in existence, but it was figuring out how to communicate to the IRB that my methods would work. I think that’s what took the longest time.
Do you think that someone with little to no statistical experience can design a large scale study without an experienced statistician?
I would say probably not. I know that I wasn’t able to do it. I had to get a lot of help. I was lucky in that I had a lot of support. I had a lot of people on my team who were right there in helping me, I think that that’s probably why it was successful and it was a really large study and I’m glad that I was able to get the help that I did. But I think that an experienced statistician would have probably cut off a few years for the study.
How is a research monitor selected, and is this typical for any study?
I had to have a medical research monitor, because all of my collaborators were non-practicing medical doctors. Even though I had a medical doctor on staff, he became a physicist, so he no longer kept his medical license. Because he no longer, he was 73, so because he no longer had his medical license, then I couldn’t use him as the medical monitor.
Usually, if there’s any indication to the IRB that you don’t know something, then they will suggest a monitor. I was able to figure out the math, and I think that I had enough background for the math and the physics and the radar, along with my team, but the medical monitor role required somebody that was really knowledgeable, and I just didn’t have that. If you don’t have somebody on your team that has superior credentials, you’re going to need to hire somebody else, basically. That’s what the monitor is.
I guess that’s all we had for our questions. Before we end it off, did you have any last remarks to give to all our attendees?
No, just … what I always remind my students is to always remember that your study subjects are not data, and that you need to always consider their needs before your study.