Going to PROM: Enhancing Value-Based Care with Patient-Reported Outcomes
Session Presentation
Video Transcription
Hopefully it's a beautiful Saturday afternoon where you are for the Annual Assembly. I'm thrilled to be moderating this session on something cross-cutting across the myriad disciplines of our field, and something that I think the Academy and a lot of researchers are taking seriously right now, which is a great thing: patient-reported outcome measures. All three of us are going to talk in order, and then we'll have questions at the end. I'm going to introduce our speakers now for the sake of better flow. So our first speaker in Going to PROM: Enhancing Value-Based Care with Patient-Reported Outcomes is Sarah Park. Dr. Park is an assistant professor at the Mayo Clinic in Scottsdale, Arizona. She's also the Cancer Rehabilitation Community Network Research Chair, where she'll be leading efforts to enhance, among other things, patient-reported outcome gathering and use for the improvement of the field. She's also been working with a group of cancer rehabilitation physicians on how to integrate common data elements and PROMs into practice so that people can share that data and use it for larger research. Following her is Dr. Andrea Cheville. She's a professor at the Mayo Clinic in Rochester. She's currently a PI on two population-level pragmatic clinical trials that use PROs to screen for and guide functionally directed care. She's also the editor of a forthcoming Archives of Physical Medicine and Rehabilitation supplement about the FAMCAT, an NIH-funded effort to use patient-reported outcome measures for hospital-based rehabilitation needs assessment and quality care. She's also conducted other clinical trials and published enormously on this topic. And finally, I'll be the third speaker. I'm Sean Smith from the University of Michigan, where I'm an associate professor. My research in part is focused on the development and implementation of patient-reported outcome measures, and I'll be talking about implementation into the electronic health record. And so with that, again, if you have questions, feel free to put them in the chat, but we'll most likely address them at the end. I'm going to turn it over to Dr. Park to kick us off.

Wonderful. Thank you so much, Sean, for the introduction and for the opportunity to speak on this important topic. As for the mandatory disclosures slide, we have none. So, it's my pleasure to introduce you to outcome measures generally, and then Dr. Smith and Dr. Cheville will get into the meat of the presentation with a discussion of patient-reported outcome measures. We'll start out with a discussion of reasons to measure these outcomes, then define the different types of outcome measurement tools, recognize a few examples of these, and then talk about the pros and cons. So, what are outcome measures? Outcome measures reflect the impact of a healthcare service or intervention on the health status of the patient. This is the definition from the Agency for Healthcare Research and Quality. Why are these important? Well, predominantly they help us know if our intervention works. This reduces bias, helps us to standardize treatment, and additionally helps with clinical decision making. Finally, payers may require it, and it also sometimes helps with research dollars, and Dr. Smith will really get into this part.
This is a big slide, but I just wanted you to know about all the different types and some examples of outcome measurement tools. Really, there are four main categories. Objective measures, which can be broken down into cognitive or physical measures; there's also a new category of wearables and trackers coming out, with super exciting stuff there. Then clinician-reported outcomes. As Dr. Smith indicated, I am in the cancer rehabilitation world, so I think both about rehab measurements and oncology-based measurements. Here are a few common examples: recently, of course, we've been using the AM-PAC 6-Clicks a lot, and FIM scores are integral. In oncology, the KPS and ECOG are some of the most important clinician-reported outcomes. Caregiver or proxy measures are another category. This is, I think, a relatively underutilized category, but one that's of great importance both in rehabilitation medicine and in oncology. Generally speaking, it looks at burden and quality of life for the caregiver and helps to inform our clinical practice and interventions here as well. This is a list of just some of the more common ones, but of course there are many more than this. And then finally, patient-reported outcomes. There are just too many to list here, but this is a few examples for you. So, how do we choose a measure with all these choices? I really think about the who, what, when, where questions: who will care about this measure? Is this for clinical or research purposes? Is the outcome measure I'm choosing validated for the population that I hope to target? Who can perform it? And when and where can it occur, and what are my limitations in that domain? So, let's talk about the pros and cons of these different category types. First, objective measures. As we saw listed there, there's a wide range of metrics; I hardly even hit the tip of the iceberg with the list provided. These are reproducible, often sophisticated, and have high inter-rater reliability. They usually do not require an expert because they are standardized tools. The downside, of course: they're expensive and time-consuming, and they must be performed the same way every time for true reliability. And they don't always capture the problem that we're going for, so we have to be very careful in picking these. Finally, there can be some inequities highlighted here, and I'll actually put this as one of the cons of every measure: you can have language barriers, cost, and cultural concerns that impact the reliability and utility of these measures. With clinician-reported outcomes, I would argue that the pro and con are kind of the same. The pro is that they're administered by experts and offer a common language for the assessors. Again, they're sophisticated and have high inter-rater reliability, and you don't need as many measures relative to the patient number. The downside, similarly: because it's administered by the clinician or the expert, it's limited to the clinical environment, and it can be expensive and time-consuming for the clinician. It also may be biased based on the clinician's experience level and perception of the patients. And then the inequities, of course, are there. For caregiver-reported measures, this is a really intimate view of the patient experience, and it captures a unique personal perspective that cannot be observed by an outsider. There are a lot of different ways to administer these types of tools, which is awesome.
And they can be delivered in different settings, which really sets them apart. Downsides here: again, they may be biased depending on the caregiver's environment and mood, which can change on a whim based on what they've just experienced in the caregiving relationship with the patient. They can be time-consuming for caregivers who are already burdened. And the inequities here, I think, are also really important, especially language barriers and sometimes cultural factors: the relationship between the caregiver and the patient and what their culture allows for disclosing. So finally, to get into patient-reported outcome measures, which I know is the reason why you're all here. The U.S. Food and Drug Administration defines these as measurements of any aspect of the patient's health status that come directly from the patient, so without anyone else's view. These are really important because they capture data that cannot be observed: symptoms that are not obvious to observers, like fatigue or headaches; psychological symptoms like depression and anxiety; things that may be too intimate to be observed by an outsider, like toileting and sleep; the frequency and severity of symptoms and their impact on daily life; and, of course, the patient's perceptions regarding the symptom or the treatment pathway, which can really impact how we might perform clinically and in research. So with that introduction, it's my pleasure to introduce Dr. Cheville. She'll be sharing with us her wealth of knowledge and experience on the value of modern patient-reported outcomes.

Thank you, Sarah. That was a lovely, lovely introduction. I'm going to share my screen. It's a real pleasure to be with everybody, and I'll try to convince you of the value of modern, current-generation patient-reported outcome measures to rehabilitation medicine, with an emphasis on IRT, or item response theory-derived measures, computerized adaptive testing, and the PROMIS collection as well as other IRT-modeled item banks. As a brief overview, we'll talk quickly about latent constructs. I'll explain why function is in many ways an atypical PROM, or PRO. We'll distinguish legacy measures from IRT-derived measures, talk about computerized adaptive testing as well as multidimensional computerized adaptive testing, touch on PROMIS, and elaborate a little more. There will be some repetition from Sarah's slides, but I don't know that that's all bad. We'll conclude with which PROM you should use. So the problem, and this is so beautifully stated: as we seek to provide better value in healthcare, we will succeed only if we can define what constitutes a good outcome. Despite the dizzying proliferation of clinical performance measures, these often miss the goal, because we've come to respect the fact that patients are the ultimate arbiter of whether an outcome is good or bad, and the dimensions of their experience are in large part latent constructs, constructs that cannot be directly measured. And much of the science, and this refers specifically to psychometrics, much of the research and science in this discipline has developed in an attempt to measure these constructs, often subjective, as close to the true score as possible. A very, very brief history lesson in this, which I think is useful: the science of psychometrics originated in the 19th century in efforts to quantify intelligence.
Then in the first half of the 20th century, that morphed into a more mature sub-discipline of psychology, trying to objectively characterize personality, attitudes, and aptitudes. Much of that was interestingly spurred by wars, in which there were huge numbers of recruits having to be binned and sorted in a very rapid way. And actually, this was the genesis of computerized adaptive testing: it began as a consequence of the military's need to very rapidly and efficiently sort people based on these traits. Then in the latter part of the 20th century, we get into health care, and these measures were initially used at the population level. It was only with the PHQ-9, the depression measure, and I think we can truly credit it with this, that the focus shifted from population-level measurement to individual patient-level measurement, with the intention to diagnose and inform treatment. And since the advent and widespread adoption of the PHQ-9, we've seen the true proliferation that Sarah alluded to. Well, how do we measure these latent constructs? The easiest way, which is always nice, is if we can find some tightly related objective construct or domain that we can measure; sometimes these are referred to as proxies or surrogate measures. They work great if the objective domain is tightly and reliably correlated with the latent or subjective domain. We'll get into function in a minute, but function actually falls into this category, because there are obviously objectively quantifiable dimensions of function that largely do track with patients' experience. So we certainly can use gait speed in some contexts. If you lack a compelling single domain, the most common method is to estimate the covariance and variance of a collection of related constructs, and the covariance estimate is taken to represent the magnitude of the latent trait; that's the whole science of latent variable estimation. Now we'll get into PROMs: when you don't have objective options, you ask a lot of questions or you use rating scales. I'm going to touch on rating scales very quickly before moving on to more conventional PROMs. These work; they have phenomenal face validity. They have problems with numeracy and problems with end-aversion bias, but they are fabulous screening tools. So if we get into the difference between single-item rating scales and multi-item assessment tools: rating scales are often numerical, but they certainly can be adjectival. As I said, they're very brief, with low respondent burden, great as screening tools that can pick up a crude level of signal that then can be further estimated with multi-item instruments. Multi-item instruments are often preferred because of comprehensive trait coverage, and this is very important when you have subdomains. I've listed the example of depression, where there are both physical manifestations of depression and psycho-emotional manifestations, and both are true. And subdomains become very important in function, because gait, stair climbing, and ADLs are many, many important subdomains. It's one reason why multi-item assessments are many times preferred. So, Sarah did a beautiful job of outlining our options for assessing functional constructs. Function is a broad, complex construct with many domains and a wide performance range. We're physiatrists, so we understand that function can range from someone who has essentially control of their eye movements and not more, to the ultra-elite athlete.
And function has important objective and subjective dimensions. What patients think they can do, and the way they report feeling while doing it, is often more predictive than what they can objectively do. This is a very busy slide, actually taken from an in-press paper that outlines the pros and cons of these different approaches, but I just wanted to quickly run through them. Pros: often they're really cheap, they can be dirt cheap; they capture subjective experience; there's an exhaustively wide range of administration modes, SMS text, smartphones, computers, interactive voice response, print; they can be multidimensional; and we can use computerized assessments. Cons: high respondent burden, and they only work if patients fill them out; I'll talk a little bit about low response rates later on. And as Sarah nicely highlighted, variable performance across demographic groups. So, legacy measures. This is where we truly bifurcate into the old-school measures and the newer generation. You'll be familiar with many of the legacy measures: the Short Form-36, a general quality-of-life measure; the Oswestry Disability Index, a measure of function and pain; the PHQ-9 for mood; but truly an exhaustive collection, all derived through classical test theory, all fixed-length. They comprise both rating scales and multi-item scales; think of the Brief Pain Inventory and the Brief Fatigue Inventory. Many of them have tremendous name recognition, and in research that can be a very useful thing: easier to get your grant funded, easier to get your paper published. There are some problems. They must be administered, usually, in total; there are some that have been validated for missing items, but preferentially they're completely filled out. They must be given in the same order with the same response options, and violations of those requirements threaten validity. They have high discrimination at the mean, but much lower at times at the low and high ends of the trait range, which for us as physiatrists can be a problem, because we are interested in disabled populations at the lower end of function and sometimes, if you're in sports, the high performers. And they're also prone to ceiling effects. This is actual data gathered from a cancer cohort administered the FIM mobility subscale. What you can see is that it's quite linear in these lower ranges, but all of a sudden you have this big blob, and actually this line at 35, which is the top score. This is a classic ceiling effect. Many of these patients probably have higher functioning, but that's where the scale stops. And that's very problematic in a clinical trial, because you will not be able to show benefit beyond the ceiling. It's also problematic in objective assessments, particularly of an individual's or a department's performance, because if you're really helping people, ironically, you're penalized once you've hit the ceiling. And the converse is true at the floor. So, item response theory: why has this gained traction? Greater precision and efficiency; it yields continuous scores, effectively ratio data from zero to infinity, and that has very important statistical implications. I won't say more powerful, but you can select statistical tools that reduce sample size and make it easier to detect differences in effect size in clinical trials, so it lessens your sample requirements. IRT is usually applied to item banks; some of the PROMIS banks have almost 200 items in them, for example, and it doesn't matter which items you pick or how many items a patient fails to respond to, you can still yield a comparable score.
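As a sketch of the machinery behind that last claim: the standard dichotomous two-parameter logistic (2PL) model gives each item $i$ a calibrated discrimination $a_i$ and difficulty $b_i$, and the probability that a respondent at trait level $\theta$ endorses the item is

```latex
P_i(\theta) = \frac{1}{1 + e^{-a_i(\theta - b_i)}}
```

Because the $a_i$ and $b_i$ are fixed in advance by calibration, the likelihood $\prod_i P_i(\theta)^{x_i}\,[1 - P_i(\theta)]^{1 - x_i}$ can be maximized over $\theta$ for whatever subset of items was actually answered, which is why any selection of bank items yields a score on the same scale. (PROMIS items are polytomous and calibrated with the graded response model rather than the 2PL, but the idea is the same.)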
And that is why NIH has embraced this: IRT is a fundamental characteristic of the PROMIS initiative, as well as the NIH Toolbox. The IRT modeling places test takers and items, in this case for patient function, on the same unidimensional continuum, and essentially a logistic regression equation is used to do this. Without going tremendously more into IRT, I'll share that there are now quite a few banks, and I believe this slide is available to everybody; I've included the references at the end of the slides. We have a growing number of very high-performing IRT-modeled banks that are very relevant to rehab populations: Neuro-QoL, SCI-QOL, TBI-QOL, the CP-CAT, again, all IRT-modeled banks that target specific rehab populations. So what is a CAT? A CAT is a computerized adaptive test, effectively an algorithm used to determine, based on a respondent's prior answers, what is likely to be the next most informative item in a bank. And the yield of using a CAT is directly proportional to the number of items in your item bank: the bigger the bank, the more precise the trait estimate the CAT will be able to produce. If you think about certification exams, IRT-based assessment is routine now in certification and licensing exams and academic assessments, and they may have thousands of items. In healthcare, banks currently rarely have more than 200 or 300 items; that's the very top, and those are rare. But anyway, it's a computerized algorithm that, based on an individual's answers, is picking the next most useful item. Stopping criteria are flexible: you can stop based on the number of items or on the precision of the estimate. A CAT will simultaneously estimate not only the trait, how much function you have, but also an error measure, so many times the test terminates when that error, and you can think of it as a confidence interval, falls below a certain range. This is, in a nutshell, how it works. This would be the first question, the one at the very top. They get it right, so they get a harder question. They get that one wrong, so they get an easier question, and very quickly the test hones in on that respondent's score. My husband describes it as Civil War artillery targeting: apparently they would fire to the left of the target, fire to the right of it, recalibrate, and fire right up the middle. A CAT is very much like that. And what does it buy you? It's not a trivial lift to administer one of these tests via a computer; why would we do this? This slide nicely summarizes it. The standard error of measurement, meaning how well is your thermometer or your sphygmomanometer working, how precise is your measurement tool, you can think of as your y-axis here. And the trait range, in this case for fatigue measures, runs from no fatigue all the way up to severe fatigue. Invariably, population responses mapped like this will have a roughly parabolic distribution. But what I want to call your attention to is this lime-green line. This is a four-item fixed-length legacy measure, the SF-36. And you can see, if you look at this top line, there's a point, it seems to happen at about 1.5, a little bit before, where the error starts to go up exponentially. It does well in the middle, at the mean, and the same phenomenon, a much more gradual exponential increase, occurs at the lower end. When you use a CAT to administer the same items, overall your error is reduced, and that uptick, still a parabolic distribution, occurs much later.
Again, think of high-performing respondents and low-performing respondents; this is where a CAT buys you important precision. Then the FACIT-Fatigue, the 13-item fixed-length fatigue inventory, is actually a legacy measure. And what happens when we administer items from the same domain using a CAT? Here we go: flatter. Oh, you know, I apologize; these CAT items are not the same items. That's my misstatement. These would be items from the same domain, but not the same items as in the legacy measure. But look at what you've got here: overall, less error, but a much-delayed uptick at the ends. So that's where a CAT truly shines, at the high and low ends of the trait range. And these are actually AM-PAC scores, which is what the 6-Clicks is derived from. If you look at the basic mobility score, the uptick occurs here, in data from cancer populations, and this is the level of a high-level community ambulator. So it's unfortunate, because we still see this uptick; CATs do not eliminate the problem, but they lessen it. And this is the daily activity score, and again, a much lower uptick. One of the nice things here is that by estimating the standard error of measurement, we can gauge how much credibility to put in the trait estimate; having a confidence interval is useful. Another nice feature of IRT is that, as I mentioned, you can cherry-pick items. You don't have to use the five that are in a given short form; you can pick items relevant to your population. I want to talk briefly about multidimensional computerized adaptive testing. When multiple traits are correlated, as they often are for function, we're able to simultaneously estimate multiple traits. For example, we've developed what's called the FAMCAT to estimate basic mobility, daily activities, and applied cognition in hospital-based patients. We started with the AM-PAC bank, but because these things are very correlated, we developed a multidimensional test to simultaneously estimate them. So for example, if I ask you how much difficulty you have bending over to pick up a piece of clothing without holding on to anything, I learn a little bit about basic mobility, your function, and daily activities, but not a lot about applied cognition. Then I ask, can you use a remote control? And I learn quite a bit more about applied cognition and more about daily activities. And can you get up from a low, soft couch? So with each item, the CAT algorithm is basing item selection not on one domain but on three domains; it's trying to find a sweet spot where, for these three domains, we ask the lowest number of items while getting to a critical level of precision. There are additional benefits because they're computerized: you can use branching, cascading logic; you can embed these computerized assessments in the electronic health record and use the scores to drive clinical decision support and to automate follow-up assessments. You can also leverage collateral test-taking information: we demonstrated that if you capture response times, that actually can be used to speed up and facilitate estimation of cognitive functioning. And there's another functionality called the adaptive measure of change, where if a test knows the items that were administered during a prior computerized adaptive testing session, it can use that prior insight to radically shorten the duration of subsequent assessments. We estimated this to be over a 50% reduction among hospitalized patients.
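To make the select-administer-update loop described above concrete, here is a minimal sketch in Python. It assumes a toy, pre-calibrated 2PL item bank; the item texts, parameter values, and function names are illustrative, not from any production CAT engine, and a real engine would re-estimate theta by maximum likelihood or EAP over all responses at every step.

```python
import math

# Toy pre-calibrated 2PL item bank: a = discrimination, b = difficulty.
# Texts and parameter values are illustrative only.
ITEM_BANK = [
    {"text": "Can you walk a mile without stopping to rest?", "a": 1.8, "b": 1.2},
    {"text": "Can you climb a flight of stairs?",             "a": 1.5, "b": 0.1},
    {"text": "Can you get up from a low, soft couch?",        "a": 1.3, "b": -0.4},
    {"text": "Can you turn over in bed?",                     "a": 1.6, "b": -1.5},
]

def p_endorse(theta, item):
    """2PL probability of endorsing the item at trait level theta."""
    return 1.0 / (1.0 + math.exp(-item["a"] * (theta - item["b"])))

def information(theta, item):
    """Fisher information of a 2PL item at theta: a^2 * P * (1 - P)."""
    p = p_endorse(theta, item)
    return item["a"] ** 2 * p * (1.0 - p)

def run_cat(answer_fn, se_target=0.40, max_items=10):
    """Administer items until the standard error of the theta estimate
    drops below se_target, or we run out of items or budget."""
    theta, asked, se = 0.0, [], float("inf")
    while len(asked) < max_items:
        remaining = [i for i in ITEM_BANK if i not in asked]
        if not remaining:
            break
        # Select the next most informative item at the current estimate.
        item = max(remaining, key=lambda i: information(theta, i))
        x = answer_fn(item["text"])  # 1 = endorses ("yes"), 0 = does not
        asked.append(item)
        # Crude single Newton-style update from the newest response only.
        # (Real engines refit theta over ALL responses each step.)
        total_info = sum(information(theta, i) for i in asked)
        theta += item["a"] * (x - p_endorse(theta, item)) / total_info
        se = 1.0 / math.sqrt(total_info)  # the "confidence interval"
        if se <= se_target:
            break
    return theta, se

# Example: answer interactively at the console.
# theta, se = run_cat(lambda q: 1 if input(q + " [y/n] ") == "y" else 0)
```

Note how the loop reproduces the artillery-targeting behavior: endorsing a hard item pushes the estimate up, failing an easy one pushes it down, and the standard error shrinks as information accumulates until the stopping rule is met.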
So, quite useful. Very briefly, what is PROMIS? It's a collection of item banks used to evaluate and monitor physical, mental, and social health. It's condition-agnostic, it's all calibrated using state-of-the-art psychometric methods, e.g. IRT, and translations are available in Spanish. I would encourage you, if you're interested, to go to the HealthMeasures website and look around; it's a very rich site. PROMIS offers short forms, computerized adaptive tests, as well as suggested profiles, which are collections of multiple domains to assess populations. And this gives you an idea of some of the domains in physical, mental, and social health. So, which measures should I use? I'll try to make this quick and echo Sarah's words: it really depends on why you're collecting the data. What's the context? How much respondent burden will be tolerated? Do you need easy scoring because a computer isn't available and busy clinicians are using it? Is it easy to interpret? Does it perform well in your target populations? Is it available in your EHR? Is it proprietary, and do you have to pay to use it? And can you equate it with other measures? The PROsetta Stone website up here is a crosswalking site where you can interconvert PROMIS scores with legacy measure scores; so, are there crosswalks estimated? If you're screening for rehab referrals, to try to identify patients rehab can help, logistically you want very quick items, simple scoring, and something very discriminative at the cut point, and here are some candidate tools. If you're triaging post-referral to a physician or a therapist, you can certainly combine with other scales, but it also needs to be quick and very discriminative at the cut point, because ultimately you have to pick a cut score, presumably above or below which the patient goes to a physician, and here are some other options. For individualizing treatment, this is where breadth of assessment, how many domains you need to assess to individualize your treatment, plays at odds with precision, because you very quickly hit respondent-burden limitations. You want to make sure that you're screening across the traits relevant to your clinic, and here are some candidate tools. And last, if you're trying to assess the response to your treatment, many of the same logistic concerns hold true, but the key psychometric requirement is a responsive instrument. So, just to close: response rates are everything in PROMs. In real estate they say location, location, location; well, in PROMs, it's response rates. If patients don't answer your questionnaire, you have no data. And so, evidence-based strategies to boost response rates: salience, meaning patients need to feel that this is relevant to them, to things they care about; it should be brief; and clinicians need to use and acknowledge the results. The number one reason patients fail to continue reporting is that clinicians never mention it, and even automated acknowledgements are helpful. It's important to note that now, with EHR-based assessments, we're able to assign these longitudinally, but what we find is that response rates fall off precipitously with remote assessments. It's not hard to get over 90% response rates at the point of care, encounter-linked; it's very hard, actually, to get higher than 30% on remote assessments. So I thank you very much for your kind attention and hope this was helpful.

All right. Thank you, Andrea. That was outstanding.
I'm going to try to bring us home here, talking about a somewhat pragmatic approach to patient-reported outcome measures in the EHR, so that was a perfect segue: why it matters, and how best to do it, as far as we know. Oh, and I have no disclosures; that'll come back later. When this was going to be in person, I was going to have some patient-reported outcomes that we could fill out on our phones or computers. For the sake of being virtual, I'm just going to scroll through an example of a PROMIS item set. This is the physical function set for adults using mobility aids, so this is an outcome measure you might use in a spinal cord injury clinic, a neuro rehab clinic, somewhere you see a lot of geriatric folks, those types of things. So I want you to look at the screen, read this over, and write down your answers on a piece of paper... wait, there's this page too. Okay. Sure, everyone's done, right? So imagine you're a patient in a waiting room. You're done. Okay, now you've got this page, so maybe fill that out. Okay. And surely you're all done by now. There's more. Okay, you get it; this is all that same measure. And I'm being a little bit deceptive, in that that was a computerized adaptive test bank. But the point is that the computerized adaptive test, which is a form of electronic PRO, or ePRO, is a way, as Andrea I thought beautifully pointed out, to condense a lot of information into just a few questions and give you a reliable score. If you were to give that test on paper, you would give a different short form, but it would still be long; it would have taken a really long time, and patients would have thrown it in the trash before seeing you. But if they had a way to do that electronically and only had to answer six or seven questions, and you got a really good answer from it, it would be useful. And so the better way to do this is the electronic patient-reported outcome when possible, because it reduces respondent burden and has a lot of direct benefits for patients. When patients see this electronically, not just the CAT but even just a short form that they're filling out on their computer before their virtual visit with you, or the day before their visit, or on a tablet in the waiting room, it's been shown to help patients remember to ask their doctors about symptoms later, which has implications for quality of care. And patients, when asked, overwhelmingly recommended an electronic PRO system over just filling it out with paper and pencil. Part of that probably has to do with what Andrea also talked about, the sense that nothing's going to happen with this, that they're just going to throw the paper in the trash; but if it's on the computer, then maybe it gets into the note, maybe it's something that can pop up and alert the physician of what to do. So those are the sort of patient-reported benefits, but it has also been borne out in real life that when you have an electronic framework by which you can ask patients these questions, it's quite helpful. And so there was one study that looked at survival. This was in cancer patients, so not a rehab population, not cancer rehab.
And they found that when patients had an outlet to report their symptoms, and somebody on the other end, in this case a nurse, could see when the scores reached a threshold of danger, that served as an earlier warning system, and healthcare could be triggered much more quickly. Patients themselves were more engaged in their treatment and their symptoms. And the patients in this trial, out of Memorial Sloan Kettering, actually had a better survival rate from using electronic PROs. Another study in the oncology world, again not cancer rehabilitation: this was at my institution, where they looked at the number of patients coming to the emergency room with nausea, and they wanted to reduce that utilization. They identified the chemotherapy regimens that were the big offenders and said, okay, if patients are receiving this regimen, there's a high likelihood they'll be nauseous to the point where they come to the emergency room. And so they created a patient-reported outcome system where patients would report, on a validated scale, if their nausea was getting worse. On the other end of that was a pharmacist who could ask, okay, did you take your anti-emetics? Here are the ones to take, and in this order, and they would call the patient. So again, the patient knew that this was actual information that would be acted upon, and it led to fewer emergency department visits for preventable chemotherapy symptoms, in this case nausea. It was statistically significant, and it's something that we still use. Extrapolating that to rehab is not terribly difficult, but compare it to paper. This was a study I was a part of called Impact the Brain. I didn't name it; a medical oncologist did. It sounds like a pretty bad PM&R trial name, because we deal with concussion, but it wasn't up to me. Either way, we had a few different patient-reported outcomes: a PROMIS measure of general function; a symptom assessment inventory, the MD Anderson Symptom Inventory for brain tumor (MDASI-BT); as well as a caregiver outcome measure that Sarah had included on one of her lists, the Zarit Burden Interview. And our response rate was terrible. These were given out at clinic, right? So it's not like we were just mailing it to patients and hoping they'd mail it back; this was paper at clinic, and they were ignoring it, throwing it in the trash. And the whole point of the Impact the Brain trial was to have a nurse coordinator coordinating symptom management in patients, and we still had a poor, poor, poor response rate. Contrast that with a head and neck cancer study of electronic PRO symptom monitoring, which found that almost 98% of patients would respond when it was at the point of care. At subsequent visits they still had an enormously good response rate, approaching a hundred percent, although there were far fewer visits further out. What they also did with this data is they found they could actually monitor for disease recurrence based on patient-reported symptoms: if the symptoms were getting worse, the patients had a higher chance of disease recurrence, and they got screened. And now there are some programs, including ours, that are using PROs almost strictly for long-term screening in patients; obviously, for patients just out of treatment, they wouldn't be used that way.
But that's the power of the patient-reported outcome: now cancer care and survival are going to depend on it. And so it depends on implementing this well, and that's where electronic PROs come in. One of the questions I get asked is, do you get the same answers if they fill it out on paper? Is there something different about the computer; are they paying attention to it more or less? And the answer seems to be no. There have been studies about mode effects of paper versus electronic; a few meta-analyses found there was no difference in the actual score you get, but you probably do get the score more frequently when it's integrated into the electronic health record. And that has some major implications for healthcare quality. And so payers are now taking note of this, right? We know that we're living in a shifting world of quality-over-quantity reimbursement. And as both Dr. Park and Dr. Cheville pointed out, PROs are cheap: compared to bringing patients in for a lab or seeing them in person, which is expensive, they can answer something online. So they're easy to obtain, and they're going to be used. They're also multifaceted: if you're doing an intervention for knee osteoarthritis, you could also get some information about whether your intervention improved the patient's depression or anxiety, right? Did it improve their mood? And so this is where everything's shifting, and if you want to get the highest reimbursement, or show the highest value of your care, patient-reported outcomes are integral to the bigger framework. I do need to talk about some considerations, though; it's not all, you know, roses with electronic health records. I think it's great, but I'm also under the assumption that a lot of the people listening right now don't have a lot of ePROs integrated into their clinic seamlessly, and there's a reason for that. That's the downside to this, and the biggest is that there is a high startup burden. You need software, and you need software that can run the outcome measure you want. So if you want to ask, say, the ODI for low back pain, that needs to somehow be integrated into whatever electronic health record you use, and that might take some money. You might need some tablets for people to fill it out in the waiting room, or at least the infrastructure to push it to patients, and that requires that your patients have that technology and be savvy enough to use it. You need IT bandwidth to do this: if your institution is anything like mine, IT folks are hard to come by because of the past year and a half, two years, and they're still up to their necks in more urgent issues that have to do with COVID, et cetera. So getting a rehabilitation PRO integrated could take some time, and you've got to be really patient with it; hopefully that changes. You also want to coordinate what you're doing, not necessarily with providers outside of your practice, but within your own practice: you need to coordinate who is getting the patient-reported outcome and make sure you're not doing the same thing as, say, a surgeon you work with. And I need to reiterate, I have no conflict of interest; I don't care which EMR you use, and I don't know which one is the best.
And this was a picture off of Google; I didn't make it, so if I forgot one, I'm sorry. In looking at the patient burden of PROs: the longer the PRO, the less they fill it out. This is not surprising, but there's evidence to support it. So you want something brief but actionable. Patients hate it if it's too long, and, as Andrea correctly pointed out, they also don't like it if they don't think it's going to be used: I've filled this out a million times and nobody's asked me about it. I think we've been burned by review of systems and other things we make patients fill out; patients will check boxes and we never ask them about, I don't know, their incontinence or their sleep or whatever, even though they're indicating that they have a problem with it. I think that's eroded some of the faith patients have in filling things out, and I hope that changes. But you need to make it known to the patient that you're actually acting on this; in the studies I showed you previously with the high response rates, that was a central tenet. And going back to what Sarah had said, there are also issues with tech, and certainly with non-English speakers: you have to make sure that your PRO is useful for folks who don't speak English or for whom English is not their first language. So that's the why and some considerations, and I'll close, before time for questions, with the how. I'll say that there's not a unified theory about how to get this in; I'm going to give you some considerations to take home, and if this generates discussion in the chat, all the better. There's no clear guidance, to my knowledge, on what you should do to put these in your electronic health record, but I do think it's going to get easier as the years go on, and systems are taking note, if for no other reason than that payers are pushing it. So this is, you know, Plan-Do-Check-Act; I took out Check for the sake of brevity. We're going to go through the planning, the doing, and the acting on the PROMs, and what this looks like in the ePRO world. For the Plan section, it sounds really superfluous, but: what do you want to know? You should be thinking about this. Are you looking at function? Are you looking at pain, cognition, fatigue? Those are all things that I might address in my clinic, and those are all things that might be addressed in an SCI clinic or a musculoskeletal clinic, you name it. Do you want to know about bowel and bladder, or that sort of thing? Then see what's out there and figure out which measures you want to use. And not only that, but you want to know what already exists in your institution's EHR that maybe you can pluck, so it doesn't have to be built over several months. You also want to know things like what scores you should look for. So if you find a measure, there might be something called a minimal clinically important difference. If they score two points more, that means, okay, they're two points better or worse; but if it's a change of one, maybe it doesn't actually matter.
That may just be normal variation, and those are the things you want to know as you're thinking about what you're going to incorporate. You also want to know what others are collecting. Remember, patients don't like redundancy; I wouldn't like it. So if the neurosurgeon down the hall is asking the same disability index as you, that's duplicating work, that's redundant, and patients aren't going to have faith that this is actually helpful. So don't repeat; maybe try to complement, and know what others are doing. On the Do side of things, you want to actually get your answers in. Pushing the questionnaires to patients prior to clinic is important, so if you can do that electronically, that's great. A lot of patients are registering through the system, the framework that's in place depending on your EHR, and filling out: yes, these are my meds; yes, I still have the same allergies, blah, blah, blah; this is my pharmacy. They can also fill out these questions there. If they can't do that, or if they're not doing that, then try to allow time to complete this in the waiting room. If you can get tablets, or if there's a kiosk they can fill it out at, that's great. If they can fill it out on paper right there and you translate it into the electronic record later, that's second best, but better than nothing. If your MAs have time, if it's not a terribly busy day in clinic, maybe they can help a patient fill it out while rooming them. And finally, incorporating it into virtual visits is a slam dunk, because patients have to log on through some kind of infrastructure where they typically go through: this is my pharmacy, this is everything up to date; and you can say, you've got to fill these out before the visit can start. My virtual visits, which are about 20% of my outpatient volume, have a hundred percent response rate, which is not insignificant when I'm able to check scores and also look at bigger-picture items. Acting on this means you've got to do something with the data. I've kind of talked about this, but ways to do that are, for example, to get it to auto-populate in your notes, to have a way that the scores get put in there so you can actually look. In mine, I have a symptom assessment score, a bunch of zero-through-tens, and I also have a PROMIS-based assessment where I get what's called a T-score, which shows how well or poorly a patient is doing compared to the population mean. If their T-score is 50, they're at the population mean. If their T-score is 75 for something, that means they're above the population mean, by a couple of standard deviations, actually. If it's physical function, that's great; that means they're a marathon runner. If it's fatigue or pain, that means they have more fatigue or more pain than the population mean, by a lot, and I need to do something. And I don't have to go through every individual item and read it; I just have to look at a number, a T-score, and say, ooh, your fatigue is really high, let's talk about that. It makes it a lot easier for me: I don't have to look at a piece of paper and run through what they've checked off. I see a number, and it's right there.
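As a small illustration of the arithmetic behind those statements: the T-score convention of mean 50 and standard deviation 10 is how PROMIS scores are reported, but the flagging threshold and the MCID value in this sketch are illustrative assumptions, not universal rules; check your instrument's published values.

```python
def to_t_score(theta):
    """Convert an IRT theta estimate to the PROMIS T-score metric:
    mean 50, standard deviation 10 in the reference population."""
    return 50.0 + 10.0 * theta

def worth_flagging(t_score, higher_is_worse, sd_cutoff=1.0):
    """Flag scores more than sd_cutoff SDs worse than the population mean.
    For symptom domains like fatigue or pain, higher is worse; for
    function domains, lower is worse. The cutoff is an assumption."""
    if higher_is_worse:
        return t_score >= 50.0 + 10.0 * sd_cutoff
    return t_score <= 50.0 - 10.0 * sd_cutoff

def meaningful_change(t_now, t_before, mcid=5.0):
    """Compare a score change to a minimal clinically important
    difference (MCID). Half an SD (5 T-score points) is a common rough
    value, but the published MCID varies by instrument and population."""
    return abs(t_now - t_before) >= mcid

# A fatigue T-score of 75 is 2.5 SDs above the mean: flag it.
assert worth_flagging(75, higher_is_worse=True)
# A 1-point change is probably noise; a 6-point change likely matters.
assert not meaningful_change(51, 50) and meaningful_change(56, 50)
```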
And I think it's really reinforced some faith in my patients that this stuff matters. Beyond that, you can put it into a larger data set and run QI projects, and if research is your thing, you could potentially run research off of it as you start to collect bigger and bigger numbers. I think it's something that can be useful, especially as we think about health systems, like a learning health system, where you're taking certain data elements, which could be patient-reported but could also be things like BMI, smoking history, emergency room utilization, whatever; you kind of need that patient piece within it to drive it home, so this isn't just ones and zeros that you're reading. So there's sort of a moral case for this, and I think that if you're going to be running a lot of QI work, patient-reported outcomes are essential, and putting them in the electronic record is, I think, one of the best ways to do it. So with that, we have some time for questions for any of us. I'm going to stop my screen share and open up the chat. And if nobody has questions, we could certainly talk about some important topics, but any questions, please feel free to put them in the chat.

I'd just like to thank everybody for coming. We've drunk the Kool-Aid, and we feel you should too. This isn't always the most scintillating topic, so we very, very much appreciate your time and attention.

Especially on a Saturday afternoon. And not in Nashville, too, unless someone here is from Vanderbilt. You know, I wonder, Andrea, you mentioned some of the choices: what do you choose? That's an incredibly common question. What do you tell somebody? Are there certain go-tos or fallbacks that you recommend that cover a broad scope of our field, or ones that you've found to be more helpful? And Sarah too, if you're using certain ones in your practice.

You know, I guess one of the things that I think is absolutely critical for us in physiatry is to start speaking a language of function that other disciplines understand. In many cases we use different instruments; we don't have shared taxonomies and vocabularies, and this is a problem, because it means it's harder to help other disciplines understand what we do and what we can do for their patients. And so one of the things I increasingly do, recognizing that the perfect PROM does not exist yet, is try to find things that will resonate and be easily accessible to other disciplines. So we're using PROMIS. We actually use just a numerical rating scale in the oncology clinics for screening, and then we have used branching logic in Epic: if someone has a problematic level of a symptom, we then can branch to a domain-specific multi-item instrument. But it's just one of many attempts to keep it simple and think about your audience, our stakeholders, our patients: keep it simple and relevant for them and for other disciplines, and make sure they understand what the data mean and why we're collecting them. Sarah?

I actually just wanted to echo that.
I really think about these as kind of like the vital signs of rehabilitation medicine, or lab values, because we treat based on patient-reported everything; we're hoping to improve their quality of life and function, and that is really subjective in a lot of ways. And so in order to advance our field, both from a research standpoint and from an interdisciplinary clinical standpoint, you're right that we need a common, accessible form of measurement, to be able to track that what we're doing is making an impact, and a reliable way to do so, just as other disciplines have, like I mentioned, vital signs or lab values to show that they're improving things. For us, that is really patient-reported outcomes, in my opinion.

You know, just to add one thought, and we've got some questions in the chat, but one of the things that I've only lately appreciated is that other disciplines have to sell us to their patients. There aren't a whole lot of patients who say, oh great, yay, I want to go to rehab. The more we can empower other providers to demonstrate for the patient, look, based on these measurements it sure looks like other patients like you whom I've sent to rehab were helped: again, those common constructs, making it resonate with other providers, are very important. Sean, do you want to take some of the chat questions?

Yeah. The first one was from Will Carter. He asked, what is the ease of modifying after choosing certain questions? Well, for PROMIS, basically, you can take certain questions and put five of them together and make your own questionnaire, essentially; it's called a short form. And if you find that one item isn't very helpful, you can take it out and put in another. I don't know if you meant with a computerized adaptive test; if a patient needed to correct an answer, I suppose they would just answer the next question accurately. But you can't mix and match some of those older legacy items, so you'd have to go all or none with something like that.

You can program the CAT to do whatever you want. You can program the CAT so that it can back up; sometimes people put in a no-backup parameter for time's sake. But because it's computerized, if that was the question, you can do whatever makes sense for your population.

We also got asked: is there a single functional measure going from the most basic, like bed mobility, to the highest, like playing a piano concerto? A CAT can accomplish that; that would probably be the best way, I think. Or if you wanted to make a physical-function type of short form with PROMIS, that could also work.

That's exactly what a CAT is designed to do. If you tell me you can walk a mile without stopping to rest, I'm not going to ask you about going up a stair, and I probably don't even need to ask you about transferring out of a chair. That's what it's designed to do.
And it can, if the item bank is large enough. For a short form, you'd need to include items spanning the full trait range on the instrument, but it would mean less precise measurement for each individual.

Yeah. The next question: do you use PROMIS data to demonstrate value to your administration or leadership, you know, payers, or is it for clinical use? And the follow-up question, I think very much related: is there a visual way to track the PRO scores within the EHR?

I can speak to what I do. We're probably four or five months into using this, so I haven't used big data yet to say, this is what we have accomplished, look at how much better our patients got when they saw us; but it's another notch in our belt, so to speak. And the PROMIS tool I use has been shown to be responsive: when patients actually get better, the scores get better, and we know that will show up. So if we find that our patients are getting better seeing us, that will help. And you can graph it and have a display, like here are the last three or five readings the patient filled out; you can customize it like that working with your IT. Have either of you used it in practice like that?

Synopsis is the Epic functionality that typically, in foundation systems, will allow you to look longitudinally and track scores. Timeline is a newer functionality in Epic that lets you temporally cross-reference with referrals to specialists, starting medications, surgeries, procedures. So it's kind of cool, because you can see, oh, that epidural injection or that knee injection, what was its impact on these domains over time? So for Epic users, Timeline is a new, improved iteration of Synopsis. It was developed for inpatient, but now it's available for outpatient too.

Yeah, I do find that to be very helpful, especially in looking at the correlation between medications and these symptom trackers.

You can use it like that? No, I don't use it like that, but I should.

You need a driver's manual; you'll be shocked to discover it's not the most user-friendly functionality. But one of the interesting things with PROM data at our institution: it was clear that we were able to parse out that patients were doing better after lower joint replacements, especially redos, when they were referred to rehabilitation. And so that has actually influenced practice.

Yeah, this stuff matters, and it's only going to matter more. So we're at time. Our email addresses are up; please feel free to reach out. Thank you everybody for listening, and especially to my co-presenters, Dr. Park and Dr. Cheville. It's a pleasure as always, and I hope everybody has had a good virtual conference and enjoys the rest of it. So thank you all. Thank you guys.
Video Summary
The video discusses the importance of patient-reported outcome measures (PROMs) and their integration into electronic health records (EHRs). PROMs reflect the impact of healthcare services on a patient's health status, and they are important in assessing the effectiveness of interventions and aiding in clinical decision-making. The video emphasizes the benefits of collecting PROMs electronically, such as reducing respondent burden, improving data accuracy, and enabling real-time monitoring of symptoms. Integrating PROMs into EHRs allows for better coordination of care and improved patient engagement. The use of computer adaptive testing (CAT) in PROMs is highlighted as a way to condense the number of questions and provide reliable scores. The video also discusses the challenges of implementing PROMs in EHRs, such as the initial start-up burden, technological considerations, and the need to coordinate with other providers. Overall, the use of PROMs in EHRs is seen as valuable for improving the quality and value of healthcare services, and for empowering patients to actively participate in their care.
Keywords
PROMs
EHRs
healthcare services
health status
interventions
clinical decision-making
symptoms
patient engagement
CAT
patient empowerment