Scott Marion, Center for Assessment
Scott Marion spoke to Emily Freitag about assessment strategies in the aftermath of disrupted schooling. Scott Marion is the executive director of the Center for Assessment, an organization that partners with state and district leaders to design, implement, and evaluate assessment and accountability policies and programs.
Watch the full conversation or read the abridged Q&A below.
EF: What are you hearing from school leaders and thinking about when it comes to assessment?
SM: Many of our assessment worries are not new. At my organization, the Center for Assessment, we talk about assessment literacy. It’s a never-ending battle. And we haven’t made enough progress. If we had been in a great position with assessment literacy as a field before this, we’d be able to manage what we have to face next year a lot better. And what I mean by that is really high-quality formative assessment processes that are so inseparable from instruction that you couldn’t tell them apart from a good instructional moment, a scaffolding opportunity, or just high-quality feedback. Most teachers are not very skilled at that. It’s not their fault. They don’t have the practice and don’t have the training.
What we see is an over-reliance on commercial products that are used because we don’t trust teachers’ judgment about what kids know and are able to do. So districts employ things labeled “interim assessments,” and those are often administered two to three times a year with some purported instructional use. I don’t understand how it could be instructionally useful if it only happens two or three times a year, because you just can’t group the kids in the fall, and then leave them in those groups until the winter.
We’re facing a challenge. In some ways, it’s like New Orleans. They’re below sea level and the oil industry has destroyed the marshes, so when a hurricane comes, they’re so much more susceptible to flooding. So if the seagrasses hadn’t been attacked, if the city was built a little differently, they’d be able to withstand hurricanes a lot better, right? And so COVID is like our hurricane in a way. And it makes it that much harder for us to withstand this, because there is this pressure coming from lots of folks, and I think everybody is well-meaning, but it’s this pressure that we need these diagnostic assessments in the fall. And they’re talking about them like their state should give them or the district should give them.
For me, when I think about a diagnosis, it should be accompanied by a prescription. Otherwise, it’s just, “You’ve got two weeks to live, good luck.” That’d be a bad diagnosis, because there is no treatment. But in most cases, you say, “You know what, you have high blood pressure. You could try this in terms of diet or this in terms of exercise, or this in terms of medicine to lower your blood pressure.” It’s specific to you, depending on your age, your weight, and how high your blood pressure is. But if I just say, “We’re giving everybody blood pressure medication,” that doesn’t make any sense. Our prescriptions need to fit our diagnosis. And in almost all cases, our diagnoses should be specific to individuals, or at least small groups of individuals, to small groups of kids. And that’s what we see as the de-skilling of teachers as we say, “You don’t know enough to figure out what kids are ready to learn. So we’re going to make you give this test.”
EF: Is there an appropriate use for diagnostics in education, say in the context of diagnosing for dyslexia? And what do those assessments actually look like? What does it entail to diagnose fully, and why are interims not diagnostic?
SM: In most cases in our field, it’s diagnosing for certain specific learning disabilities in case of special education. You see it with early literacy a lot. You have these close-up assessments to see why a kid is either struggling to read or how well they’re reading. And it’s almost always one-on-one, and if it’s not one-on-one, it might be a screening assessment that then leads to a one-on-one assessment.
I think the way it’s being approached now is the illness, if you will, is learning loss. So we want to somehow gauge how much kids have lost. But we’re not measuring that in any kind of specific way. I don’t know if you’ve seen the NWEA paper that says kids will come into—it doesn’t even make sense on the face of it—but it says kids will come into the year having learned 37-50% of what they would have learned in a normal year. It’s like, wait, we just had them there in school for three quarters of the school year. So you’re saying they learn two-thirds of what they’re supposed to learn in the last nine or 10 weeks of school? I don’t think so.
The point is that somehow I know that you learned less. I don’t know how much less. I just know you learned less. And now, I’m just going to make sure you learn more. But what do you do? Do you just teach harder, teach faster? It doesn’t tell you anything content-specific. And that’s the key thing we’re pushing with our friends at Student Achievement Partners and CCSSO and others: I need to know, for Emily, what does she need to know to start the first major instructional unit of the year? I don’t have to go back and reteach all of fourth grade to make sure she’s ready to move forward into fifth grade. I need to know how well she’s able to deal with fractions: adding and subtracting fractions, and perhaps multiplying some fractions. I need to know that, and as I move into the first unit, I’ll find out. I need to figure out a way to supplement some of Emily’s learning from what she would have learned last year, or what she might typically have just lost over the summer anyway.
The reason why I focus on the first unit, or the first and second unit, is because if I give you an assessment of all of fourth grade at the beginning of fifth grade, there’s going to be a lot you’ve forgotten, right? But once you get into the rhythm of learning in fifth grade, I’m finding out the stuff you needed to know. So by the time I get to a third unit, the stuff I might have thought you were missing you’ve already picked up in the first two units. And that’s why these larger-scale things just don’t make sense to me.
EF: Particularly if the resulting action actually then damages identity or learning opportunities.
SM: Right, if you say, “You know, instead of being in fifth grade, you’re in fourth and a half. We’re going to do a special track.” That’d be a disaster.
EF: I do think predictive assessments are something leaders crave, and I understand why they crave them: so that they can know which teachers need more support. Is there any role predictive assessments should play? Is there a role for saying, two months before the state test, let’s do a full-length practice test, or do you think even that is damaging?
SM: It’s not damaging. It’s just a waste of time. First of all, everybody talks about these predictive tests. All these tests correlate within the subject, and within kids pretty highly. If I give any flavor of interim assessment, or even a decent classroom assessment in fourth grade, I could have a decent shot at predicting the end of year fourth-grade test. It’s all based on correlation.
EF: So your instructional assessments can also just be predictive.
SM: Absolutely. It’s like how high school GPA is just as good a predictor of first-year GPA in college as the ACT or SAT. So for the predictive stuff, when state chiefs and others say, “I want to do an assessment in the fall on a large scale to see where kids are at and see how much they lost,” I say, “Do you have resources to direct differently if you find that the kids are much lower here than over there? If you don’t, then it’s just an academic exercise.”
EF: Right. “Interesting” is not actionable.
Let’s go into the test prep notion. I do think there’s a strong sense of “I want kids to practice with the format. I want kids to understand how it feels.” Is there validity in those instincts?
SM: If you want kids to do well on the state test, especially the younger kids who haven’t seen the format, and if it requires kids to manipulate certain things, like grid-response answers, that aren’t part of your regular instructional program… then sure, practice. But it requires way less practice than people think. My advisor in grad school, still a good friend, Lorrie Shepard, talks about this thousand-mini-lessons problem. Say I give a 50-item test to 50 kids. You can imagine this matrix, a 50×50. And now I’m going to remediate every kid on every item they miss? That’s not a good instructional program. So there is something to be said for just learning the format of the test. It removes what we call “construct-irrelevant variance.” If you’re not doing well on a test because you don’t know how to deal with a gridded response, that’s low-hanging fruit.
EF: It feels like we spend 80% of our energy on that as opposed to on content mastery. Knowing grid responses but not knowing the answers isn’t helpful.
SM: That’s right. You’ll know the answer to that question you practiced only if it’s then used to generalize to the larger concept.
EF: What do you think an appropriate assessment response to the interrupted schooling could look like?
SM: I think parsimony is going to be key; it’s going to be demanded. And I’m okay with a relatively light-touch interim, and I actually prefer the ones that are tied to the state assessment system. A lot of states have their own interims that are connected to the state assessment scale. It gives you a different kind of look, and you could actually connect that to levels of proficiency or things like that. But by “very light touch,” I’m talking one class period for ELA and one class period for math. That will give you an overall picture of how big the gaps are, in general, how much less kids have learned. But that should be it.
I think where people should be spending the resources—and this is a challenge, because it’s a resource challenge. If you do have a high-quality curriculum, it likely comes with some sort of assessment support, right? And so then you employ those unit assessments. If you don’t have a high-quality curriculum, then in some ways you have a bigger problem than assessment could solve. It sounds funny that the guy from the Center for Assessment is saying this is not an assessment problem, but it’s not. It’s a curriculum, instruction, and school organization problem that needs to be solved.
I’ve been pushing this notion of hypothesis testing because we can’t wait until September 1 to say, “Oh, crap, Emily doesn’t know anything.” I have to think now, based on the information I have from this spring, however limited it is: how many kids were engaged, how many kids completely fell off the radar, and, if you’re collecting anything that looks like attendance, how well they attended. Then you can use that information to start making predictions. Let’s say NWEA is right, let’s say my kids are going to come in a third further behind than they would be, that our achievement gaps have increased by two-tenths of a standard deviation. Then what am I going to do? How am I going to organize my school differently? Am I going to bring in more paraprofessionals? Am I going to create remediation blocks for certain kids? Am I going to make it seven math classes a week instead of five?
EF: So it’s like scenario planning for your classrooms. I’ve heard a lot about scenario planning from the context of the virus and the impact on schools. What I’m hearing you describe is planning for student readiness scenarios.
SM: Right, because you should actually have some a priori ideas about how the assessment results are going to come out. That would help with efficiency, instead of just saying, “All right, we got these assessment results, now what?” at a point when it’s too late because the school year has started. This planning has to be going on immediately. We’re already late.
So if I had my way, we’re doing hypotheses and that very light-touch, large-scale assessment. And if the state is going to do it, the district shouldn’t do one too, because then it’s just too much. And if the state makes theirs optional, the district should think carefully about what information they care about, because there are two things you could look at. If I’ve been using NWEA for 10 years and I give it in the fall, relative to where my kids were the past 10 years in the fall, I could see how much further back they are now, how much lower they’re performing. If I’m using a state one that’s linked to the summative assessment, I could see how my kids are performing relative to state proficiency. So those are two different questions. That’s a value decision and a utility decision as you’re thinking about what you’re going to use. But they shouldn’t do both. That’s very clear. Very parsimonious.
EF: My four-year-old talks about “the test” all the time right now, and he is not talking about school tests. He is talking about the virus test and he is kind of obsessed with the test and the idea of the test. Even the trauma that I think, “Hey, kids, we’re going to take a test today” might cause is so real.
SM: I would focus as much energy as I can on supporting teachers. Whether it’s through the stuff SAP is coming out with on priority content, determining: What’s going to be your first unit of the year? What are you going to do if you have to deliver it in a hybrid situation?
EF: As a general rule, if you do have a high-quality curriculum, would you just start with the first unit as planned for that grade? What is your sense of remediation units?
SM: I wouldn’t do remediation. Now, if the first unit you’d planned is not on the high-priority list of content—and it’s not just SAP; TNTP is doing the same thing, and hopefully they’re aligned—I would start with whichever unit first touches that high-priority content.
EF: So you would go forward, you wouldn’t try to hypothesize what they missed last year that you’d have to cover.
SM: I don’t think you could do it well, especially with the variability. We could predict certain socio-economic and racial gaps that have been seen before, but now we’re going to have gaps even among dual-parent working families. And then there are the kids who checked out for a whole variety of reasons, especially the older kids. Folks are saying, and I believe this, that K–2 literacy is suffering in a bad way. You need to be able to sit and read with kids.
EF: Let’s stay on K–2 for one more minute. Interventions, like policy frameworks for RTI, MTSS, often have an assessment component to them. Is there any insight you can offer on how those universal screeners or progress monitoring tools should factor in?
SM: I think they have a certain purpose now, to screen someone for MTSS. And so you should continue to use those. And I was just reading something before we got on the call that some are worried, and rightfully so, that we might get a whole bunch more kids referred for special ed identification. And that would clearly be a mistake, because they didn’t all of a sudden become cognitively disabled in some way or develop a learning disability. They’re just further behind. You don’t want to see more kids, especially Black and brown kids, identified for special ed. So interventions should be used as they would normally be used. They’re not going to magically be able to do something that they haven’t been able to do before.
If all teachers had a high-quality formative assessment repertoire, we wouldn’t need anything else. But they don’t, and they’re going to need support, and that’s the other place where it’s tricky. It’s on the school organization. It can’t be up to every teacher to figure this out for themselves. Districts need to be pulling teachers in over the summer to be the leaders of their grade level, to be able to support other teachers in the school. Whether that’s developing assessments if the curriculum doesn’t come with them, or deciding which ones to give if it does, because they don’t need to give every one, so who is going to decide?
EF: What does it look like to acknowledge that teachers need assessment support, and yet not deliver that in the form of a product?
SM: That is tricky, because that’s the easy answer. I’ve had a superintendent say to me, “I can’t afford all that professional development for this program we’re running, but I can buy iReady for a lot less money.” And I said, “Did you just hear the words you said?” That is the struggle, because people will think, “Well, I can’t provide this professional development, so I’m going to give them this test.” If it’s not going to tell you anything, you might as well put everything you can into supporting teachers. In this time of budget cuts, we’re asking a lot, but this is the commitment it’s going to take, because schools are going to need this kind of support. I work in Chicago, I work in other places, and I understand this is a tough ask. We have to be able to do this, and it’s harder in these big urban districts. But I don’t see any other way. To Chicago’s credit, they did take on this massive curriculum reform project. I just wish they had started it two years earlier.
EF: I appreciate this conversation. Here are my takeaways: Be very wary of diagnostics. Be generally parsimonious about assessments. Be skeptical if the impulse to buy an interim is just based on “I want to help my teachers,” because it may feel like an easy solution but may actually be a waste. It has to link to instruction—it’s really a curriculum-centered assessment strategy. And it’s all really hard.
SM: You got it.