by Alicia Betz
Student evaluations of teaching effectiveness (SETs) are such a common practice at the end of the semester that many people don’t even bat an eyelash at their use. But are they really that effective? Are they statistically sound? Are they biased? Are they a waste of everyone’s time? We’ll dive into those questions and more in this article.
- The SET experience from a student and teacher perspective
- What the research says about SETs
- A small SET tweak that might make a big difference
- Where to go from here
Student Rating of Teaching Effectiveness (SRTE)… that’s what they were called at Penn State, where I did my undergrad. Many other colleges and universities call them SETs instead.
Everyone knew when it was SRTE time. The professor had to leave the room and designate a student to drop off the forms at an administrator’s office when we were all done. A big portion of our professor’s job evaluation was in our hands and we knew it. At Penn State, SRTEs are used to help determine promotion and tenure decisions, and they’re also part of annual faculty reports.
We basically had two options: fill out some bubbles and get on with our lives as quickly as possible, or spend some time truly thinking about the course as a whole in order to give accurate and meaningful feedback. Of course, there’s a third option, too: love your professor (the class was easy and he cancelled a lot)—highest rating! Hate your professor (you learned a lot, but his class took up too much of your time)—lowest rating!
Do you remember what finals week was like when you were in college? Most students have three things on their minds: passing their exams, making memories with their friends, and getting home for the holidays or for summer. Especially when asked to do it on their own time, the average college student isn’t going to put too much thought into giving thorough and accurate ratings of their professors.
My experience as a student and a teacher
As I moved on and became a high school teacher, I realized even more how flawed SETs can be. My first couple of years teaching, I gave my students my own version of an SET to fill out, just for my own purposes. I quickly realized that at the end of the year, not only did they not care about the evaluations, but I didn’t care about them much, either. I stopped giving them because they were mostly a waste of time. There were a few students here and there who would give meaningful feedback, but for the most part I read through the evaluations once and promptly forgot about them.
What was far more valuable to me as a teacher was formative assessment. Students do have valid input about their teachers and courses, but the most effective things they have to say are about their own learning, not about the learning of next semester’s or next year’s students. They’re also more motivated to give meaningful feedback when it directly impacts them. This is so clear in secondary schools, but its truth carries over to postsecondary schools as well.
Sometimes, I can teach a lesson in first period that goes off without a hitch—the students are engaged in learning, discussing, and asking meaningful questions. Time flies by and none of us even realize the bell is about to ring. Then second period comes along, and the lesson falls flat. I get blank stares, forced engagement, and the lesson is over with 15 minutes left in the period. Nothing changed about me—it was the same lesson and the same content, but I was interacting with a different audience.
If there can be such a great change in how the same lesson resonates with students from one class period to the next, how much are student ratings at the end of the year really going to help me make things better for my students next year? Not to mention what the research says about whether SETs even collect accurate data.
What does the research say?
An enormous body of research has been done on SETs, and perhaps part of the reason this has become such a controversial issue is that different studies have come to vastly different conclusions on the topic.
An article from the American Association of University Professors (AAUP) explained that there are many statistical problems with SETs, such as low response rate and smaller classes being more influenced by outliers than larger classes. These outliers could pull the instructor’s overall score higher or lower, depending on the situation.
AAUP also cited studies that found students rating professors differently based on age, race, gender, their grade in the course, and how much the student liked the professor as a person.
One of the major concerns this raises is a potential slippery slope that makes courses easier across the country. Professors want good SET ratings, and they’re more likely to get them when students like them. Because professors believe students will like them more if their course is easier, they may dumb down their courses a bit or grade more leniently, even if subconsciously. Anecdotally, I’ve seen students show preference for “easy” teachers.
High school teachers have a unique opportunity to learn what students are really thinking, since teenagers often don’t have much of a filter. In my experience, students tend to like (and sometimes take advantage of) teachers whose courses are very easy; they tend to respect and like teachers whose courses are mildly difficult; and they tend to dislike and complain about teachers who really push them, who assign a lot of work, or whose courses are very hard. Sometimes students acknowledge that these teachers are actually really good teachers, but this is usually only after some prompting. When students allow bitterness over a grade or frustration about the amount of time spent on a course to cloud their thoughts of their teacher, it can obviously affect the SET scores.
Many studies have corroborated this phenomenon, and evaluation of the research shows that there is an incentive for professors to grade more leniently and make their courses easier in order to get better SET ratings from students. A report in the Journal of Educational Psychology, however, found the opposite—that students are more likely to give instructors higher scores when courses are more difficult and have a heavy workload.
For almost every study reporting one finding about SETs, there seems to be another study contradicting it, so it’s hard to come to a conclusion about what to do with them.
A Summary of Research and Literature from the IDEA Center shed more light on SETs and tried to make sense of all the research out there. Among other conclusions, it identified the Dr. Fox effect as one flaw of SETs:
“[An experiment] where a professional actor, who delivered a dramatic lecture but with little meaningful content, received high ratings – suggested that student ratings might be influenced more by an instructor’s style of presentation than by the substance of the content.”
The IDEA Center also found that ratings of instructors generally differ depending on the discipline. For example, instructors of humanities courses typically get higher SETs than instructors of math courses.
So SETs have statistical problems and are deeply flawed, but despite those flaws, they do have their merits. According to AAUP, they’re widely used because they are easy and cheap to administer. They also provide data that instructors and administrators can evaluate and use to make changes.
An adjunct professor at a university in Pennsylvania, who wished to remain anonymous, stated that although the administration doesn’t spend time going over her results with her, they are tied to her evaluation. She finds them useful to gauge what students’ impressions of her are, and uses the results to change her instruction to fit the impression she wants to make.
Furthermore, SETs give students a voice regarding their education—an education they are paying a lot of money to receive. With all the conflicting viewpoints and research, what’s an administrator to do?
Formative assessment might be the answer
We hear the terms formative and summative assessment thrown around all the time when we talk about assessing students, but this often isn’t part of the conversation when we’re talking about assessing faculty. As mentioned earlier, formative assessment (both formal and informal) is so much more effective in my high school classroom. As a teacher, you can often hear me say things like “What did you all think about that activity we did yesterday? Do you want to do more activities like that?” Simple, casual questions like that give me insight into how my students learn, and help me improve my instruction.
The adjunct professor also added that students should be required to elaborate. “For example, if a student selects that I was ‘rarely’ prepared for class, they should be required to elaborate. ‘In what way?’” Having this information earlier in the semester would give professors a chance to listen to what students need from them so they can improve.
A commentary by Nancy Bunge for the Chronicle of Higher Education suggests that SETs would be more helpful and effective if they were conducted throughout the semester as formative rather than summative assessment. The IDEA Center echoed a similar sentiment: “Receiving feedback about student ratings administered during the first half of the term was positively related to improving college teaching as measured by student ratings administered at the end of the term.”
With so much conflicting evidence, it’s hard to know what to do about SETs—and while the easiest solution is to not change anything, that would be a disservice to students, instructors, and higher education as a whole. The research suggests that there is a place for SETs, but that they shouldn’t hold as much weight as they do at many schools.
In most schools, SETs need to change, but whether they should be eliminated entirely or simply modified will require close study and evaluation of your own practices. Talk to your students. Talk to your instructors. Re-evaluate the role SETs play in your institution and whether they are effecting meaningful change, or whether they’re simply being used as an ineffective summative assessment of instructors.
About the author:
Alicia Betz earned her bachelor’s in education from Penn State University and her master’s in education from Michigan State University, where she also earned her certificate in online teaching and learning. She is a high school English teacher as well as a professional writer specializing in education. She uses her experience in the classroom both as a teacher and a student to write actionable and authentic pieces for various educational publications. Alicia can be reached at www.saiwriting.com.