BERK'S BLOG: From the keyboard of the "Humor Professor": formative decisions

Monday, May 17, 2010

A BerksNotes® GUIDE TO STUDENT RATING SCORE INTERPRETATION: Overview

APPLICATION TO DIFFERENT FORMS
Although each of you is using a different rating form with different numbers of items and scores, those differences do not matter in score interpretation. Whether you’re using a commercial package, such as IDEA, SIR II, PICES, or CIEQ DU SOLEIL, or a “homegrown scale,” there are only so many score reporting possibilities for any form in Likert-type format. So my suggestions are generic and should be applicable to your form. I encourage you to consult the guidelines or manual for your reporting system for more specific information.

FIVE BASIC CATEGORIES OF RESULTS
There are 5 possible categories of results reported for most student rating forms:

1. anchor distribution of percentages
2. item statistics (mean and/or median)
3. subscale statistics (mean and/or median)
4. total scale statistics (mean and/or median)
5. summary of comments to open-ended questions

Your report form may not provide all of the above, but it should certainly give you at least 2 and 4.
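
For readers who like to see the arithmetic, here is a minimal Python sketch of categories 1 and 2 for a single item. The 5-point anchor labels, the function names, and the 20 ratings are all made up for illustration; substitute whatever anchors your own form uses.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical 5-point anchors (an assumption); use the labels on your own form.
ANCHORS = {
    1: "Strongly Disagree",
    2: "Disagree",
    3: "Neutral",
    4: "Agree",
    5: "Strongly Agree",
}

def anchor_percentages(ratings):
    """Category 1: percentage of students who chose each anchor."""
    counts = Counter(ratings)
    n = len(ratings)
    return {ANCHORS[k]: round(100 * counts.get(k, 0) / n, 1) for k in ANCHORS}

def item_statistics(ratings):
    """Category 2: mean and median for one item."""
    return {"mean": round(mean(ratings), 2), "median": median(ratings)}

# Made-up ratings from 20 students on one item
ratings = [5, 5, 5, 5, 5, 5, 5, 5, 4, 4, 4, 4, 4, 4, 3, 3, 3, 2, 2, 1]

print(anchor_percentages(ratings))
print(item_statistics(ratings))  # mean 3.9, median 4.0
```

In this invented example the few low ratings pull the mean (3.9) below the median (4.0), which is one reason a report form may give you both.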

WHAT DO FACULTY NEED?
That's a lot of information. You could use all of those results; however, 1 and 2, in particular, provide the most valuable diagnostic info for revising teaching or course materials to benefit your next course-load of students. These are called formative decisions about teaching. Category 5 can explain the reasons for the ratings in 1 and 2.

WHAT DO ADMINISTRATORS NEED?
Summative decisions about annual contract renewal, merit pay, or promotion and tenure review by department chairs, associate deans, etc. can be based on 3 and 4 and possibly the global item scores.

This blog series will focus primarily on the faculty needs. My next blog will examine the 1st level of interpretation: ANCHOR-WORLD!

COPYRIGHT © 2010 Ronald A. Berk, LLC

Sunday, October 25, 2009

What Scores Should Be Reported from Student Ratings of Faculty?

Recently, I was involved in a spirited marathon discussion with a bunch of colleagues on technical issues related to student ratings of teaching performance. One big topic was: How do you report results for formative and summative decisions? I thought some of my bloggees might be interested in the options available. These options with report form examples appear in my Thirteen Strategies... book (see Stylus link to right).

In order to answer the question, you don't need to administer multiple rating forms. There are a lot of options with the results from just one form. It is possible to "have your cake.." with one form for both formative and summative decisions up to a point. The trick is how the results are analyzed and reported for each decision maker.

Psychometrically speaking, I recommend the following:
1. A structured scale with 4-6 subscales measuring separate constructs such as Class Organization, Teaching Methods, Evaluation Techniques, and so on. The faculty evaluation lit reports several major constructs based on factor analyses. These core teaching behaviors should be generic enough to apply to most courses and disciplines.
2. A separate section devoted to course-specific items each instructor might want to add should be included. This optional section might contain up to 10 items.
3. One to three global items may be included as well, although the reliabilities of individual items are typically much lower than those of item aggregates, such as subscale or total scale scores.
4. An unstructured section containing 2-5 stimulus questions to which students can comment is also important. Loads of online administrations reveal students spent considerable time typing buckets of comments. Frequently those comments explain the responses to some of the structured item ratings. Both forms of evaluation are valuable and furnish complementary information on teaching performance.
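
To make the reliability point in item 3 concrete, here is a rough Python sketch of coefficient alpha for a set of items, using the standard formula alpha = (k / (k - 1)) * (1 - sum of item variances / variance of total scores). The function name and the ratings are hypothetical. Note that the formula needs at least two items, so it describes an aggregate such as a subscale or total scale, not a single global item.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Coefficient alpha for a list of rows (one row of k item ratings per student)."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha requires at least two items")
    items = list(zip(*responses))  # regroup ratings by item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)

# Made-up ratings: 6 students by 4 items on one hypothetical subscale
responses = [
    [5, 4, 5, 4],
    [4, 4, 4, 3],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [2, 3, 2, 2],
    [4, 4, 5, 4],
]

print(round(cronbach_alpha(responses), 2))
```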

Analysis-wise, the above structure permits results at the following levels:
1. anchor distribution of percentages
2. item statistics such as mean and median (almost all distributions are negatively skewed)
3. subscale statistics
4. total scale statistics
5. summary of comments by stimulus question
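
Levels 3 and 4 can be sketched in a few lines of Python. The subscale names come from the constructs mentioned above, but the item-to-subscale mapping, the choice to sum (rather than average) item ratings, and the four students' ratings are assumptions made only for illustration.

```python
from statistics import mean, median

# Hypothetical mapping of item numbers to subscales (an assumption, not a real form)
SUBSCALES = {
    "Class Organization": [1, 2],
    "Teaching Methods": [3, 4],
    "Evaluation Techniques": [5, 6],
}

def subscale_scores(student):
    """Sum one student's item ratings within each subscale."""
    return {name: sum(student[i] for i in items) for name, items in SUBSCALES.items()}

def summarize(students):
    """Levels 3 and 4: mean and median per subscale and for the total scale."""
    per_student = [subscale_scores(s) for s in students]
    report = {}
    for name in SUBSCALES:
        scores = [p[name] for p in per_student]
        report[name] = {"mean": round(mean(scores), 2), "median": median(scores)}
    totals = [sum(p.values()) for p in per_student]
    report["Total Scale"] = {"mean": round(mean(totals), 2), "median": median(totals)}
    return report

# Made-up ratings: item number -> rating, one dict per student
students = [
    {1: 5, 2: 4, 3: 5, 4: 5, 5: 4, 6: 4},
    {1: 4, 2: 4, 3: 5, 4: 4, 5: 3, 6: 4},
    {1: 5, 2: 5, 3: 4, 4: 4, 5: 4, 6: 5},
    {1: 2, 2: 3, 3: 3, 4: 2, 5: 2, 6: 3},
]

report = summarize(students)
for level, stats in report.items():
    print(level, stats)
```

With these invented numbers the total scale mean (23.25) falls below the median (25.5) because one student rated everything low, the same negative skew noted in level 2.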

That's a lot of information. Faculty would benefit from 1-5. 1 and 2, in particular, provide valuable diagnostic info to revise teaching or course materials that will benefit their next course-load of students. It is formative feedback only in that sense. Other formative methods administered during the course should be considered. You already know about those options.
Summative decisions by department chairs, associate deans, etc. can be based on 3 and 4 and possibly the global item scores.

The above strategy is certainly not new, but it is the simplest way to get the biggest bang from your student rating scale. Of course, it is only 1 of 14 sources of evidence you might use in measuring teaching performance. Multiple sources of evidence should be involved in summative (personnel) decisions about faculty contract renewal, merit pay, and promotion and tenure. After all, faculty careers are on the line.

If you're grappling with this issue, I hope these suggestions may be helpful.
COPYRIGHT © 2009 Ronald A. Berk, LLC