A framework for evaluating teaching performance

Assumptions / Statements:

This document will undergo mandatory review after the first merit period in which this framework is used for faculty excellence.
It is presumed that activities detailed in this document are carried out in good faith.
Good teaching is a multi-dimensional activity. As such, it is important for us to consider more than just the student evaluation of instruction (SEI).
It is important to reward the process of pushing the “teaching” frontier. This means innovation and course development have as much relevance to being a good teacher as numerical measures such as SEI.
Although innovation and course development should be encouraged, it is important that teachers who are currently successful (by whatever measure we designate) are not penalized for not being innovative. That is, we reward innovation, but we do not penalize the lack of it.
Any framework to evaluate teaching has to be tailored to an individual’s objective. This means, among other things, that the “relevant” evaluation measures for each of us may be different.

Measurement Metrics

This section is primarily concerned with ways in which we can objectively (not necessarily numerically) measure teaching effectiveness. Traditionally, we have used student evaluation of teaching as our primary measure. I suggest a broader set of metrics. In evaluating the metrics below, we need to keep in mind that the relative emphasis depends on the objective of the teacher. For example, if I were teaching a case-based course, my objective may be to teach students to ask critical questions. On the other hand, in a statistics course, I might worry more about the interpretation of the output. In any case, the idea is that any measure should be tailored towards the objective (of course, one would presumably state the objective before the course started).

1. Student Evaluation of Instruction (50 points)

a. In line with my argument that any measure should synchronize with the objective, I propose that we use more than Items 21 and 22 here. I suggest that each faculty member identify items that are relevant to their objective. A composite score can then be formed from the items identified. In this category, item 21 will always be included in the composite score, but the faculty member can choose up to 5 additional items to form the composite score. The composite score will be the average of the items chosen.

b. Use of an absolute scale, rather than scoring against the “norm.” A value of 2.00 in our evaluations indicates that we are a “very good” teacher, regardless of the norm. As with assigning grades, we should reward the score itself rather than compete among ourselves. I suggest the following breakdown of points:

i. 1.00–2.25: 50 points

ii. 2.26–3.50: 30 points

iii. 3.51–4.75: 15 points

iv. 4.76–6.00: 0 points

c. In collecting evaluations, we must consider the validity of the sample information. My initial reaction was to say that we would require a minimum percentage of responses before we consider the evaluation. But what if a sizable portion of the students never show up for class? Are their evaluations relevant? In a case / discussion course, we may want to consider comments only from students who were actively involved in the class. As teachers, most of us are looking for information that will help us improve. As such, identifying the relevant student body is a non-trivial task. As a first step, I propose the following:

i. Ensure that at least 60% of those enrolled turn their evaluations in, or

ii. Explain how the evaluations collected are indeed relevant. It is quite possible that this strategy can lead to a more useful measure. For example, in some classes, it may be more relevant to take SEI several times a semester. For such situations, a smaller random sample may make more sense. In any case, ….

d. In the event that SEI are not available (as can happen when there are fewer than 6 students in a class), the following options will be considered:

2. Evaluations from former students and their employers (15 points): While I actually consider this to be a more relevant measure than SEI, the lack of a good measure to capture this information forces me to give less weight to this. As such, until a good and systematic way is developed to capture this information, I suggest we use things like comments from students, employers, etc., as relevant measures, even though the responses may be biased.

3. Peer Evaluation (15 points): Peer evaluation is equally important in my mind. I suggest that we each identify the type of peer evaluation we are comfortable with. For example, in my beginning undergraduate course, one of my concerns may be the initial step I take to establish the concepts. In that case, I might want a colleague from a totally different area to evaluate my teaching. On the other hand, for a more advanced course, we might want peers who are in our area. As there is no systematic data collection approach for this yet, I suggest that just the act of peer evaluation be rewarded at this point. In addition, the faculty member doing the peer review will get 15 points for each course reviewed.

4. Other measures. None identified so far (March 25, 2001).
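
As an illustration only, the SEI scoring in item 1 above could be computed along the following lines. This is a minimal sketch, not an adopted procedure. The function names, the sample item numbers other than item 21, and the sample item means are all hypothetical. It also assumes that the composite is rounded to two decimals before the bands are applied, and it reads the last band as starting at 4.76 so that the bands do not overlap.

```python
from statistics import mean

# Bands from item 1(b): (upper bound of composite score, points awarded).
# Assumption: the last band starts at 4.76, so 4.75 earns 15 points.
SEI_BANDS = [(2.25, 50), (3.50, 30), (4.75, 15), (6.00, 0)]


def composite_score(item_means, chosen_items, required_item=21, max_extra=5):
    """Average the required item with up to max_extra faculty-chosen items."""
    extras = [i for i in chosen_items if i != required_item]
    if len(extras) > max_extra:
        raise ValueError(f"at most {max_extra} additional items may be chosen")
    return mean(item_means[i] for i in [required_item] + extras)


def sei_points(score):
    """Map a composite score on the 1.00-6.00 scale to points (absolute scale)."""
    score = round(score, 2)  # assumption: bands are applied to two decimals
    if not 1.0 <= score <= 6.0:
        raise ValueError("composite score must lie between 1.00 and 6.00")
    for upper, points in SEI_BANDS:
        if score <= upper:
            return points


def response_rate_ok(returned, enrolled, threshold=0.60):
    """Item 1(c)(i): at least 60% of those enrolled turned in evaluations."""
    return enrolled > 0 and returned / enrolled >= threshold


# Hypothetical example: item 21 plus two chosen items (22 and 5).
item_means = {21: 2.0, 22: 1.5, 5: 2.5}
score = composite_score(item_means, chosen_items=[22, 5])
points = sei_points(score)
```

A class that misses the 60% threshold is not scored as zero here; under item 1(c)(ii) the faculty member would instead explain why the evaluations collected are relevant.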

Course Development

We sometimes spend a significant amount of time making changes to our courses, not to mention developing new courses. Any such changes are likely to have an impact on our SEI, or any other measure we use. As such, a reward system that rewards only the end product without considering the process is, I believe, short-sighted. Under this section, the following teaching- and course-related activities would be rewarded:

1. Courses developed (25 points)

2. New course preparation (15 points)

3. Major revision of an existing course (10 points)

Innovation

Innovative and different approaches to teaching should be cherished and rewarded, regardless of the actual measure used. Some examples in this category include:

1. Changing teaching formats (15 points): For example, going from a lecture setting to one of cases.

2. Using different methods of delivery (15 points)

3. New testing methods (10 points)

4. New measure to evaluate teaching (1000 points) :-)

5. Others?