Evaluating a center’s programs and services: Where to focus, what to collect, and how?

Attending conferences can be very invigorating, especially those conferences that not only allow for networking but also promote active engagement of the attendees. Being an active attendee is a requirement at the POD Network conference, hence the acronym “Participate Or Die” often referenced by the elder PODers who regularly attend this yearly meeting of American educational developers.

This year I attended the POD Network conference in Pittsburgh and participated in the two-day New Faculty Developers’ workshop, which was a valuable experience as I am trying to enter the field of educational development. Among other things, I attended sessions on evaluating a center’s programs and services (e.g., From Satisfaction to Impact: Assessing Faculty Learning and Development, C. Meixner & M. Rodgers, James Madison University), which were the inspiration for this blog post. Here I give an overview of a model for evaluating programs and services with respect to four intended learning outcomes, and I propose evaluation methods and strategies for collecting evidence and data.

The Four Level Evaluation Model


One of the most widely adopted models for evaluating training and development programs is Kirkpatrick & Kirkpatrick’s (2006) Four Level Evaluation model, which can be applied to the evaluation of workshops, presentations, consultations, and more long-term programs such as e-learning courses and course design academies. According to the authors of the model, the evaluation should target the following four learning outcomes:

  1. Reaction: Individuals’ reactions to the learning experiences
  2. Learning: Individuals’ acquisition of important information, knowledge, skills, and attitudes
  3. Transfer: Changes in behavior in an authentic setting due to the training program
  4. Results: The effect/impact that the newly acquired behaviors have on the people and the organization

The authors argued for a stepwise evaluation process (Kirkpatrick & Kirkpatrick, 2006), starting from Level 1 and moving upwards. As simplistic as reaction measures are, positive reactions to the programs you offer are very likely to draw other people to your center. They also argued that in most cases behavior change is a prerequisite for better results. But for behavior change to take place, the training program has to develop the knowledge, skills, and attitudes needed. Finally, to be able to attribute the results to the training program, Levels 2, 3, and 4 must all be evaluated (Kirkpatrick, 2006).

[Table: Levels of Outcome Evaluation]

In this table I summarize the Kirkpatricks’ model with reference to educational development programs offered at higher education institutions, and I suggest methods and strategies for assessing intended outcomes at these four levels.

Level 1: Reaction

Most frequently, at the conclusion of an event or longer-term program, educational developers collect “immediate feedback,” either hand-written or electronic, through surveys such as feedback forms and questionnaires. The surveys need to cover different aspects of the event (e.g., content relevance, facilitator preparedness, scheduling, shared materials) and include both rating items and open-ended questions such as “What would have improved the program?” More formative feedback can be gained during the workshop by asking questions to determine whether participants are following, whether they find the information relevant to their practices and contexts, and which topics could be developed further. Assessment techniques such as minute papers, which ask participants to list the main ideas and points, discuss what they found surprising, and even raise a significant or unanswered question, can also be used. Whatever method you choose to determine participants’ reactions, the results will be more valid if the method is administered close to the event and allows participants to comment on the event, suggest improvements, and indicate whether they would like to attend similar events in the future.

Level 2: Learning

The intended learning outcomes of many of these educational development events are for participants to acquire pedagogical knowledge and skills, and also to acquire or adapt beliefs and attitudes relevant to teaching and learning. To determine change in these areas, pre-post measures can be used, including self-reports (e.g., confidence and self-efficacy measures) as well as more objective assessments such as concept inventories and case studies. Micro-teaching is another method: it allows instructors to practice specific teaching skills and build confidence in their teaching, and it can also be used to assess and provide feedback on skills developed during the events. A brief rubric for providing feedback is a useful assessment tool when using micro-teaching exercises. Participants’ self-reports, such as reflections and journals, can be coupled with these more direct measures to examine participants’ perceptions of the valuable knowledge, skills, ideas, and perspectives gained at the event. Finally, another useful evaluation strategy is to ask participants at the conclusion of an event to write implementation intention statements, which are explicit and specific statements of how participants plan to use and apply what they have learned to attain a goal. The educational developer can use the statements to determine whether participants can make connections across contexts and implement what they learned at the event in their classrooms. The statements are also a good way to activate participants’ motivation and planning behaviors so that they start making changes in their practices based on what they have learned.

Level 3: Transfer

Most educational developers would agree that among their priorities is to support instructors in improving their teaching and professional practices. Thus, measuring the degree to which participants in events and programs have changed their practices is an important but often challenging task, since it requires the consent and engagement of the instructors. The sources of evidence can fall anywhere on the continuum from direct to indirect data. Direct evidence can originate from a classroom observation, with or without video recording. During the observation it is advisable to use an observation rubric and to follow up the observation with a discussion with the instructor. A Small Group Instructional Diagnosis (SGID) focus group with students can precede and follow a classroom observation to provide data on the changes that the instructor has pursued in their teaching and how students believe these changes have affected their learning. Another approach to directly documenting transfer of learning from educational development events is a document analysis of course materials such as syllabi, teaching materials, assignments, and assessments to examine the degree to which they align with the principles, knowledge, and skills acquired through the events. Finally, instructors’ self-assessments and reflections about the changes in their teaching and the outcomes of those changes are another potentially rich data source, which can be gathered via surveys or through analysis of the instructors’ teaching portfolios or promotion and tenure dossiers.

Level 4: Results

The primary goal of most higher education institutions is to ensure high-quality learning experiences for all of their students. Educational developers who aim to evaluate the results of their services and programs want to determine the degree to which the programs had an impact on (a) the quality of an instructor’s teaching or professional practice and (b) their students’ learning. For them, the data collection process must be carefully planned to avoid drawing unjustified conclusions. Furthermore, I agree with Rohdieck, Kalish, and Plank (2012) that educational developers “may not be able to measure directly (or take credit for) the learning in the classroom” but still, “we can demonstrate how our work supports that goal”. Collecting data from different sources to triangulate the information is crucial to demonstrating that the quality of teaching has improved and that students are learning better. With the permission of the instructor, faculty can conduct peer reviews of teaching based on classroom observations. In addition, students’ mid- and end-of-semester evaluations of teaching, as well as cross-semester comparisons of these evaluations, can provide information about areas of strength and areas that still call for improvement. Additional evidence for effective teaching and high-quality learning can stem from a selective examination of student assignments and performances across semesters to explore whether attending faculty development events is indirectly related to better student learning outcomes. Finally, document analysis of curriculum reform materials, such as graduate student training manuals, syllabi of redesigned or innovative courses, and reports of departmental curriculum reform efforts, can point to the broader organizational impact of participating in educational development events such as future faculty programs, graduate student orientations, and course design institutes.

Designing and implementing program evaluation initiatives as suggested by Kirkpatrick requires strategic planning, collaborative effort among the center consultants and the clients they serve, financial resources, and IT support to develop electronic databases that facilitate the evaluation process. Prioritizing the programs, events, and services whose effectiveness you would like to document is a good place to start before implementing any evaluation activities. A comprehensive organizational chart of how a teaching center (CRLT, University of Michigan) measures the effectiveness of its services can be found here.

What are some of the evaluation methods and strategies you have tried at your teaching center? When were your clients most responsive? Are there any guidelines you would provide to other colleagues who would like to move beyond the reaction measures?


Kirkpatrick, D. L. (2006). Seven Keys to Unlock the Four Levels of Evaluation. Performance Improvement, 45(7), 5-8.
Kirkpatrick, D. L., & Kirkpatrick, J. D. (2006). Evaluating training programs: The four levels (3rd ed.). San Francisco: Berrett-Koehler Publishers.
Rohdieck, S. V., Kalish, A., & Plank, K. M. (2012). Assessing Consultations. In K. T. Brinko (Ed.), Practically speaking: A sourcebook for instructional consultants in higher education (2nd ed.). Stillwater, OK: New Forums Press.
Wright, M. C. (2011). Measuring a Teaching Center’s Effectiveness. In C. E. Cook & M. Kaplan (Eds.), Advancing the Culture of Teaching on Campus (pp. 38-49). Sterling, VA: Stylus Publishing, LLC.

Four Levels of Evaluation [image file]. Retrieved Dec 22, 2013 from http://etec.ctlt.ubc.ca/510wiki

