Evaluation or Building Evaluation Capability? Three Key Elements by Dr Paul
One of the things about being an evaluator sitting on this side of the world, in
New Zealand, is that even if you follow the literature you get something of
a sense of isolation from the debates happening within the evaluation
discipline in the United States. At the same time, somewhat paradoxically,
you draw extensively on much of that literature in order to inform your
methodologies and approaches.
Evaluation as a discipline here is relatively tiny and has been shaped by both the size
of the country (3.6M, about the size of a number of cities in the U.S.!) and
the particular demands of our system of government administration. Our size
creates various constraints on our work as evaluators. Our evaluation
budgets are generally very small by U.S. standards. Often programs and
policies are introduced nationally, so there is no sense in which comparison
communities (as in different States) can be used to evaluate outcomes. Even
in those instances where programs are introduced locally and evaluation
designs set up, they often attract publicity, or word of mouth networking,
which can contaminate comparison groups.
On the positive side, evaluators, as with everyone else in a small country, are
forced to be multiskilled generalists; we have to work closely with other
professions; it is relatively easy to network with a large proportion of the
stakeholders from any one sector we work in; and access to senior policy
makers is not particularly difficult. In addition, New Zealand has a working
(there is a variety of opinion on how well it is working) treaty between the
European population and the indigenous Maori, the
Treaty of Waitangi. This has forced us to start to come to terms with
the power relations in evaluation and to try to develop appropriate ways of
evaluating Maori programs. While there are still a lot of politics and
practicalities to sort out on this issue, there have been some positive
developments. In particular, there is recognition that there needs to be more
real autonomy for Maori to evaluate programs in ways that are appropriate
for them. In addition, many lessons have been learnt about evaluating
community-based programs in the process. This paper does not attempt to
summarise the extensive work that has been done by Whariki in undertaking
evaluations of Maori programs and developing evaluation methodologies which
work with such programs (Moewaka Barnes 2000; Moewaka Barnes In Press).
All this leaves one with some uncertainty as to what may or may not be relevant
for North American evaluators from our somewhat unique perspective over
here. The author has been working in evaluation with Professor Sally
Casswell, and other colleagues at the Alcohol & Public Health Research
Unit at the University of Auckland over the last decade and a half. During
this time we have had the luxury of working with one sector, the public and
community health sector, to build evaluation capability in the sector as a
whole. We have done this through developing appropriate models for
evaluation, producing resources, running training programs at a range of
different sector levels, and engaging with policy makers in discussions
about approaches to evaluation. The author was involved in much of the early
work on this and has continuing involvement, but much of the teaching,
research and innovative work has been undertaken by the colleagues listed in
the acknowledgements to this paper.
It was therefore with great interest that I saw the Presidential Strand of the
American Evaluation Association’s 2001 conference was on ‘mainstreaming
evaluation’. In a sense this is exactly what we have been trying to do for
the public and community health sector over the decade and a half we have
been working with it. On occasion, the author has looked at the small size
of the evaluation profession in New Zealand as a disadvantage to promoting
evaluation in New Zealand. However, reflecting on the Presidential Strand
theme, perhaps this has forced us to attempt to ‘mainstream evaluation’
all along. We had no other strategy if we were going to get any significant
evaluation done in New Zealand. While, as evaluators, we worked on a
significant number of projects ourselves, and there were other evaluators
working in the field, the bulk of evaluation needed to be done by the sector
itself if it was to be done at all.
A perspective from afar
Looking at the background paper on the Presidential Strand, it says that the “role
of evaluation in organizations, including community agencies, government
agencies, schools, and businesses has been marginalized in most cases”. It
is always interesting to look back on the roots of a phenomenon. How have we
come to this point where evaluation is seen as being marginalized, at least
in the U.S.? What has the situation looked like from a distance?
Well, going right back to the mid-1970s, when the author was just
starting to study evaluation and the evaluation heroes of the 1960s and
70s were in vogue, it all seemed so straightforward. In those halcyon
days evaluation knew where it was going. We were building the
“experimental society” of Donald Campbell (Campbell). Valiantly we were applying what had been learnt in the other
sciences to large-scale evaluations of social programs. We just had to get
the evaluations done and feed them through the appropriate political
channels and then their results could be implemented. We were well on the
way towards a modernist utopia - somewhere where increasingly
rational, empirically based policy making meant we had maximized our chances
of achieving our social policy goals.
Then along came Carol Weiss (Weiss) and pointed out that policy makers are driven by many different
pressures in addition to our hard-won empirical evaluation results. She
scaled our expectations back to the concept of enlightenment - the gradual
process of results seeping through into decision making through various
channels. But let’s face it, who wants enlightenment when we thought we were
going to get direct traction on decision making by delivering policy
makers the facts? So evaluation got its first lesson in marginalization -
we were only one voice amongst many in the decision making process. Our
initial expectation that we would take center stage in the decision making
process - be mainstreamed as it were - was not met and we had to develop
a more sophisticated view of how we as evaluators could influence policy.
In any event, we still had our empirical methods and we could continue with
collecting objective data and in time it might flow through to policy makers
and influence their decisions. Then we got hit by post-modernism, as it
started its systematic deconstruction of traditional quantitative objective
social science and, by implication, evaluation. Philosophy of science issues
had always loomed large in evaluation as many of us stake our reputations on
our knowledge claims being somehow more ‘objective’ than the multitude
of subjective stakeholder claims being made for and against programs and
policies. Guba and Lincoln led the charge with their view that knowledge was
socially ‘constructed’ in contrast to the view that objective truth
could be ‘discovered’ (Guba and Lincoln 1989). This was a problem in that many of us had staked our profession on
getting ‘the facts’ to managers and policy makers. Here we had
evaluators questioning whether we were ever going to be able to get
objective hard data which was valid and reliable and, most importantly,
In addition, indigenous people the world over, African Americans in the U.S.
and other groups who have been truly ‘marginalized’ on the fringes of
economic, social and political life, started to emerge from having been
silenced by the previous centuries of colonialism and exploitation in its
various guises. They highlighted the mono-cultural origins of the methods,
concepts and approaches being used in research and evaluation and pointed to
the role of evaluation and research in the politics of oppression (Smith).
In addition to this being a general challenge to evaluation, it was
particularly telling in that many of the participants in the social programs
that evaluators were working on were indigenous people or members of other
This problematization of evaluation as an objective fact-finding exercise in the
Western scientific tradition was accompanied by the realization that there
were multiple stakeholders in any evaluation and that it was a value
judgment as to which stakeholders’ world views should be reflected in an
evaluation. Some pushed this right through to the point of arguing that
evaluation should be all about a process of ‘empowerment’ for those
involved in programs (Fetterman, Kaftarian et al. 1996). Indigenous people
started to develop methods for undertaking evaluations that were consistent
with their cultural traditions (Moewaka Barnes 2000; Moewaka Barnes In Press).
All of this turmoil and debate resulted in a massive increase in
the number of approaches and techniques available to evaluators. All of the
qualitative and interpretative techniques from social science became
available as legitimate evaluation methods standing alongside the
quantitative techniques that had been honed by evaluation in the 1960s
As the repertoire of techniques expanded, the debate about approaches
continued. Some valiantly maintained that there was the possibility of a
workable objectivity in evaluations (Scriven). Others adopted the middle ground, arguing in a pragmatic fashion for
‘utilization focused’ evaluation which took as its starting point the
various audiences for evaluation and had the evaluator simply work out which
audiences they were working for and then design their evaluations to meet
their needs (Patton). With this emphasis on
diverse stakeholder audiences, formative evaluation (improving program
implementation) and process evaluation (describing program processes) took
their place alongside outcome evaluation in the evaluator’s repertoire. By
the end of the last century evaluation had a range of sophisticated quantitative
and qualitative tools for dealing with evaluating programs which recognized
the complexity of social programs, the epistemological and design issues
which need to be faced, and in at least some contexts it had started to
address the complex issue of evaluation approaches appropriate for
indigenous peoples and other cultural groups. We were ready to take on the
world; they should have been calling out for us to be mainstreamed.
So what happened then in the 1990s? Particularly in the U.S., large sections
of the policy community completely ignored the sophistication of our
profession and headed off independently into the Performance Management
Movement (Blalock 1999). This is the often naïve attempt to measure a range of indicators
and to use them (with little regard to attributing causality) for holding
social programs to account - a sort of ersatz outcomes evaluation with none
of the sophistication we had spent decades of hard work developing.
Evaluators were left in the role of sideline critics complaining about the
efforts being undertaken in performance management as too simplistic (Greene).
This was while the policy makers and performance measurement people
got on with the party, usually in blissful ignorance of the difficulty of
what they were claiming they were going to be able to achieve.
Marginalization for evaluation again, ironically, this time not through
having a too simplistic view of the world, but through having a too
sophisticated view of what could and could not be measured in regard to
programs and policies. So given this history, what should evaluation as a
discipline now do in the face of marginalization this time around? The
obvious answer, given by the AEA 2001 Presidential Strand theme, is to try
once again to ‘mainstream evaluation’.
What exactly is meant by mainstreaming evaluation?
It is important to start by thinking about what
it is that we are trying to achieve in mainstreaming evaluation. We can start
thinking about this by using the traditional formative evaluator’s trick
of “looking behind” the strategy we are proposing, in this case mainstreaming
evaluation. We are looking for the goal that this strategy is
attempting to achieve. Presumably, the purpose of mainstreaming evaluation
is to get our organizations, policies and programs to be more effective and
efficient. We can work back
from our goal using a program logic approach - what is it that is needed to
achieve this goal? Presumably it is that people throughout our organizations
and policy making processes are being more evaluative about what they are
doing. Note this is not saying that they should all be calling what they are
doing ‘evaluation’. What is needed to ensure that people become more
evaluative? They will have to have appropriate evaluation skills, systems,
structures and resources to support them in taking a more evaluative
approach to their work.
Looking at it this way, the task of mainstreaming evaluation may be better put as
one of building evaluative or evaluation capability throughout our
programs, organizations and policy development. Such evaluative activity may
not necessarily be labeled evaluation, but it should, regardless of what it
is called, contribute toward the goal of people being more evaluative about
what they do and finally make our organizations, policies and programs more
effective and efficient.
At first sight, this idea of building evaluation capability sounds synonymous
with mainstreaming evaluation. Whether or not it is, however, depends on
exactly what evaluators mean when they use the term mainstreaming
evaluation. For the author the term mainstreaming evaluation has the
potential implication that we are trying once again for evaluation to take
center stage. As evaluators we
need to think through the extent to which our desire to mainstream
evaluation is an attempt to grow the profession in contrast to getting
people to be more evaluative. Looking at it as the attempt to get people to
be more evaluative, we may need to be prepared to
‘give evaluation away’ in order to build evaluation capability.
Giving evaluation away means sharing skills and approaches without these
necessarily being labeled as ‘evaluation’ by those who use them. This is
in contrast to looking for opportunities for growing the size and power of
In a sense, the growth of evaluation as a profession may be part of the problem
in its marginalization. Professionalization of evaluation can reify
evaluation and make it a ‘separate’ activity which may or may not be
‘done’ to programs. People ask the question ‘are we going to do an
evaluation of this program?’ with the implication that they have to
decide as to whether they will call evaluators in to do a separate piece of
work. Ideally, people should see evaluation as a central task they own
themselves and they may or may not have to involve outside evaluators in
what should be seen as a core task for the business as a whole.
All of this is not to say that there is not plenty of room for the evaluation
profession to continue and, in fact, thrive. There are enormous technical
challenges in designing some types of evaluation and specialist evaluators
will always have to be involved in these. All that is being argued here is
that the most useful strategy in attempting to mainstream evaluation is
probably to try and ‘give it away’ rather than expect that evaluation as
an entity itself can be mainstreamed.
So how can we go about giving evaluation away or building evaluation capability
in our organizations? The author has been working on this question over the
last decade and a half in New Zealand. In the 1980s, working with
Professor Sally Casswell at the University of Auckland in the public and
community health field we became interested in the best way in which to
build evaluation capability for the sector as a whole. We ran
a large number of training workshops for people from various levels
within the sector and were involved in undertaking and consulting on the
methodology for many evaluations. The author was involved in a number of
these and many others were undertaken by colleagues at the Alcohol &
Public Health Research Unit and Whariki, the Maori research unit working in
partnership with APHRU. From his experience working with colleagues at the
University of Auckland and his experience working as a consultant evaluator
in a number of other sectors over this time, the author believes that there
are three key elements required to build evaluation capability. Each of
these is discussed below.
Three Key Aspects of Building Evaluation Capability
The three key aspects of building evaluation capacity are:
1. Using an appropriate evaluation model
2. Developing evaluation skills appropriate for each level of an organization or sector
3. Organizational or sector level strategizing to identify priority evaluation questions, rather than just relying on evaluation planning at the individual program level.
Each of these is discussed here. They are put forward as suggestions rather than
definitive answers and the author would appreciate the opportunity to
discuss all aspects of these and other approaches to mainstreaming
evaluation at the AEA 2001 Conference.
1. Using an appropriate evaluation model
Using an appropriate evaluation model may seem a slightly obscure and theoretical
place to start in thinking about building evaluation capability. It is
important, however, to think about the most useful way of describing
evaluation for the particular purposes of building evaluation capability.
There are a number of different ways in which evaluation can be described
and a number of different typologies that are used. Terms used for aspects
of evaluation include: quasi-experimental design, formative, developmental,
implementation, process, impact, outcome, summative, stakeholder, empowerment,
goal-free, utilization focused, fourth generation and naturalistic (Cook and
Campbell 1979; McClintock 1986; Patton 1986; Guba and Lincoln 1989; Rossi and
Freeman 1989; Scriven 1991; Fetterman, Kaftarian et al. 1996; Chelimsky and
Shadish 1997). As in any discipline, these terms are at various conceptual levels
and are used in various ways by various evaluators for various purposes.
What type of evaluation model or typology is then the most appropriate for
building evaluation capability?
It would be useful for a set of criteria to be developed to assist in
determining which evaluation models are the most appropriate for building
evaluation capability. A provisional list of criteria has been developed by
the author as follows. Appropriate evaluation models for capability building should:
- Help to demystify evaluation by positioning evaluation as any activity directed at answering a set of easily understood questions
- Use a set of evaluation terms which emphasize that evaluation can take place across a program’s lifecycle and is not limited to just outcome evaluation
- Allow a role for both internal and external evaluators
- Include methods for hard to evaluate, real world programs, not just for ideal-type large scale expensive external evaluation designs
- Not privilege any one meta-approach to evaluation (e.g. goal-free)
Some evaluation models meet these criteria better than others. Each of the
criteria is discussed below:
Evaluation positioned as answering a set of easily understood questions
The first essential aspect of an appropriate evaluation model for evaluation
capability building is that it is simple to understand. It needs to be able
to be explained in simple terms to a wide range of different stakeholders
with diverse training, backgrounds and experience. Evaluation discussions
can very rapidly become highly technical. Michel Foucault, the doyen of
postmodern thought, has illustrated how technical language is used by groups
of professionals to exercise power over others (Foucault). Evaluation is no
exception to this. We all know that when an
evaluator turns up talking about quasi-experimental design, the regression
discontinuity approach or, more recently, discourse analysis, there are not
a lot of people in the room who are going to feel they are on equal terms in
the discussion. When working with people employed in community
organizations, as are many of the people involved in the public and
community health sector, it is particularly important to find a model which
relates evaluation to their day to day work experience.
If we are to build evaluation capability into our programs, organizations and
policy development we need a simple and easily comprehensible starting point
for an evaluation model. One such starting point is to say that evaluation
is simply about asking questions. These questions are not something that
evaluators alone should attempt to answer themselves; they are questions
that should be an important concern of every policy maker, manager, staff
member and program participant. The author positions the evaluation model he
uses with a set of questions, starting with the overall evaluation question
for any organization, policy or program:
Is this (organizational activity, policy or program) being done in the best possible way?
There are then three major subsidiary questions that can be unpacked from this:
How can we improve this organization, program or policy?
Can we describe what is happening in this organization, program or policy?
Has this organization, program or policy achieved its objectives?
Answering these questions is not something that is solely the responsibility of
evaluators. They are questions which everyone in any organization should be
asking themselves all the time. This question-based introduction to
evaluation helps to demystify the process of evaluation. It puts the
responsibility for evaluation back where it belongs, on the policy makers,
funders, managers, staff and program participants rather than leaving it to
evaluators. It points out that managers and staff cannot avoid these
questions; they just have to work out ways of answering them through their
own efforts and identifying when it is appropriate to call in specialized
evaluation help. In an ideal evaluation design, funders, program planners
and stakeholders have the opportunity to work together on defining the
questions which an evaluation should be asking. This approach to positioning
evaluation also sets the scene for promoting the idea of organizational or
sector level strategizing of priority evaluation questions which is
Positioning evaluation in this way in training workshops is usually greeted with
participant feedback that it has demystified evaluation for them, made them
realize that they are already doing considerable evaluation themselves, and
that there are other techniques they could be using to have a more
evaluative approach to their day to day work.
typology with terms right across program lifecycle
In New Zealand at least, most stakeholders unfamiliar with evaluation still see
it as mainly just about outcome evaluation. An appropriate evaluation model
should use terms that emphasize the fact that evaluation can take place
right across a program’s lifecycle. One dichotomy used by practicing
evaluators in describing evaluation types to stakeholders and which assists
with this is the formative/summative distinction. This highlights the
importance of formative evaluation. While this distinction continues to be
useful in some contexts, it does not go far enough in emphasizing that
evaluation is something that is spread right across a program life-cycle;
the formative/summative split can be interpreted by some to just imply
activity at the beginning and at the end of a program. Another dichotomy
that is used by practicing evaluators in discussions with stakeholders is
the process/outcome distinction. This again has its uses. However just
relying on it in evaluation practice sometimes leaves the impression that
people believe process evaluation is a full alternative to outcome
The Alcohol & Public Health Research Unit in its work in evaluation
capability building incorporates elements from both of these dichotomies and
uses a three way split for evaluation between formative, process and outcome
and Duignan 1989; Duignan 1990; Duignan and Casswell 1990; Duignan, Casswell
et al. 1992; Duignan, Dehar et al. 1992; Turner, Dehar et al. 1992; Duignan
1997; Waa, Holibar et al. 1998; Casswell 1999)
Formative evaluation being defined as: evaluation activity directed at optimizing a
program. (It can alternatively be described as design, developmental or
Process evaluation being defined as: describing and documenting what happens in
the context and course of a program to assist in understanding a program,
interpreting program outcomes and/or to allow others to replicate the
program in the future. Note this narrows the definition of process
evaluation by separating out the formative evaluation element.
Outcome evaluation being defined as: assessing the positive and negative outcomes
of a program. This includes all sorts of impact/outcome measurement
recognizing that outcomes can be short, intermediate or long term and also
arranged in structured hierarchies, e.g. individual level, community level,
None of these terms are opposed to each other; they are seen as the three
essential aspects of evaluation. The three terms can be directly related to
the three subsidiary evaluation questions identified in the section above.
Formative evaluation asks the question: how can we improve this
organization, program or policy? Process evaluation asks the question: can
we describe what is happening in this organization, program or policy?
Outcome evaluation asks the question: has this organization, program or
policy achieved its objectives?
The three types of evaluation in this model can easily be related to different
stages in program development: the start, middle and end of a program. This
encourages thinking about how evaluation can be used right across a
program’s lifecycle. Each type of evaluation - formative, process and
outcome - must be individually considered as a possibility for evaluation
activity. If outcome evaluation proves too expensive or difficult there
still may be useful questions that can be answered in regard to formative
and process evaluation. Where outcome evaluation is difficult,
not having a model that emphasizes the options of formative and process
evaluation can drive people into pseudo-outcome evaluation such as that
which often typifies the Performance Management Movement.
For evaluators, this discussion will hardly seem like rocket science. It has
just outlined one of the many ways of splitting up the evaluation process.
The reason this particular model is described here is that the
characteristics of the evaluation typologies and models we intend to use for
building evaluation capability need to be scrutinized. There may be better,
or alternative, models to the one described here. The author is simply
interested in generating discussion about the most useful models and
typologies for evaluation for the particular purposes of building evaluation
capability. The approach
outlined here does deal with some of the pitfalls which arise from other
evaluation models and typologies when used for capability building with
Internal and external evaluators
An appropriate evaluation model for building evaluation capability must allow
for the possibility of both internal and external evaluators. If evaluation
is just seen as something that is undertaken by external experts then there
is little reason for internal staff to develop their evaluation skills.
Of course, depending on the size of an organization, internal
evaluators may have considerable distance from an actual program being
evaluated. A good way of looking at this issue is not to think so much in
terms of external or internal evaluators, but rather to see evaluators as
potentially being on a continuum running from close involvement through to
little or no involvement in a program.
An appropriate evaluation model for evaluation capability building needs to be
able to describe the pros and cons of closely involved evaluators through to
less involved evaluators. It also needs to have developed ways of managing
the risks around evaluators’ level of involvement based on: the purpose of
the evaluation; the type of evaluation work being done; whether the
evaluators are working in teams which include roles with various levels of
involvement with a program; and the extent to which the data sets being
collected and analysed are highly susceptible to bias. If an evaluation model
does not deal with these issues then it is of little use in building
evaluation capability as it maintains the fiction that evaluation can only
be undertaken by outside experts.
Methods for hard to evaluate real world programs
An appropriate evaluation model for capability building also needs to
incorporate methods that can be used to evaluate a wide range of real world
programs where there may be very limited resources available for an
evaluation. This means that it must include practical evaluation methods
that can be undertaken by program staff wanting to evaluate their programs.
These are often formative and process evaluation techniques.
One area where new evaluation models are crucially important is
community programs. Building on the work of writers (Freire),
the use of community-based strategies has swept through public and
community health (Labonte). Indigenous people have also been demanding programs that respect
the autonomy of their communities and use methods which are consistent with
the way in which their communities operate. Community approaches are
therefore now being adopted in a wide range of social program areas in
addition to public and community health.
Evaluating community-based programs presents interesting sets of challenges for
evaluators and raises enormous difficulties for traditional models of
evaluation. Community programs have long time frames; they take place in
communities where many other programs are running at the same time, often
with the same goals. Even more challenging, they are usually based around a
philosophy of community autonomy. This presents significant challenges to
evaluation looking at whether a program has met its objectives. If a set of
objectives is prescribed by a funder for a program, it is likely that the
communities involved in undertaking such programs will want to set their own
objectives. Which set of objectives should be evaluated against, those set
by the funder or those set by the community program itself?
The author and his colleagues have had considerable experience in dealing with
evaluating these sorts of programs (Duignan and Casswell 1989; Duignan and
Casswell 1992; Duignan, Casswell et al. 1993; Casswell 1999; Casswell 2000;
Moewaka Barnes 2000; Moewaka Barnes In Press). There are models that can be
used but these require considerable
innovation on the part of evaluators. If we are to build evaluation
capability we need to expand and refine these models so that they become
better at dealing with the realities of real world programs rather than just
ideal-type social experiments.
Not privilege any one meta-approach to evaluation
Meta-approaches to evaluation are evaluation styles that endorse a particular solution to
the philosophy of science questions that lie behind evaluation. Philosophy
of science questions are always close to the surface in discussing
evaluation because it is about making judgments about organizations,
policies and programs. Goal-free evaluation and empowerment evaluation are
good examples of meta-approaches to evaluation that take different
philosophy of science positions (Scriven and Kramer 1994). It is fine for
evaluators to adopt one or other of these
meta-positions in their professional work as evaluators. It is also fine for
them to argue that their approach should be the basis for evaluation efforts
in particular settings and situations. However, in building evaluation
capability it is important that a more catholic approach is taken to
evaluation that does not eliminate one form of evaluation that some
stakeholders may find useful. Of course, the Western evaluation approach
itself can be seen as just one meta-approach to evaluation and we need to be
aware that this itself is not universally accepted by stakeholders. In New
Zealand at least, Maori are actively involved in the process of developing
evaluation models and approaches which may or may not have similar
assumptions, methods, and techniques to evaluation as it is practiced in the
Western tradition (Moewaka Barnes 2000; Moewaka Barnes In Press). Hopefully
the fertile debate between different meta-approaches to
evaluation will continue to feed thinking and practice in evaluation as it
has done in the past. It is
important that in attempting to build evaluation capability we encourage
different approaches to evaluation.
This section has set out and discussed the criteria for evaluation models that
are appropriate for evaluation capability building. Some suggestions have
been put forward regarding what an appropriate model may look like. The main
challenge in building evaluation capability is to think through how our
models (and there will need to be multiple models for different
stakeholders) need to be different when we are using them for capability
building to when we are using them amongst ourselves as evaluators.
The second key element in building evaluation capability – training and skills
development – is considered next.
Appropriate evaluation skills
training at all levels
- The second step in building
evaluation capability is to develop skills, systems and structures for
evaluation activity at all levels within organizations and sectors. From the
author’s experience in the New Zealand context, and this may well apply in
the U.S. and other countries, there is a lack of both an adequate
conceptualization of, and skills in, evaluation right across all
organizational levels. Policy makers, funders, service provider management,
staff and program participants generally tend to have a relatively limited
understanding of evaluation. If they do have more than this it is often
based on the erroneous view that evaluation is just about outcome
evaluation. The objective of skills development in evaluation for an organization or sector
is to both further sophistication about evaluation along the lines of the
evaluation model discussed above and to teach appropriate specific
evaluation skills to those who can use them in their day to day work. This
can be done by developing manuals and training resources and by running
training workshops. At the Alcohol & Public Health Research Unit and Whariki a series of
manuals on evaluation have been developed that reflect the evaluation model
described above and have been widely distributed throughout the public and
community health sector in New Zealand (Casswell and Duignan 1989; Duignan,
Dehar et al. 1992; Turner, Dehar et al. 1992; Waa, Holibar et al. 1998).
These manuals deliberately copied the visual style of an early key
sector document on health promotion (New Zealand Board of Health Committee on
Health Promotion 1988) in order to have them seen as having continuity with sector
documents rather than being “evaluation” documents external to the
sector. As time has gone on the response to these manuals has been evaluated
and subsequent manuals have been amended on the basis of this feedback.
Over the period of time that the resources have been available, the Unit and
Whariki have carried out a series of training workshops for different
audiences within the sector. The different types of training are:
- presentations on evaluation at a range of sector workshops on other issues.
Typically these run for one to two hours, covering the general evaluation
model and principles and raising awareness of evaluation within the sector.
- day Level I courses for service provider lower level managers and staff
where they can discuss the overall evaluation model and learn specific
evaluation skills which they can use in their day to day work. Considerable
time is spent demystifying evaluation and describing simple formative and
process evaluation methods that can be used by service provider staff.
Outcome evaluation methods that can be used are discussed, along with
indicators of when other evaluation expertise needs to be drawn in.
- advanced Level II two day courses for service provider managers and staff
wanting to develop their skills. These provide more in-depth training in
- week long workshops for policy makers, funders, larger provider specialists,
and researchers to develop and practice appropriate evaluation skills. These
cover the evaluation model and the skills and techniques discussed in the
Level I and II workshops in more depth, with further discussion of outcome
- specifically run by Maori evaluators for Maori program management and staff.
These look at evaluation concepts and methods from a Maori perspective.
- day overview workshops to discuss evaluation concepts and approaches for
service provider management. These discuss the concepts from the evaluation
model and how these relate to organizational policies and practices. For
example, distinguishing between performance management and evaluation; what
is and what is not realistic to expect in terms of outcome evaluation; and
setting organizational priorities for evaluation.
- day workshops for staff and management within an organization discussing the
model, concepts in evaluation and the idea of prioritizing evaluation
questions across the organization as a whole.
- university Masters papers for researchers and practitioners interested in
further developing their understanding of evaluation and their ability to
All of these courses, apart from the managers’ courses, involve
discussion of evaluation models combined with hands-on work on
evaluation projects brought to the workshops by participants. This action
learning approach ensures that participants go away with a feeling of
mastery in at least some evaluation techniques, which further assists in
promoting the idea that there are aspects of evaluation which can be done by
people at all levels within a program, organization or sector.
The idea is that the end result of this ongoing activity will be a sector which
first has a much more sophisticated model of evaluation; second, is in a
position to talk about evaluation questions within the sector; and third,
has people at all levels who know how to undertake some evaluation tasks
appropriate to their work situation.
Organizational or sector level strategizing to prioritize
evaluation questions
- The third and
final aspect of building evaluation capability discussed in this paper is to
encourage organizational or sector strategizing about what are priority
evaluation questions for an organization or a sector. The problem is not
necessarily that too little evaluation is being done in a sector. Funders may be
routinely demanding evaluation and service providers may be having evaluations done.
The problem is whether evaluation resources are being used in the most efficient
way possible. Current practice in New Zealand, which may well also be the case
in the U.S., is for the following to happen.
- An evaluator
often gets called in to advise on evaluation methodology for a program. Often
there are enough resources to do ‘an evaluation’. The problem is that there
is often very little thought or guidance given by funders or others in the
organization or sector as to which priority evaluation questions
should be asked in this particular evaluation.
- In theory, of
course, the evaluator can use one of the stakeholder evaluation or related
approaches and consult with the various stakeholders about what they see as the
priority questions for evaluating the program. All evaluators will do this to
some extent. The problem is that it does not really make sense to do this
repeatedly on a program by program basis, particularly when the programs are
relatively small. The program in question may or may not be a priority for
evaluation. The evaluation resources may be better spent elsewhere and perhaps
only a small scale evaluation should be undertaken for the program in question.
For instance, it may be a well established program for which (contrary to most
instances) good formative evaluation has already been carried out, and some
previous work on similar programs has shown some positive outcomes. It may be
better to spend the evaluation resources that are available on an entirely
different novel initiative which is exploring new ways of dealing with the
social issue the original program is attempting to address.
- The key point
is that there should not be an assumption that any particular type of
evaluation, or any particular scale of evaluation is suitable for all programs.
There is unfortunately this assumption in the notion that a program ‘needs an
evaluation’. The assumption, from funders at least, is usually that this will
consist of an outcome evaluation which will be able to accurately determine
whether the program has met its objectives. In the author’s experience,
funders usually give no thought to how feasible or expensive an outcome
evaluation may be; they just pass the problem on to the service provider
management and staff and any evaluation specialists they employ.
- Of course, in
addition to evaluation in the sense being used here, all programs need basic
monitoring as to whether they are on track using cheap, routinely measured
performance indicators. This is usually best dealt with as a separate process,
which can be linked to more complex evaluations where these are undertaken. The
essential distinction is that monitoring should be undertaken on a routine basis
for accountability, while evaluation, because of its higher cost, should be used
on a more selective, strategic basis.
- In contrast to
the current, program focused, approach to evaluation planning, there should be
an organizational or sector based approach to evaluation question
prioritization. Rather than simply
saying that “every program needs an evaluation” it would be much more
fruitful to say “how can we best spend our evaluation resources to answer
priority evaluation questions for this organization or this sector?” This
second question is particularly useful if it is put in terms of strategic
planning for the future rather than a futile attempt to use evaluation as a
routine method of achieving accountability – something much of the Performance
Management Movement is yet to discover.
- The author has
been involved in organizational-level evaluation policy setting exercises. These
are where an organization develops an explicit evaluation policy as a
preliminary step to introducing a more strategic approach to the program-based
evaluation work being undertaken by that organization. Typically these policies
contain elements such as:
- The evaluation model(s) that will be used in
- Policies regarding, and opportunities for,
staff training in evaluation
- Sources of, and procedures for, obtaining
technical evaluation assistance
- Procedures and stakeholder consultation
standards for evaluation planning and sign-off
- Procedures and consultation processes in
respect of cultural issues
- Guidelines on the typical scope and type of
evaluation for different size and type of program
- Guidelines on the use of internal and external
- Ethical and other related considerations
- Policies about disclosure of evaluation
- In addition,
some progress has been made in getting organizations to prioritize the
evaluation questions they are asking of the programs which they are running. In
the instances of this where the author has been involved, this has tended to be
a fairly tentative process.
- Of course, in
most instances, one organization’s programs are just part of the overall
activity in a sector. It is even more useful then for prioritization of
evaluation to take place at the sector rather than the organizational level.
In an ideal world, sectors would have systems which give them the
opportunity to reflect on what are the priority evaluation questions that need
answering. They would get an indication of the cost of answering these questions
and then move to make strategic decisions about which type of evaluations for
which programs should and should not take place. This views the evaluation spend
for a whole sector as one large research and development fund which needs to be
spent wisely, rather than trying to make evaluation decisions on a program by
program basis. This is not to say that people from the program level should not
have some input into what these evaluation questions are, just that it should
not all be left to that level.
- Of course, it
can be argued that already a lot of organizational and sector strategic
considerations are factored into the evaluation requirements for an individual
program. Funders will indicate which programs they want evaluated, the level of
resources, and may indicate which evaluation questions they want answered. In
addition, in reviews of the academic literature, and in priority setting
processes within research funding bodies there will be prioritization happening.
In those sectors where there are ongoing research groups, involved in teaching,
advising and undertaking a large number of evaluations, they will, in part, play
a role through having a strategic view of a sector and which evaluation
questions are the next priority. However, the author believes that this process
is at the current time still too ad hoc and there is often a disjunction between
priority setting, in those instances when it is taking place, and what actually
happens on the ground in regard to the evaluation of the many programs that are
subject to evaluation. Facilitating such prioritization work could be a major
contribution of evaluators to building evaluation capability.
- Exactly how to
facilitate this cross organizational or sector prioritization is a difficult
question. Sectors dealing with social issues tend to be made up of a diverse
range of public and private groups funding a diverse range of programs. There
are some innovative evaluation priority setting exercises going on in New
Zealand at the moment in the labour and employment program area (McKegg, 2000).
In addition, the author is also working with other sectors attempting to get
this sort of prioritization process to occur. It would be good to share notes
with other evaluators as to whether they see this as a priority and if they are
having any success in this area.
- This paper has
looked at the question of mainstreaming evaluation. The author has drawn on his
experiences largely with the public and community health and related social
sectors in New Zealand. In the New
Zealand public and community health sector we have made more progress on the
first two (models and training rather than sector prioritizing) of the three key
elements for building evaluation capability described in the paper.
- The intention
of the paper has been to generate some points for discussion around the issue of
mainstreaming evaluation. The questions that the author would like considered at
AEA 2001 are:
- In mainstreaming is there a difference between
building the evaluation profession and ‘giving evaluation away’?
- Are the models of evaluation used in
mainstreaming important? If so, are the criteria suggested in this paper the
right ones? What models are likely to be best for discussing evaluation with
- Can we further develop evaluation approaches
that can deal with the realities of on-the-ground community-based
programs rather than just trying to use text-book evaluation designs?
- How can we as evaluators support the
development of indigenous people’s and other groups’ evaluation models and
- What are innovative methods of training for
- How can organizational and sector level
evaluation question prioritization exercises be encouraged? Is there a
particular role for evaluators in facilitating these?
Paul Duignan* works half-time as a Senior Lecturer at the University of
Auckland where he teaches program evaluation at the post-graduate level. He
has been involved in evaluation capability building in the public and
community health sector in New Zealand for the last decade and a half. His
PhD was on evaluation methodology for health promotion. He has taught
evaluation method and concepts to people from all levels of the public and
community health sector. He also consults for Parker Duignan Ltd in
evaluation methodology and public policy issues.
- Alinsky, S. (1971). Rules for radicals. New York, Random House.
- Blalock, A. (1999). “Evaluation research and the
performance management movement.” Evaluation 5(2): 117-149.
- Campbell, D. T. (1975). Reforms as experiments. Handbook
of evaluation research. E. L. Struening and M. Guttentag. Beverly Hills,
Sage. 1: 71-100.
- Casswell, S. (1999). Evaluation research. Social
Science Research in New Zealand: Many Paths to Understanding. C. Davidson
and M. Tolich. Auckland, Longman.
- Casswell, S. (2000). “A decade of community action
research.” Substance Use & Misuse 35(1&2): 55-74.
- Casswell, S. and P. Duignan (1989). Evaluating
health promotion: A guide for health promoters and health managers.
Auckland, Department of Community Health, School of Medicine, University of
Auckland.
- Chelimsky, E. and W. R. Shadish, Eds. (1997). Evaluation
for the 21st century: a handbook. Thousand Oaks, California, Sage.
- Cook, T. and D. T. Campbell (1979). Quasi-experimentation:
design and analysis issues for field settings. Boston, Houghton Mifflin
- Duignan, P. (1990). Evaluating health promotion: An
integrated framework. Health Promotion Research Methods: Expanding the
Repertoire Conference, Toronto, Canada.
- Duignan, P. (1997). Evaluating health promotion: The
Strategic Evaluation Framework. Psychology. Doctoral Dissertation,
University of Waikato, New Zealand.
- Duignan, P. and S. Casswell (1989). “Evaluating
community development programmes for health promotion: Problems illustrated by
a New Zealand example.” Community Health Studies 13(1): 74-81.
- Duignan, P. and S. Casswell (1990). Appropriate
evaluation methodology for health promotion. American Evaluation
Association Annual Conference, Washington.
- Duignan, P. and S. Casswell (1992). “Community
alcohol action programmes evaluation in New Zealand.” Journal of Drug
Issues 22: 757-771.
- Duignan, P., S. Casswell, et al. (1992). Promoting
change in health promotion practice: A framework for the evaluation of health
promotion. Psychology and social change. D. Thomas and A. Veno.
Palmerston North, The Dunmore Press.
- Duignan, P., S. Casswell, et al. (1993). Evaluating
community projects: Conceptual and methodological issues illustrated from the
Community Action Project and the Liquor Licensing Project in New Zealand. Experiences
with Community Action Projects: New Research in the Prevention of Alcohol and
Other Drug Problems (CSAP Prevention Monograph 14). T. K. Greenfield and
R. Zimmerman. Rockville, MD, U.S. Department of Health and Human Services.
- Duignan, P., M. Dehar, et al. (1992). Planning
evaluation of health promotion programmes: A framework for decision making.
Auckland, Alcohol and Public Health Research Unit, School of Medicine,
University of Auckland.
- Fetterman, D., S. Kaftarian, et al. (1996). Empowerment
evaluation: knowledge and tools for self-assessment and accountability.
Thousand Oaks, CA, Sage.
- Foucault, M. (1973). Madness and civilization: A
history of insanity in the age of reason. New York, Vintage Books.
- Freire, P. (1968). Pedagogy of the oppressed.
New York, Seabury Press.
- Greene, J. (1999). “The inequality of performance
measurements.” Evaluation 5(2):
- Guba, E. G. and Y. S. Lincoln (1989). Fourth
generation evaluation. Newbury Park, California, Sage.
- Labonte, R. (1989). Community health promotion
strategies. Readings for a new public health. C. J. Martin and D. V.
McQueen. Edinburgh, Edinburgh University Press: 235-249.
- McClintock, C. (1986). “Towards a theory of formative
program evaluation.” Evaluation Studies Review Annual 11:
- McKegg, K. (2000). Personal communication.
- Moewaka Barnes, H. (2000). “Collaboration in
community action, a successful partnership between indigenous communities and
researchers.” Health Promotion International 15: 17-25.
- Moewaka Barnes, H. (In Press). “Kaupapa Maori:
Explaining the ordinary.” Pacific Health Dialog.
- New Zealand Board of Health Committee on Health
Promotion (1988). Promoting health in New Zealand. Wellington, New
Zealand Board of Health.
- Patton, M. Q. (1986). Utilization focused evaluation.
Newbury Park, Sage.
- Rossi, P. H. and H. E. Freeman (1989). Evaluation: A
systematic approach. Beverly Hills, Sage.
- Scriven, M. (1991). Evaluation Thesaurus.
Newbury Park, Sage.
- Scriven, M. (1997). Truth and objectivity in
evaluation. Evaluation for the 21st century: a handbook. E. Chelimsky and
W. R. Shadish. Thousand Oaks, California, Sage: 477-500.
- Scriven, M. and J. Kramer (1994). “Risks, rights and
responsibilities in evaluation.” Evaluation Journal of Australasia 9(2):
- Smith, L. T. (2000). Decolonising Methodology:
Research and Indigenous Peoples. London, Zed Books.
- Turner, A., M. Dehar, et al. (1992). Doing
evaluation: A manual for health promotion workers. Auckland, Alcohol and
Public Health Research Unit, University of Auckland.
- Waa, A., F. Holibar, et al. (1998). Programme
evaluation: an introductory guide for health promotion. Auckland, Alcohol and
Public Health Research Unit/Whariki, University of Auckland.
- Weiss, C. (1977). “Research for policy's sake: The
enlightenment function of social research.” Policy Analysis 3: