Program evaluation methods: measurement and attribution
Increasingly, public health programs are accountable to funders, legislators, and the general public. Many programs demonstrate this accountability by creating, monitoring, and reporting results for a small set of markers and milestones of program progress. Linking program performance to program budget is the final step in accountability.
The early steps in the program evaluation approach, such as logic modeling, clarify these relationships, making the link between budget and performance easier and more apparent. While the terms surveillance and evaluation are often used interchangeably, each makes a distinctive contribution to a program, and it is important to clarify their different purposes. Surveillance is the continuous monitoring or routine collection of data on various factors of interest over time. Surveillance systems have existing resources and infrastructure.
Data gathered by surveillance systems are invaluable for performance measurement and program evaluation, especially of longer term and population-based outcomes. There are limits, however, to how useful surveillance data can be for evaluators. Also, these surveillance systems may have limited flexibility to add questions for a particular program evaluation.
In the best of all worlds, surveillance and evaluation are companion processes that can be conducted simultaneously. Evaluation may supplement surveillance data by providing tailored information to answer specific questions about a program. Data collected to answer specific evaluation questions are more flexible than surveillance data and may allow program areas to be assessed in greater depth.
Evaluators can also draw on qualitative methods, which routine surveillance systems typically do not use. Both research and program evaluation make important contributions to the body of knowledge, but fundamental differences in their purposes mean that good program evaluation need not always follow an academic research model.
Research is generally thought of as requiring a controlled environment or control groups. In field settings directed at prevention and control of a public health problem, this is seldom realistic.
Of the ten concepts contrasted in the table, the last three are especially worth noting. Unlike pure academic research models, program evaluation acknowledges and incorporates differences in values and perspectives from the start, may address many questions besides attribution, and tends to produce results for varied audiences.
Program staff may be pushed to do evaluation by external mandates from funders, authorizers, or others, or they may be pulled to do evaluation by an internal need to determine how the program is performing and what can be improved. While push or pull can motivate a program to conduct good evaluations, program evaluation efforts are more likely to be sustained when staff see the results as useful information that can help them do their jobs better.
Data gathered during evaluation enable managers and staff to create the best possible programs, to learn from mistakes, to make modifications as needed, to monitor progress toward program goals, and to judge the success of the program in achieving its short-term, intermediate, and long-term outcomes.
Most public health programs aim to change behavior in one or more target groups and to create an environment that reinforces sustained adoption of these changes, with the intention that changes in environments and behaviors will prevent and control diseases and injuries.
Through evaluation, you can track these changes and, with careful evaluation designs, assess the effectiveness and impact of a particular program, intervention, or strategy in producing these changes. The Working Group prepared a set of conclusions and related recommendations to guide policymakers and practitioners. Program evaluation is one of ten essential public health services [8] and a critical organizational practice in public health.
The underlying logic of the Evaluation Framework is that good evaluation does not merely gather accurate evidence and draw valid conclusions, but produces results that are used to make a difference. You create demand for evaluation results by focusing evaluations on the questions that are most salient, relevant, and important.
You ensure the best evaluation focus by understanding where the questions fit into the full landscape of your program description, and especially by ensuring that you have identified and engaged stakeholders who care about these questions and want to take action on the results. The steps in the CDC Framework are informed by a set of standards for evaluation. The 30 standards cluster into four groups:

Utility: Who needs the evaluation results? Will the evaluation provide relevant information in a timely manner for them?

Feasibility: Are the planned evaluation activities realistic given the time, resources, and expertise at hand?

Propriety: Does the evaluation protect the rights of individuals and protect the welfare of those involved? Does it engage those most directly affected by the program and changes in the program, such as participants or the surrounding community?

Accuracy: Will the evaluation produce findings that are valid and reliable, given the needs of those who will use the results?

Sometimes the standards broaden your exploration of choices. Often, they help reduce the options at each step to a manageable number. When engaging stakeholders (Step 1), for example, the standards prompt questions such as:

Feasibility: How much time and effort can be devoted to stakeholder engagement?

Propriety: To be ethical, which stakeholders need to be consulted: those served by the program or the community in which it operates?

Accuracy: How broadly do you need to engage stakeholders to paint an accurate picture of this program?

Similarly, there are unlimited ways to gather credible evidence (Step 4). Asking these same kinds of questions as you approach evidence gathering will help identify those that will be most useful, feasible, proper, and accurate for this evaluation at this time. Thus, the CDC Framework approach supports the fundamental insight that there is no such thing as the right program evaluation.
Rather, over the life of a program, any number of evaluations may be appropriate, depending on the situation.
Good evaluation requires a combination of skills that are rarely found in one person. The preferred approach is to choose an evaluation team that includes internal program staff, external stakeholders, and possibly consultants or contractors with evaluation expertise.
An initial step in the formation of a team is to decide who will be responsible for planning and implementing evaluation activities. One program staff person should be selected as the lead evaluator to coordinate program efforts. This person should be responsible for evaluation activities, including planning and budgeting for evaluation, developing program objectives, addressing data collection needs, reporting findings, and working with consultants.
The lead evaluator is ultimately responsible for engaging stakeholders, consultants, and other collaborators who bring the skills and interests needed to plan and conduct the evaluation. Although this staff person should have the skills necessary to competently coordinate evaluation activities, he or she can choose to look elsewhere for technical expertise to design and implement specific tasks.
However, developing in-house evaluation expertise and capacity is a beneficial goal for most public health organizations. The lead evaluator should be willing and able to draw out and reconcile differences in values and standards among stakeholders and to work with knowledgeable stakeholder representatives in designing and conducting the evaluation. Seek additional evaluation expertise in other programs within the health department and through external partners.
You can also use outside consultants as volunteers, advisory panel members, or contractors. External consultants can provide high levels of evaluation expertise from an objective point of view. Important factors to consider when selecting consultants are their level of professional training, experience, and ability to meet your needs. Be sure to check all references carefully before you enter into a contract with any consultant.
To generate discussion around evaluation planning and implementation, several states have formed evaluation advisory panels. Advisory panels typically generate input from local, regional, or national experts otherwise difficult to access.
Such an advisory panel will lend credibility to your efforts and prove useful in cultivating widespread support for evaluation activities. Evaluation team members should clearly define their respective roles. For some teams, informal consensus may be enough; others prefer a written agreement that describes who will conduct the evaluation and assigns specific roles and responsibilities to individual team members. Either way, the team must clarify these arrangements and reach consensus on them.
This manual is organized by the six steps of the CDC Framework. Each chapter will introduce the key questions to be answered in that step, approaches to answering those questions, and how the four evaluation standards might influence your approach.
In process evaluations, you might examine whether the activities are taking place, who is conducting the activities, who is reached through the activities, and whether sufficient inputs have been allocated or mobilized. Process evaluation is important to help distinguish the causes of poor program performance—was the program a bad idea, or was it a good idea that could not reach the standard for implementation that you set?
In all cases, process evaluations measure whether actual program performance was faithful to the initial plan. Such measurements might include contrasting actual and planned performance along several dimensions. When evaluation resources are limited, only the most important issues of implementation can be included.
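As a minimal illustration of such a comparison, the sketch below sets hypothetical actual counts against planned targets and reports how much of each planned activity was delivered; the activity names and numbers are invented for illustration, not drawn from any real program.

```python
# Hypothetical sketch: contrast planned vs. actual performance for a process
# evaluation. Activity names and counts are placeholders, not real data.

planned = {
    "home inspections completed": 200,
    "provider training sessions delivered": 3,
    "families referred for follow-up care": 150,
}

actual = {
    "home inspections completed": 142,
    "provider training sessions delivered": 2,
    "families referred for follow-up care": 149,
}

for activity, target in planned.items():
    achieved = actual.get(activity, 0)
    share = achieved / target if target else 0.0
    print(f"{activity}: {achieved} of {target} planned ({share:.0%})")
```

In practice, the planned targets would come from the program's work plan and the actual counts from its monitoring records.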
Our childhood lead poisoning logic model illustrates such potential process issues. Reducing EBLL presumes the house will be cleaned, medical care referrals will be fulfilled, and specialty medical care will be provided.
These are transfers of accountability beyond the program to the housing authority, the parent, and the provider, respectively. For provider training to achieve its outcomes, it may presume completion of a three-session curriculum, which is a dosage issue. Case management results in medical referrals, but it presumes adequate access to specialty medical providers.
And because lead poisoning tends to disproportionately affect children in low-income urban neighborhoods, many program activities presume cultural competence of the caregiving staff. Each of these components might be included in a process evaluation of a childhood lead poisoning prevention program. Outcome evaluations assess progress on the sequence of outcomes the program is to address.
Programs often describe this sequence using terms like short-term, intermediate, and long-term outcomes, or proximal (close to the intervention) and distal (distant from the intervention). Depending on the stage of development of the program and the purpose of the evaluation, outcome evaluations may include any or all of the outcomes in this sequence. While process and outcome evaluations are the most common, there are several other types of evaluation questions that may be central to a specific program evaluation.
These include, for example, questions about implementation and effectiveness. All of these types of evaluation questions relate to part, but not all, of the logic model. Implementation evaluations would focus on the inputs, activities, and outputs boxes and not be concerned with performance on outcomes. Effectiveness evaluations would do the opposite, focusing on some or all outcome boxes but not necessarily on the activities that produced them.
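One way to picture this, purely as an illustration, is to treat the logic model as a simple data structure and an evaluation focus as a selection of its components; the component names below are generic placeholders, not elements of any particular program's logic model.

```python
# Illustrative sketch only: a logic model represented as a dictionary, with
# implementation and effectiveness evaluations selecting different components.

logic_model = {
    "inputs": ["funding", "trained staff"],
    "activities": ["home inspections", "provider training"],
    "outputs": ["inspections completed", "providers trained"],
    "short_term_outcomes": ["provider knowledge increased"],
    "intermediate_outcomes": ["screening practices improved"],
    "long_term_outcomes": ["elevated blood lead levels reduced"],
}

# An implementation (process) focus covers inputs, activities, and outputs.
implementation_focus = {
    k: v for k, v in logic_model.items()
    if k in ("inputs", "activities", "outputs")
}

# An effectiveness (outcome) focus covers the outcome components instead.
effectiveness_focus = {
    k: v for k, v in logic_model.items() if k.endswith("outcomes")
}

print("Implementation focus:", list(implementation_focus))
print("Effectiveness focus:", list(effectiveness_focus))
```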
Determining the correct evaluation focus is a case-by-case decision. Purpose refers to the general intent of the evaluation. A clear purpose serves as the basis for the evaluation questions, design, and methods. Common purposes range from improving how a program operates to assessing its effects. Users are the individuals or organizations that will employ the evaluation findings. The users will likely have been identified during Step 1 in the process of engaging stakeholders. In this step, you need to secure their input into the design of the evaluation and the selection of evaluation questions.
Support from the intended users will increase the likelihood that the evaluation results will be used for program improvement. Many insights on use will have been identified in Step 1. Information collected may have varying uses, which should be described in detail when designing the evaluation.
Examples of uses include making mid-course modifications to program activities and documenting progress toward program goals for funders. Of course, the most important stakeholders are those who request or who will use the evaluation results. Nevertheless, in Step 1, you may also have identified stakeholders who, while not using the findings of the current evaluation, have key questions that may need to be addressed in the evaluation to keep them engaged.
For example, a particular stakeholder may always be concerned about costs, disparities, or attribution. If so, you may need to add those questions to your evaluation focus. Three questions provide a reality check on your desired focus: What stage of development is the program in? How intense is the program? What will resources and logistics allow? The first concerns stage: program development has roughly three stages (planning, implementation, and maintenance), and each suggests a different focus. In the planning stage, a truly formative evaluation (who is your target, how do you reach them, how much will it cost) may be the most appropriate focus.
An evaluation that included outcomes would make little sense at this stage. Conversely, an evaluation of a program in maintenance stage would need to include some measurement of progress on outcomes, even if it also included measurement of implementation.
Some handy rules can help you decide whether it is time to shift the evaluation focus toward an emphasis on program outcomes. The second reality check is the intensity of the program. Some programs are wide-ranging and multifaceted; others may use only one approach to address a large problem. Simple or superficial programs, while potentially useful, cannot realistically be expected to make significant contributions to distal outcomes, even when they are fully operational.
The third reality check is practical: resources and logistics may influence decisions about evaluation focus. Some outcomes are quicker, easier, and cheaper to measure, while others may not be measurable at all. These facts may tilt the decision about evaluation focus toward some outcomes as opposed to others.
Early identification of inconsistencies between utility and feasibility is an important part of the evaluation focus step. The affordable housing example shows how the desired focus might be constrained by reality.
The elaborated logic model was important in this case. It clarified that, while program staff were focused on the production of new houses, important stakeholders such as community-based organizations and faith-based donors were committed to more distal outcomes, such as changes in the life outcomes of families or the results of outside investment in the community.
You would likely conclude this is a realistic focus, given the stage of development and the intensity of the program. Questions about outcomes would be premature. It is not clear, without more discussion with the stakeholder, whether research studies to determine causal attribution are also implied. Is this a realistic focus?
At year 5, probably yes. The program represents a significant investment of resources and has been in existence long enough to expect some of the more distal outcomes to have occurred. Note that in either scenario, you must also consider questions of interest to key stakeholders who are not necessarily intended users of the results of the current evaluation.
Here those would be advocates, concerned that families not be blamed for lead poisoning in their children, and housing authority staff, concerned that amelioration include estimates of costs and identification of less costly methods of lead reduction in homes.
By year 5, these look like reasonable questions to include in the evaluation focus. At year 1, stakeholders might need assurance that you care about their questions, even if you cannot address them yet. These focus criteria identify the components of the logic model to be included in the evaluation focus. At this point, you convert the components of your focus into specific evaluation questions, for example:
Were my activities implemented as planned? Did my intended outcomes occur? Were the outcomes due to my activities as opposed to something else? If the outcomes occurred at some but not all sites, what barriers existed at less successful locations and what factors were related to success? At what cost were my activities implemented and my outcomes achieved? Besides determining the evaluation focus and specific evaluation questions, at this point you also need to determine the appropriate evaluation design.
Of chief interest in choosing the evaluation design is whether you are being asked to monitor progress on outcomes or whether you are also asked to show attribution—that progress on outcomes is related to your program efforts. Attribution questions may more appropriately be viewed as research as opposed to program evaluation, depending on the level of scrutiny with which they are being asked.
Evaluation designs fall roughly into three types: experimental, quasi-experimental, and non-experimental (observational). Traditional program evaluation typically uses the third type, but all three are presented here because, over the life of the program, traditional evaluation approaches may need to be supplemented with other studies that look more like research.
Experimental designs use random assignment to compare the outcome of an intervention on one or more groups with an equivalent group or groups that did not receive the intervention. For example, you could select a group of similar schools, and then randomly assign some schools to receive a prevention curriculum and other schools to serve as controls. All schools have the same chance of being selected as an intervention or control school.
Random assignment reduces the chances that the control and intervention schools vary in ways that could influence differences in program outcomes. This allows you to attribute change in outcomes to your program. For example, if students in the intervention schools delayed onset of risk behavior longer than students in the control schools, you could attribute that success to your program.
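To make the assignment step concrete, here is a minimal sketch of randomly assigning a pool of similar schools to intervention and control groups; the school names, the even split, and the fixed random seed are illustrative assumptions rather than a prescribed procedure.

```python
import random

# Minimal sketch: randomly assign similar schools to intervention (receives
# the prevention curriculum) or control. Every school has the same chance
# of ending up in either group. Names and the seed are hypothetical.

schools = ["School A", "School B", "School C", "School D",
           "School E", "School F", "School G", "School H"]

random.seed(42)          # fixed seed only so the example is reproducible
random.shuffle(schools)  # random order gives each school an equal chance

half = len(schools) // 2
intervention_schools = schools[:half]
control_schools = schools[half:]

print("Intervention:", intervention_schools)
print("Control:     ", control_schools)
```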
However, in community settings it is hard, or sometimes even unethical, to have a true control group.