When it comes to thinking about how best to evaluate a policy or program regarding citizen feedback, my head spins with possible measurements and what they really mean. I’m perplexed by the question: what is impact? And how can you quantitatively measure it within a study? It sends my mind off into the abyss as I stare blankly into the distance. Part of my scope of work here at UNICEF Uganda is to design a pilot impact evaluation of U-Report. Back in June, when I read through the original concept note, it made sense on paper and the questions were few. Yet over the past seven weeks or so, as I have learned more about what U-Report does and interviewed key informants around the city, this simplicity has dissipated and been replaced with total perplexity.
Now this isn’t as complicated as I am making it out to be. There are numerous pieces of literature out there that answer my questions, there are ample courses available on the subject, and there are countless professionals who devote their careers to this. There are clear ways to go about M&E. But the existence of someone else’s knowledge shouldn’t stop your own questioning, right?
So let’s start with a quick summary of the traditional M&E track. M&E in its purest terms is a process that helps to improve performance and achieve the intended results of a policy or program. It strives to increase the efficiency and effectiveness of the current management of outputs, outcomes, and impacts. Monitoring and evaluation are typically lumped together, though they entail different things. Monitoring is the periodic collection of data to track the progress of a program; it is not as focused on measuring impact per se. Evaluation differs in that data collection happens as part of the evaluation itself and is undertaken with the purpose of assessing outcomes or impact.
Within the development conversation, there has been a clear shift in the perceived importance of M&E. These practices are gaining more attention as grand development theories, along with the effectiveness of aid in general, are questioned. M&E was crowned with importance by the Paris Declaration on Aid Effectiveness in 2005, along with the follow-up meeting in Accra. The idea is that more effective evaluation will lead to more effective aid.
But the pressure to evaluate our aid programs for greater effectiveness also creates a pressure to produce results. This need to show results is another reason for the large presence of M&E practices. When one is using these studies to secure increased government funding, continued funding, or new grants, there is a need to quantify the effects: we provided x amount of supplies to x number of districts, and test scores increased by x amount. Funders like to see these concrete numbers. But is it possible that they don’t tell enough? That they are misleading us? Pushing us to fund certain programs and not others? Bypassing the incentives and interventions that create real sustainability and development? Are we more interested in showing these tangible results, for our own sake, or are we more interested in building continuity and change?
This debate is rife within evaluation. The economist Abhijit Banerjee attributes poor development practice to the lack of evidence in decision making. Banerjee in fact calls for greater use of facts and figures, believing randomized trials are the simplest and best way of assessing the impact of a program. He believes the development field must adopt two practices to bridge this gap: conduct randomized experiments to assess impact, and outline concise projects with clear desired outcomes from the start. He says we need to go back to financing projects and insist that the results be measured. It is the narrowly defined projects, Banerjee holds, that tend to be more easily measurable as well as more successful. Best practices can therefore only be chosen if they have proven successful in several randomized evaluations. In other words, it is those programs and policies that can produce quantifiable results and impacts that will survive the bureaucratic policy funding gauntlet.
On the opposite side of the spectrum one finds Andrew Natsios of the Center for Global Development. He argues that inadequate development practices and outcomes are in fact caused by this need to measure, measure, measure. He points to the recent organizational shifts in USAID as the cause of both this new urge to quantify everything and the decreased effectiveness of our development programs. He gives five reasons: it creates a tendency to implement only outcome-driven, measurable programs; it places greater emphasis on short-term service delivery goals over long-term institution building; it suppresses innovation and creativity within the field; it values compliance workers over technical experts; and it is expensive.
Let’s take a time out, leave Banerjee and Natsios in their corners, and think about an example in the education sector for a second. For an organization, it is much easier to quantify the provision of tangible resources. We built a school, we brought textbooks and supplies, we provided new uniforms. These actions are easy to quantify, as is their impact. It is most likely that the programs increased school enrollment. But what impact do they have on the system? Providing scholarships to young girls continuously for 10 years will not show an impact until, say, 20 years later, when the girls have gone from 5 years old to 25 years old and are using their education for employment and empowerment. Despite the acclaimed success of scholarship programs, fewer are given today than in the early 1980s. The trend may be attributed to several factors, but it is very likely due in part to the pressure over the years to produce more outcome measurements. Scholarships make this quite difficult. I think this mentality can also be found at the root of the imbalance between good governance and rule of law programs on the one hand and health programs on the other. Health outcomes can be quantified, while governance outcomes are much harder to measure.
Within the debate discussed above, the monitoring and evaluation method in use is impact evaluation. Impact evaluations are defined by the World Bank as “the systematic identification of the effects – positive or negative, intended or not – on individual households, institutions, and the environment caused by a given development activity such as a program or project.” Easy enough. Essentially, impact evaluations want to figure out what the life situation would be like had there been no policy or program intervention. This is typically referred to as the counterfactual. A counterfactual is needed to show that the intervention caused an impact of some degree. It serves as the baseline against which the effect of the program or policy is compared. You want the two groups, the one that received the intervention and the one standing in for the counterfactual, to be as similar as possible in order to link the change to the intervention.
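To make the counterfactual logic concrete, here is a minimal sketch in Python of the simplest possible comparison: the average outcome in the group that received the intervention minus the average outcome in the comparison group. The function name and the numbers are my own inventions for illustration, not data from any real program, and the estimate only means something if the two groups really are similar in everything but the intervention.

```python
from statistics import mean


def estimated_impact(treated_outcomes, comparison_outcomes):
    """Difference in mean outcomes: treated group minus comparison group."""
    return mean(treated_outcomes) - mean(comparison_outcomes)


# Hypothetical outcome scores, invented purely for illustration.
treated = [62, 70, 68, 75]      # group that received the intervention
comparison = [60, 64, 61, 67]   # group standing in for the counterfactual

impact = estimated_impact(treated, comparison)
print(f"Estimated impact: {impact:.2f}")
```

A real impact evaluation would go further than this (randomization, baseline measurements, tests of statistical significance), but the core question is the same subtraction.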
For my project this summer, the issue is how the provision of data affects the decision making process. Arriving at the counterfactual is actually not that difficult, as one can look at the current decision making process without the use of U-Report as the counterfactual. The trouble enters when we begin to compare the two groups and search for impact. As we work to design the pilot, we run through this debate of what impact is. Eventually this evaluation will be scaled up and could become a model of evaluation for the 15 other countries that currently use U-Report. Deciding on these outcome indicators is therefore tricky. Check out this blog here for more information on project indicators.
In terms of looking at the impact of citizen feedback data, a couple of questions come to mind. What do people consider a successful use of citizen feedback data? How do we note this impact? What does the impact evaluation design have to look like to capture it? And of course, most generally, what is impact? Impact, relating to U-Report, can mean several things. The ideas I’ve mentally toyed with are below…
- Informing the citizen population about upcoming issues or events and increasing their participation and involvement in these events
- Civil Society Organizations receive more relevant feedback and input respective to their communities of concern. They then use this to better advocate for their cause and create more appropriate policies.
- Monitoring and Evaluation teams use the citizen feedback data to highlight gaps or inadequacies in service delivery. During field visits, these issues are focused on and the necessary change occurs.
- Local government (LG) officials (district level and below) incorporate the voice of their people into their planning process and budget plans. This does not require that the central government or the level above approve it, or that budgets or policy actually change; rather, the focus is on the impact of LGs advocating for community needs.
- Central government officials incorporate the voice of their people into their own policy design, and resources are allocated based on the community’s highlighted need. This would show up as a concrete change in the budget from one year to the next, or as revised or new policies on the books.
So you can see, U-Report has several avenues of impact. My worry is that if we make impact too contingent upon a change in budget and resource allocation, we miss out on a number of other influential components of U-Report. Within the government system of Uganda, I am learning that the decentralized system sounds wonderful on paper but is not always executed in practice. The central government still holds much power, and local governments struggle with a lack of political and fiscal autonomy in setting their agendas. How practical is it to look for a change or impact where it may not be possible? It is quite probable that district levels may get this information through U-Report, advocate for it, and incorporate it into their own development plan or budget, BUT it is not approved by a higher level. It is also possible that funds are not available for such initiatives. If the indicator is defined as a tangible change in policy or budget, U-Report could show little impact. And I am not the only one who questions this attempt to square a circle. ReBOOT talks in greater detail about the challenge of quantifying the results of open government initiatives.
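A small sketch in Python shows how much the choice of indicator can swing the result. Everything here is hypothetical: the district names, the three indicator fields, and the values are invented for illustration and do not come from U-Report data. The point is only that the same set of districts looks very different depending on which definition of impact you count.

```python
# Hypothetical records, invented for illustration: how far each district
# took a piece of citizen feedback.
districts = [
    {"name": "District A", "advocated": True, "in_local_plan": True, "budget_changed": True},
    {"name": "District B", "advocated": True, "in_local_plan": True, "budget_changed": False},
    {"name": "District C", "advocated": True, "in_local_plan": False, "budget_changed": False},
    {"name": "District D", "advocated": False, "in_local_plan": False, "budget_changed": False},
]


def share_meeting(records, indicator):
    """Fraction of records for which the chosen indicator is true."""
    return sum(1 for record in records if record[indicator]) / len(records)


# Measure the same districts under three different definitions of impact.
indicators = ("advocated", "in_local_plan", "budget_changed")
results = {indicator: share_meeting(districts, indicator) for indicator in indicators}

for indicator, share in results.items():
    print(f"{indicator}: {share:.0%} of districts")
```

In this made-up example, defining impact as local advocacy counts three of the four districts, while defining it as a changed budget counts only one.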
So you see, how you define impact affects the measured success of your intervention, so plenty of careful thought should go into the design and evaluation stage. At the start of any policy creation or policy design, a set of indicators should be selected and used to measure the intended impact of the program. But the questions remain. What constitutes impact? And to what extent should we quantify our indicators? In my current interviews around the city I pose the question, “What do you consider a successful impact of citizen feedback data?” Stay tuned for an upcoming post that will compile these responses. Perhaps my ideas won’t be too far off!