The Missing Piece: Reflections on How M&E Helps to Address the Gender Data Gap in Development Programs


What is the role of evidence and data in helping us address gender inequities? How can monitoring and evaluation (M&E) efforts help women and girls around the world? Join us for a special episode of “Powered by Evidence” for International Women’s Day. Our guest host, Jessica Meckler, GEI’s lead on Gender and Inclusion, speaks with two experts on the issue: Talip Kilic, Senior Economist at the World Bank Development Data Group, and Megha Pradhan, Director of CLEAR South Asia and Associate Director of Training at J-PAL South Asia. Get ready for a deep dive on topics around data, M&E and gender.


  • [02:50] Why is bridging the gender data gap important?
  • [09:52] What is women's empowerment and when should we measure this concept when conducting evaluations?
  • [17:20] What are some of the challenges in measuring women's and girls’ empowerment and other gender dimensions, and what can we, as evaluators, do to mitigate these challenges?
  • [37:02] What kind of capacity development support for evaluators is needed to ensure that these gender dimensions are included in evaluation more broadly?
  • [46:33] Examples where successfully integrating gender dimensions into evaluations or research created a change in a policy or program.







Dugan Fraser

Hello, and welcome to GEI’s first podcast. I'm Dugan Fraser, Program Manager of the GEI. Since we're launching our pilot season of the podcast right around International Women's Day, we decided to focus our first discussion on a very important issue in monitoring and evaluation, the gender data gap. 

At GEI, we believe that gender equity is a fundamental human right and a critical priority for the M&E community. With the COVID-19 pandemic continuing to expose and exacerbate gender inequalities around the world, it's especially important that we keep gender at the forefront of our work right now.

In light of this topic, we thought it best to turn over hosting duties today to my colleague, Jessica Meckler, who's GEI’s strategy officer and the lead on our focus on gender and inclusion. Hi Jessica.

Jessica Meckler

Hi Dugan.

Dugan Fraser

Jessica, tell us a bit more about the conversation that our listeners are going to hear.

Jessica Meckler [00:01:34]

Sure, Dugan. I'm really looking forward to speaking with two fantastic guests today, Talip Kilic and Megha Pradhan. Talip is a senior economist at the World Bank Development Data Group. He was a core team member of the World Development Report 2021: Data for Better Lives.

He's also a team leader for the Living Standards Measurement Study, which has been the flagship household survey program of the World Bank since 1980. Megha Pradhan is the associate director of training at J-PAL South Asia and the Director of CLEAR South Asia, a global initiative for evaluation capacity development coordinated by GEI.

She provides technical and strategic direction to J-PAL South Asia's capacity building partnerships with state and central governments. Talip and Megha have a lot to share on the topic of incorporating gender dimensions into data collection efforts for evaluation, and I'm sure this is going to be a very valuable conversation for our listeners.

Dugan Fraser

Thanks, Jessica. I'm looking forward to hearing the discussion.

Jessica Meckler [00:02:39]

Thanks for joining us today, Megha and Talip. I'm really excited to chat with you, and I really can't wait to hear what you have to share with us. To kick us off, let's set the stage a little bit more broadly. Why is bridging the gender data gap important? Talip, can you share a few thoughts?

Talip Kilic

Thanks, Jessica, for having me today, and good to join you all. In answering your question, maybe it is useful to set the broader stage a bit for our audience. Despite the evidence showing that empowering women fuels economies in a way that benefits men and women more broadly, we're facing persistent and mutually reinforcing gender inequalities in economic, political, and social life across the world.

We know women lag behind in labor force participation, full-time employment, and earnings. Social and legal barriers to women's ownership of assets exist. This lowers their productivity, inhibits their ability to become entrepreneurs, and also forces them into less profitable sectors.

We need transformational solutions to tackle the challenges faced by women and girls and men and boys. Ultimately, eliminating gender inequality is only possible if you have the right data to measure these gaps and understand their drivers and also the interrelationship between these gaps.

For solutions to eliminating gender inequality and ultimately raising living standards for everyone, we need good data. We need good data on individuals, on the lives of men and women. One of the key sources of the data that allow us to measure gender gaps is household surveys.

Household surveys typically collect individual level data on men and women, on women and girls. And yet, the survey landscape presents so much room for improvement. So one of the key aspects for household surveys to improve on is, for instance, asset ownership.

Most assets are owned by individuals. Yet it's typical for household surveys to either not identify owners for key assets, or to identify owners but only interview a single most knowledgeable household member.

That often results in seemingly rich information about who owns what within the household, but it only comes from the perspective of one single individual. The research that we've conducted over the last decade as part of the Living Standards Measurement Study program, which I'm a part of, has shown that proxy respondents provide information that is not in line with individuals' own reporting, particularly for women.

The resulting data, in turn, bias our understanding of gender gaps in asset ownership and wealth. The proxy respondent problem also applies to other topics such as labor, where surveys collect information on individuals but don't always interview the individuals themselves.

In a way, the gender data gap is important because we need rich information on the livelihoods of men and women and on differences among them in terms of access to and returns from economic opportunities, which, again, underlie the solutions that we need to design to not only effectively close those gaps but actually raise living standards for everyone.

Jessica Meckler [00:06:06]

That was a really good overview of how surveys and this kind of data are important. Megha, do you want to share a little bit more on the practitioner perspective?

Megha Pradhan

First of all, thank you so much Jessica for having me over. I'm really excited to be doing this. Firstly, what I'd like to do is make this a little bit more concrete and think about what really happens in a world where you have a pervasive gender data gap when you're either not systematically collecting data on half of the world's population, or what you're measuring is not really giving you the true picture.

Because we’ll really be going into the research and measurement side of this, I do want to take the practitioner and policymaker perspective here. How does one design programs and policies that are attempting to change something which is not even being measured or you do not have the data to track it over time?

Then there are several situations where you're designing and implementing a supposedly gender-neutral program, but you take the man as the default. You're not taking into account, for instance, women's productive and community-managing roles.

You're not really addressing women's practical and strategic needs. The results that you have are instances, for example, where you're introducing new varieties of rice seeds which may be high yielding, but their stalks are too short or they take long to cook. And so they're not taken up by female farmers.

You have well-intentioned programs, for instance, encouraging parental engagement in children's education. But the disproportionate burden of that falls on the mother, adding to a long list of chores she's already doing. These are just a few examples, and we've seen the gender data gap manifest itself across diverse domains ranging from healthcare to public transport to technology.

Secondly, what I want to talk about is really how the COVID-19 pandemic has underscored how this gender data gap - and its consequences - are really playing out on a massive scale. You've seen across the world women and girls are disproportionately affected by the COVID-19 pandemic with the loss of livelihoods, increased burdens of unpaid care work, increase in gender-based violence.

The lack of gender data in this context has really affected our ability to respond to the crisis. Unfortunately, this is also happening at a time where data collection efforts themselves have been disrupted due to the pandemic.

Then the last thing that I want to mention in terms of why bridging the gender data gap is really critical right now is how now with the increasing use of machine learning that uses massive amounts of data that already exists to find patterns and make predictions, there is a concern that gender data gaps are potentially getting embedded in artificial intelligence systems and they're really further amplifying the gender bias and issues of gender inequality.

Jessica Meckler [00:09:18]

Thanks, Megha. I think it's very important to ground these discussions in these real life examples, so I think the perspectives you just brought were really important. I think it's really useful to see how the gender data gap comes into play both at the higher level policy and larger scale data world as well as at the practitioner levels and the idea of designing good programs and policies that address the needs of various peoples.

To continue on, what I'd like to do is start unpacking this idea of women's empowerment which Talip talked about and we're going to focus on a little bit and really just dive into what is women's empowerment and when should we measure this concept when conducting evaluations. Megha, would you like to start us off?

Megha Pradhan

Thanks for that question, Jessica. It's indeed very important for us to first understand and precisely define concepts before we attempt to measure them. This is, of course, especially important when we are looking at constructs like women's empowerment, women's agency - which are often used loosely and interchangeably.

A great definition of empowerment is one put forward by Naila Kabeer who describes empowerment as a process by which those who have been denied the ability to make strategic life choices acquire such an ability.

I also want to talk about a related concept which is that of agency. And we can understand women's agency as their ability to define goals and act on them to make decisions that matter to them and to participate in the economy and public life.

A third concept that's also really important - and related - is gender norms, which essentially refers to informal rules that impose expectations about behavior that are dependent on gender, and these norms could originate from the individual, from family, or from society.

All these constructs that I’ve described are closely linked. Again, Naila Kabeer provides a very useful resources, agency, and achievements framework which emphasizes empowerment as a process and how it is important to measure three things.

One, the resources or preconditions that could enhance women's ability to make choices. This could be their access to material resources, human capital, or social resources. Two, women's agency in terms of their voice, participation, influence in decision-making about strategic life choices.

Three, the achievements or outcomes that could result from increased agency. This could include things like health, education, livelihoods, political participation, et cetera.

Measuring indicators related to resources, agency, and achievements over the course of an evaluation can be a really helpful way to measure the process of empowerment. Of course, underlying all of this are the cultural, social, and political norms, as well as institutions that are operating in a given context and influence every step of the empowerment process.

Now, moving to your question on when should we be measuring empowerment and other gender dimensions. My answer here would be that we should always consider gender in measurement and evaluations even when the program is not targeting one specific gender. I'll mention a few reasons.

I think, first, when you're not considering gender in monitoring and evaluation or research, this may bias what we measure and what data we collect. This is something that Talip already alluded to. Especially in household surveys, a lot of times we are interviewing people who might not be the respondent that actually has the true information.

For instance, for the adult module that's typically implemented in household surveys, we’d look at the head of the household, who oftentimes, especially in contexts such as India, is going to be male. And so what happens here is that you're not getting a complete or accurate representation of what women's views are. This also happens in other kinds of surveys, for instance, if you're looking at leaders, politicians, business owners where women may not be represented as much.

The second reason why you need to have gender dimensions irrespective of what kind of program it is, is that programs often have different impacts on people, according to the gender or other aspects of their identity. There are, I think, lots of interesting examples that have shown this.

For instance, one could see the impact of providing loans or other financial services to SMEs. Evidence has shown that positive impacts were seen on businesses which were owned by men, but not on businesses owned by women. This is because women did not have control over economic resources at home, and their loans got used in their husband's businesses instead of theirs.

Relatedly, I think it's really important to not just think about the gender dimensions in terms of disaggregating our analysis by gender. I think it's very important to also go beyond that and see why you see the results that are there. Why is it that women are showing differential impacts from what we see in men?

Jessica Meckler [00:14:40]

Thanks, Megha. I think you've brought up a lot of interesting points including the idea of needing to define some of these nebulous concepts as well as where this starts to come into play with surveys and measuring data and when you're conducting evaluations, some of the things you need to keep in mind.

Talip, would you like to share some more thoughts from your experiences with these large household surveys and how this fits in?

Talip Kilic

Yeah. Megha covered a lot of ground, particularly on the conceptualization of empowerment. I'd like to second the utility of Naila Kabeer’s framework for our understanding of empowerment and also for guiding the topics on which we collect information in household surveys, which continue to be enduring innovations that offer us this scientifically robust, structured approach to understanding people's livelihoods, the constraints that they're facing, and the choices that they're making.

But in terms of when we should consider measuring women's empowerment, Megha made some key points that I'd like to emphasize. First of all, even when the program is seemingly gender-neutral, you may, in fact, have gendered impacts.

In a way, you need to set yourself up to be able to capture these gender impacts whether or not actually the program has an explicit focus on gender gaps as we've been trying to make the point that these gender inequities are really underlying much of the development outcomes that we're trying to achieve at the individual level.

As evaluators, we need to have the tools in place to be able to look at gender impacts. And that means good data on individuals that are self-reported about their own lives and experiences as opposed to hearing these human stories from the perspectives of others or hearing these stories in the context of, say, operations and in household surveys that are not particularly designed in a gender-sensitive way. I'll give a few examples of that later on in the podcast.

Jessica Meckler [00:16:59]

I think you emphasized some of the most important points. What this has nicely done is set out why the gender data gap is important, when we need to start thinking about this in evaluation, which is very overarching. And so, with that framework and that basis, I'd like to dive into some of the challenges that this poses.

What are some of the challenges in measuring women's and girls’ empowerment and other gender dimensions, and what can we as evaluators do to mitigate these challenges? Talip, would you like to share a few thoughts on that?

Talip Kilic

Sure. Maybe I'll come at this from a perspective of large-scale household surveys which, again, form the basis of representative individual level data that we'd like to collect on beneficiaries as well as non-beneficiaries of the programs of both men and women - boys and girls - to be able to look at these gendered impacts.

In the household survey world, irrespective of whether these are official household surveys that are conducted by national statistical offices or smaller-scale surveys that are conducted specifically for evaluations of different programs and interventions, I think we have a long way to go to make sure that these household surveys are designed and implemented in what we call a gender-sensitive way.

Because if not, again, this has resulting impacts on data quality and on whether we understand these gaps and their drivers in the correct way. I'll give a few examples. In the context of household surveys, I think incorporating gender perspectives into the design process from the get-go is critical just like the way we mentioned this for evaluation purposes.

In the household survey world, for instance, there's quite a bit of work we still have to do in recruiting our enumerators in a gender-balanced way, and in making sure that, when we are eliciting information from human subjects that is sensitive in nature, gender-specific dimensions of that information collection are taken into account.

In specific cases, we do need to think about gender matching between respondents and enumerators to ensure the intended information flow and to establish that rapport with the respondent, again, to be able to capture as accurately as possible the information that we need.

With many of the topics that we introduce in household surveys, again, to better understand gender gaps in asset ownership and in work and employment outcomes, there are different norms and belief systems that our enumerators or the survey managers subscribe to that are not always evident at the surface.

When we're trying to collect data on these topics that are, say, nontraditional in certain settings that we're trying to introduce, unpacking people's understanding of these concepts and actually training them on why it's important to collect data on such topics is also important.

Training in a way is important to motivate the importance of this work to people that are not accustomed to thinking about it or thinking about it in a particular way, where we're trying to expand the perspectives that they should keep in mind when engaging with human subjects such that certain types of behavior or certain types of priming when they're asking the questions - when they're conducting the interview - are absent from the process.

I would say a few other things, Jessica, since you asked. Many of the topics that, again, we need data on are supposed to be investigated at the individual level. So I made this point before where we ought to be interviewing individuals asking about their life experiences from their own perspectives. I think surveys, we see, are quite heterogeneous in terms of the extent to which they allow for proxy respondents.

Proxy respondents are convenient because they are a cost-saving measure. When you're conducting household surveys, you could be collecting seemingly individual level data, but reported from a perspective of one person or a few people in the household where you don't go that extra distance to try and schedule the interview with someone that may not be available at the time you tried the first time to conduct that interview in that household.

I would say in the large-scale household surveys conducted by national statistical offices in low-income settings, there is not sufficient attention being paid to this particular area.

I would say that, particularly in the last five years, as part of the Living Standards Measurement Study, we really have shone a light on the importance of this particular topic in large-scale household surveys, where we have supported national statistical offices in scaling up best practices in conducting personal interviews in private settings and making sure that we minimize the use of proxy respondents.

The last set of issues that I would highlight that speak to the challenges of collecting gender data are actually related to the absence of validated, cost-effective survey tools that we need to be able to collect information.

There are many areas where it's not obvious how you should ask the question. It's not obvious how you should design the questionnaire. It's not obvious how you should conduct the interview. I think agency as a broad area, say, going back to Naila Kabeer's work, also has several dimensions of agency that we'd like to capture.

Many of them are around, say, perceptions, self-reported data that are quite sensitive to the way that you ask questions or the order in which you ask questions or the order in which groups of questions are asked in the context of an interview - whether you give examples or not in the context of a question to explain topics, how that impacts data quality.

There are intricacies associated with the development of these tools that we would like to rely on to get the data that we need. Yet, that research and development need continues to persist in some areas that are fundamental to measuring empowerment.

Another good example would be time use. If you think about our understanding of gender differences in paid work versus unpaid care and domestic work, we rely on time use data collection where individuals report their time allocations to specific activities.

Time use modules, time use surveys, they predominantly rely on 24-hour recall. From your individual perspective, maybe we're not the typical respondent of a household survey in a low-income country.

But 24-hour recall could be challenging for educated and non-educated respondents for very different reasons. Yet the reliance on 24-hour recall as a convenient tool, as a cost-effective tool, continues to persist. We don't actually know whether this is a validated way of capturing time use.

The road to good data requires a lot of planning, a lot of experimentation, a lot of investments in public goods in research and development that is often underprovided. I think that is also one aspect of constraints that we need to collectively solve.

Not only investments in data but investments in the foundations of this data-generation process that allows us to better understand gender gaps and ultimately devise programs to address them and also evaluate those programs to look at the gendered impacts.

Jessica Meckler [00:25:39]

I absolutely love the term road to good data because I think what you've just described is so many of the challenges that come up across the entire pathway of planning for, collecting, and then looking at data to understand high quality data that gets at these individual lived experiences and responses and how difficult that is.

I think that was a really great way of looking at it across the implementation as well as the methodological aspects. So I think that was really useful. Thanks for walking us through all those points. Megha, I'd like to give you some time as well to share your experiences and your thoughts on the challenges and ways to mitigate these.

Megha Pradhan

Issues can come in at different points in the process of measurement starting with the construct itself and choosing the indicators to getting responses. We've seen how sometimes gathering even very simple data such as household size can be quite complex, and this becomes even more difficult when you're designing a measurement strategy for gender outcomes.

Of course, the good news is that there are some simple and practical tips as well as creative measurement techniques that can help mitigate these challenges.

The first challenge that I’d like to talk about is the fact that empowerment as a concept is complex and nebulous. It's beyond just measuring access to resources or changes in well-being. It is about meaningful choice a lot of times.

We usually only observe the outcomes of choices and not the real decision-making process itself which means that a lot of times it is hard to know whether changes in well-being that we see are indeed the result of women's increased ability to make choices.

As an example, in an evaluation in rural Bangladesh, researchers found that more married women took on income-generating activities as a result of an empowerment program which initially appeared to be an empowering labor market choice.

Yet, in qualitative interviews that were conducted with these women, researchers found that some of them were actually working out of severe economic necessity and many had limited autonomy in choosing what income-generating activity they were participating in.

There are issues like these which come up, and there are several ways researchers can try to tackle this challenge. One direct way to measure choice is simply to ask people about the decision-making process itself.

One could ask men and women general questions about who makes decisions in the household. For example, who usually makes decisions about, say, major household purchases or visits to the family or healthcare.

You could also ask more specific questions about a very concrete scenario which is tailored to the choices of women that they care about in the study context. This is oftentimes easier for them to answer accurately.

The second way one could do this is to measure women's and men's preferences at baseline and track whether these outcomes move in the direction of women's preferences after the program. For instance, in one study on women policymakers in India, researchers found that women's preferences for investments in local infrastructure were very different from men's.

They tracked new infrastructure projects in communities to see whether they moved to be more in line with women's preferences. And so this approach was very useful because you could explicitly then measure what women want which is a key part of the definition of empowerment.

Then the last approach I want to talk about is how you can observe women and men making choices directly. And this could be in a real world setting or could also be through a game or some structured community activity. This again is quite helpful because you’re able to observe actions directly. You do not have to rely on people’s reports of how they participated in decisions. For example, we could count the number of times women speak up in community meetings and see if it increases after a particular program has been conducted.

Another example from a study is where researchers created proxy measures of women's bargaining power in a marriage. They measured this by offering women a choice of getting either a slightly smaller cash transfer delivered directly to them or a slightly larger cash transfer delivered to their husband. There are interesting techniques like this where you can actually observe their choices directly.

The second challenge specific to women's empowerment is that empowerment means different things in different contexts. But we may also want to compare across different contexts. The issue here is that we need to strike a balance between using locally tailored indicators and not customizing some of the standardized measures that already exist so much that they no longer lend themselves to the meaningful intercountry or interregional comparisons that you may want to make.

Here again, there are different strategies and tips that one can employ. The first, of course, is using findings from formative research that you do at the beginning to select or develop locally tailored indicators and questions.

As an example, take the Bangladesh study that I was alluding to before. There, instead of using a more general question on broader decision-making - for instance, asking who usually makes decisions about healthcare for yourself and your family - researchers actually tailored it to the local context and came up with very specific questions.

For instance, they asked questions such as: If your child is sick and needs immediate healthcare but your husband is not at home, what would you do? If you ever needed medicine, who would purchase it? Do you think you can go and buy it yourself? In this way, they were able to design questions which were really tailored to the local context and gave much more nuanced responses in that context.

Then, of course, after you've developed some locally tailored questions, you also want to pilot them in the field before launching the survey to validate them. You really want to make sure that you're asking the right questions and that the questions make sense to the people that you're interviewing.

There's a really interesting example from a study in Peru where a surveyor is interviewing a woman about decision making. The surveyor asks the woman, "In your house, if you have to purchase a cow, is it your husband who decides when to buy the cow? Or do you decide?" The woman responds, "We both decide."

Then the surveyor persists and says, "Okay, so both decide. But how do you decide? You talk about it and get into an agreement?" The woman responds, "Oh, it almost always works that way. He comes and tells me, let’s go to the village fair to buy the cow. I tell him, fine, let’s go."

I think this is just a really interesting example that highlights how, while the woman is interpreting the decision-making process as joint because it involves some sort of conversation with her husband, in fact it appears that the husband is the primary decision maker.

I think there are a lot of interesting insights that you can get from piloting your survey. Then, of course, I think the broad tip would be that it's really useful to complement context-specific indicators of measurement with more standardized ones.

Then the last challenge that I want to highlight is how, when it comes to empowerment, many aspects are susceptible to reporting bias. For example, people may not answer truthfully about sensitive topics like gender attitudes, reproductive health, contraception use, violence, etc. There is a high likelihood of social desirability bias, where respondents give answers that they think the surveyor wants to hear or that are in line with generally accepted social norms.

Oftentimes one also sees that respondents' stated preferences may be different from their revealed preferences, meaning that people say one thing but do another. Here as well, there are a lot of interesting solutions that one can implement in the field.

For instance, when possible, it’s really advisable to complement indicators that are subject to reporting bias with more objective proxy indicators. As an example, take something like incidence of unprotected sex, which is a very sensitive question to answer. Instead of that, you could use incidence of childbearing among young women.

Another strategy is to really try and triangulate an outcome using multiple indicators or perspectives when we don't have the ideal measure. I think the last strategy is to frame questions in a creative way. For instance, you could indirectly ask about certain situations, coming up with a hypothetical example. You could also use non-survey instruments such as games, experimental vignettes and implicit association tests to mitigate reporting bias.

I think there are also interesting examples that are out there. For instance, in an evaluation of female leaders in village councils in India, researchers played a short recording of a speech by a local leader responding to complaints from a villager. Respondents were randomly assigned to hear the same recording spoken by a man or a woman. After the speech was over, they were asked to rate the leader's performance and effectiveness.

Something like this really allowed researchers to measure whether there was some sort of subconscious bias that led people to rate female leaders as relatively less effective. Lots of interesting strategies that can be employed to address some of the challenges that I mentioned.

Jessica Meckler [00:36:42]

I think what we've seen from this last little bit of conversation is that, while there are a lot of challenges, there's also a lot of innovative solutions that evaluators and researchers have already been working with.

What I want to do is actually circle back to something that you, Talip, had mentioned a little earlier about the importance of training, and ask you both: what kind of capacity development support for evaluators is needed to ensure that these gender dimensions are included in evaluation more broadly? Because as we just heard, there are a lot of challenges and a lot of potential solutions, but there’s a lot of thought that needs to go into this.

It'd be great to get your thoughts on this kind of capacity development support and what could be useful. Megha would you like to share a few thoughts first?

[Question #4]

Megha Pradhan

Sure. Thanks Jessica. I think the first thing that I'd like to say is that to improve the inclusion of gender dimensions in evaluations, and in research more broadly, it's very important to work with a broader set of stakeholders: not just evaluators but also implementers and commissioners of evaluations. I think this requires a combination of advocacy, awareness building and capacity building activities.

I'd like to talk about some of the small steps that we've taken in this direction. The CLEAR South Asia Center is hosted by J-PAL South Asia. J-PAL had developed this fantastic research resource on measuring women's and girls’ empowerment in impact evaluations that collated insights from the experience of researchers around the world.

As part of some of the dissemination that we were doing, we saw that there was a strong interest from diverse stakeholders across the gender space who wanted more detail on strategies to measure gender outcomes. In response to this interest, we conducted a five-day online workshop on applying a gender lens to program evaluation in 2021.

This actually targeted practitioners. They were from NGOs, development organizations and foundations working in the gender sector in South Asia.

As part of this workshop because this was not targeted at researchers but at practitioners, we started with the building blocks of monitoring and evaluation. We looked at concepts of theory of change, measurement and impact evaluation as I mentioned, to really create awareness also around the need for these tools.

Then we moved into more advanced explorations of the constructs of interest, such as agency, empowerment and norms. I think overall the participants found this really exciting. They really learned how to develop accurate and precise measurement tools for these hard-to-measure concepts. I think broadly, we've seen that there is a great appetite in the community of stakeholders who work on gender issues.

As I said, this doesn't have to be researchers but really implementing organizations who are designing programs focusing on women. I think there's a lot of interest from them to build capacity on this topic.    

The second part, which we are yet to start on and which I think is extremely important, is embedding gender dimensions in government monitoring and evaluation systems. As you know, a large part of the work that we do at CLEAR South Asia is working with governments and helping them build better M&E systems.

I think moving forward as we work with governments at the state and the central level on developing monitoring and evaluation policies and frameworks, we really want to make sure that there is a strong focus on gender in a very deliberate and intentional manner.

For instance, ensuring that it's featured in evaluation guidelines and that it's part of the terms of reference when the government is commissioning evaluation studies, et cetera. Of course, I think at this stage this is a little aspirational. It is going to take time. But I think it's very crucial for institutionalizing many practices that take gender into consideration.

Jessica Meckler [00:41:12]

What's really useful about what you've just shared with us is the importance of bringing this capacity development across multiple different sectors and stakeholders, like you said: not just to evaluation professionals and researchers but also to practitioners who participate in and conduct a lot of monitoring and evaluation activities. I think that is a really nice way of framing it. I really appreciate that.

Talip would you like to share some of your experiences or your thoughts on supporting the capacity development aspect?

[Question #5]

Talip Kilic

Thanks Jessica. I think one of the central points that we've been trying to make today is that not every data set that we lay our hands on to understand gender gaps is created equal, right? If the right data are not collected, or if the data are biased, the resulting policies that use these data for targeting or for evaluation may be flawed.

I think from the perspective of capacity development for evaluators, one of the things to put in place is to make sure that they recognize this fact. I think that means providing evidence that speaks to the importance of collecting household survey data in a gender-sensitive way, paying attention to our respondent selection protocols and our field implementation protocols to get that self-reported information using international best practices.

The importance of doing that for the resulting data and for the resulting analysis is a general set of ideas that evaluators need to subscribe to. To the extent that these nuances are not apparent, they have to be fundamental ideas that are reinforced in the capacity development of evaluators.

Circling back to some of the ideas that I mentioned earlier on, I think having in place very clear practical guidelines and validated survey tools for practitioners that are open access, publicly accessible and user friendly is critical for operational staff or evaluators to rely on, adapt and incorporate into their evaluation work program. These resources are again public goods. They are under-invested in. As such, research and development efforts that make available these guidelines and validated survey tools, anchored in peer-reviewed research, should be supported.

But you know the story doesn't end there, right? You can have your guidelines. You can have your open access tools. Without that close engagement with the evaluators as Megha was also alluding to through several examples, it's not clear that they will be adopted or that they will be implemented in the right way.

I think that hands-on technical assistance in every stage of the evaluation process is something that we often see to be needed in low- and middle-income countries for the adoption of these methods, in a way that also appreciates the gender perspectives and the nuances that we're trying to bring into this data collection, and the importance of challenging some of the narratives, the myths and the beliefs about gender relationships that may be inherent, say, in the human subjects that we rely on to collect the data from our sampled respondents.

I think that hands-on technical assistance in the adoption and implementation of these tools on the part of the evaluators is also something that is needed to ensure that, again, gender dimensions are included in evaluation more broadly.

Jessica Meckler [00:46:00]

Thanks Talip. What I really like about what we've been discussing is that it covers how important this is, what some of the challenges are, and how we can continue to think not only about building our own skills as evaluators but also about some things that the field can do as a whole to help support the incorporation of measuring gender and closing the gender data gap overall. I think we've had a really rich discussion.

As we start to get to the end of our time together, what I'd like to do is turn a little bit more to a storytelling mode and ask you to share an example that you've seen where successfully integrating gender dimensions into evaluations or research created a change in a policy or program, to really emphasize how this can be used in practice.

Talip, would you like to share an example from your end?

[Question #6]

Talip Kilic

Yeah Jessica. Thanks a lot. Maybe I'll again give an example from the household survey world, in terms of influencing household survey programs so that they start to collect better gender data as a result of long-term engagement and statistical capacity building with our clients, which for us, as part of the Living Standards Measurement Study, are national statistical offices.

There are several examples where incorporating gender dimensions into large-scale household surveys has dramatically altered our understanding of specific types of gender gaps. This evidence was also a key part of the momentum that we've been able to build, particularly over the last five years, in advocating for better methods for collecting individual-disaggregated data on economic opportunities and welfare.

A good example again from the Living Standards Measurement Study program is the work that we did on understanding gender productivity differences in agriculture with a focus on Africa. Actually we have made the decision to collect plot-level agricultural survey data in large-scale household surveys, nationally representative surveys that identified the managers of these plots within the household and also gave us very precise measures of agricultural productivity materializing on these plots.

The available evidence on gender productivity gaps in much of Africa had been based on small-scale, non-representative studies. By collecting that data in several countries that have been supported by the Living Standards Measurement Study - Integrated Surveys on Agriculture program, we’ve shown that the estimated gap ranges from 13% to 25% in Africa, estimated at the national level.

The same data in turn propelled research that has showcased that closing the gender gap in agricultural productivity would lift up to a quarter of a million people out of poverty in Malawi alone.

I think these headline figures, showcasing that you can collect this data in large-scale household surveys in a way that refreshes our understanding of these gaps, have been critical for building this momentum in getting to better gender data bit by bit.

We've seen that in the last five years we've been able to scale up the preferred methods, the international best practices, on individual-disaggregated survey data collection on asset ownership and on work and employment, piggybacking on this evidence of better gender data facilitating better analysis and a better understanding of realities on the ground.

As part of the Living Standards Measurement Study - Plus program, we've been supporting national statistical offices in doing things that many people told us, you know, five years ago couldn't be done.

These are seemingly simple things, where we are interviewing individuals in private and asking about their lives and living conditions so that they can report for themselves, as opposed to other people reporting for them. I mean, these are simple ideas. But as we have repeatedly said, they have fundamental implications for whether we get these stories right.

Getting these stories right versus wrong makes all the difference for policy decisions that we would like to promote for inclusive growth. In a way we want to stay committed to a culture that collects better gender data in household surveys. In LSMS we say that every data point has a human story. We want to know the genuine stories of our respondents and our storytellers and capture their voices as opposed to proxies.

Through programs like the LSMS, where we have development and validation of survey tools feeding into large-scale survey operations, we can contribute to closing gender gaps but also provide tools that are in turn valuable for survey practitioners elsewhere, who can then rely on these tools, adapt them to their settings and collect the information that they need.

Jessica Meckler [00:52:53]

I love the idea and how you just framed it. It's not just about data but about the stories: getting the real-life stories from the people who've lived them. That's what feeds into this data and gives it the human face. I really love how that example and the work that you've just talked about show that. I think that’s a beautiful way of talking about data and why it's important, especially on the subject of gender. Thank you for that.

Megha, would you like to share an example from your work?

[Question #7]

Megha Pradhan

Thanks Jessica. I think we've already talked about quite a few examples of gender-neutral programs where looking at gender dimensions threw up some surprising results. I thought I’d use this opportunity to actually talk about a gender-transformative program which is being adopted as a policy in India.

This is a school-based gender attitude change program which was developed by an Indian NGO called Breakthrough and delivered to adolescents in government schools in the state of Haryana, a state that actually has the most male-skewed sex ratio amongst all Indian states.

I want to talk about this example for a few reasons. One, I think the program was quite unique. Unlike standard empowerment programs that focus on providing life skills like negotiation training to adolescent girls, this had both adolescent boys and girls participate in classroom discussions about gender equality: things like who works outside of the house, who's responsible for household chores, etc.

Secondly, this program looked at a range of really interesting outcomes. It looked at how this two-and-a-half-year program changed the gender attitudes, aspirations and behavior of adolescents. Here, as you can imagine, there was a lot of interesting measurement work involved in terms of eliciting responses from the students.

There were innovative techniques like the implicit association test that were used in addition to self-reported measures. Then they looked at how students were actually behaving, apart from, again, self-reported behavior.

For instance, to look at these outcomes, they actually asked students to sign a petition against dowry. They really wanted to see: had their reported behavior actually translated into actual behavior? A lot of really interesting measurement work was done as part of this study.

Third, I think even the results are very interesting. On the positive note, of course, we saw that students expressed more progressive gender attitudes. But there was a nuanced result when it came to behavior. We saw that behavior change amongst boys was actually much higher than for girls.

For instance, boys reported doing more chores but girls did not report doing fewer chores. The idea was that girls perhaps continue to face greater external constraints when it came to enacting the change.

In terms of the policy change I think the fantastic thing is that this program has actually been adopted by the government of Punjab. It is being implemented in government schools throughout the state. It's actually being delivered by government school teachers.

I think it's a great example of a program that is trying not just to consider gender but also to really change very sticky gender norms. I just want to add one thing, since we are towards the end of the podcast.

Talip spoke a lot about the work that needs to be done on providing training and capacity building. I also want to add that gender data, gender statistics, all of this really also involves a cultural change, right? It's not something which should be seen as a nice-to-have, a question at the bottom of the document asking whether you have done a gender-disaggregated analysis.

What would be really fantastic, and that's the change that we're hoping to see, is that it's really mainstreamed in a way that it's part and parcel of the entire process. It's something that people want to do rather than something that they have to do as a tick mark. I think really thinking about it as a cultural change is really important.

Jessica Meckler

Absolutely. I think that's a great place to end our conversation. I wish it could have gone on much longer. But I want to thank you both so much for joining us today and sharing both the great work that has been done and also the work that you're currently doing.

I think those examples, and Megha, your last point, really summed up how important this is, how far we've come and how far we continue to want to go to really build that culture of having gender incorporated into evaluation and research work. Having you both here today sharing your perspectives was an absolute treat for me. I look forward to hearing more about what you continue to do in this area in the future. Once again, Megha, Talip, thank you so much for joining us and sharing all of your insights.