AEA Conference 2023 TIG Sessions

TIG Annual Meeting

Day/Time: Tuesday October 19, 2023 7:00 PM – 8:00 PM ET

Agenda

Learn about volunteer opportunities
Meet like-minded people
Hear the TIG Year in Review

Storytelling as game changer - From dry data to impactful advocacy using Agile project management approach (PD Workshop)

Day/Time: Tuesday 9:00 AM – 4:00 PM ET
Presenters: Marcel Chiranov, Roxana Chiranov

Based on three case studies, we will train the audience to implement the Agile approach by following a step-by-step process and using an automated online data management system, which we have previously used to obtain the relevant impact data for tailored success stories. Following a well-targeted, impact-based advocacy campaign, the program partners received a 25% budget increase from their legal funder, while 82% of them reported an “increased improvement” in their cooperation with stakeholders over the program’s duration. This appears to be rather successful storytelling!

Learning Objectives:
The participants will learn different approaches to measuring a project’s impact.
The participants will be able to use their impact data in innovative ways to create success stories.
The participants will understand how to plan and implement agile project management techniques in a volatile environment using technology.

Determining Reach, Inclusion, Effectiveness in Humanitarian Contexts - Using Data Visualization and Automated Data Systems to Adapt Humanitarian Programs to Meet the Needs of People Affected by Crisis

Day/Time: Wednesday 4:15 PM – 5:15 PM ET
Presenters: Alex Tran, Amy Joce, Johnstone Lowoton, Martin Peter, Tatiana Klymenko
Room: White River Ballroom D

Join us as Mercy Corps humanitarian MEL practitioners present three use cases of this automated process and how we were able to use it to evaluate programs in relative real-time, tell different narratives/stories that go well beyond basic reporting, drive forward tangible change to make programs more equitable and inclusive, and ask major questions of our humanitarian response approaches.

TIG Poster Session

Day/Time: Wednesday 5:30 PM – 7:00 PM ET
Room: Griffin Hall

85 - Spreadsheets, but helpful!

Presenter: Doris Espelien

We know two things: A report is only as good as the data you feed it, and organized data makes our work easier. However, spreadsheets can be both tedious and overwhelming - unless you know how to make them work for you! This workshop will help you improve your Excel skills as you walk through an exercise using real data. Using data management principles, you will learn about planning your spreadsheet layout based on what you actually want to know, how to use basic formulas and formatting to make data cleaning (and life) easier, and get a brief introduction to pivot tables and Excel's visualizations. Excel truly provides a host of ways to manage your data and this workshop can be your springboard to putting this tool to effective use!

86 - Using Video and other 'Social Media Style' formats in lieu of weekly journal reflections in a 12 week Fellowship Program for young adults

Presenters: Theresa C. Fox, Jamé McCray

Journaling has been around since the beginning of recorded history. This reflective practice has long been used to help consolidate and deepen learning in students, particularly in the field of education. Where reflective practice is embedded in environmental education programs, it can be an effective tool, promoting personal growth of the teaching profession in a field where engaging constituents can be tricky. However, there are also barriers to engaging in reflective practice. A main barrier often cited is making the time to incorporate reflection. When compounded with a lack of experience in reflective practice, reflection can be ignored entirely. The Alliance for Watershed Education has embedded reflective practice into its fellowship program with a two-fold intention: first, to promote reflective practice, and second, to collect process data about the larger program. Over time, the program has used multiple methods to try to improve reflection rates, with some success; however, weekly participation rates suggest that the process is still less than desirable for some fellowship participants. Earlier efforts have included additional prompts from supervisors and the program manager to complete these activities when they are past due, with limited success. One method to encourage and enhance the reflective process will incorporate the option to use social media, in lieu of responding to a weekly reflection prompt, to make reflection more appealing and a more planful event. It is hypothesized that this change will provide a less restrictive, more creative outlet for reflection, while still ensuring the opportunity. This study is an exploratory research project on how changing the reflection methodology impacts the quantity of reflections, the timeliness of reflections, the qualitative information gained from reflection as compared to essay prompts and, most importantly, the overall reflective experience.
The larger research question is to what extent does the use of social media as a method of reflective practice impact the fellow who is engaging in professional learning? Data will be presented to elucidate the findings.

How can AI Language Models help evaluators tell better stories?

Day/Time: Thursday 10:15 AM – 11:15 AM ET
Chair: Stephanie Coker
Discussants: Linda Raftree, Kerry Bruce; Presenters: Paul Jasper
Room: 104

The session will posit that AI Language Models bring incredible opportunities to scale qualitative analysis to tell powerful and representative stories of social change, but that biases, opacity of coding, and rules orienting AI software and tools make them imperfect at best and dangerous at worst. Before moving into breakout groups, discussants will provide a brief overview of the basics of the newest AI language models, how they work, and offer advice and caution to evaluators on applying these models to their work. Discussants will orient session participants on popular Large Language Model (LLM) tools (e.g. ChatGPT, Bard, LLaMa) and relevant applications (e.g. sentiment analysis, theme identification and aggregation in qualitative data) of AI Language Models in the field of monitoring, evaluation, and learning. After brief remarks from the discussants, we will draw on attendees’ insights and experiences to reflect on the risks and opportunities for applying AI language models to evaluation practice and evidence-informed storytelling. Breakout groups will cover key questions like ‘what are the best uses for LLMs in evaluation?’, ‘what are the key challenges with LLMs for evaluation?’, and ‘how can evaluators address ethical issues of LLMs?’. Groups will share back key insights in plenary before hearing final remarks from the discussants. Much of the session will borrow from the early lessons and explorations of a cross-regional, multidisciplinary, 200+ member-strong Community of Practice working to understand the needs, opportunities, and risks of emerging types of machine learning and AI language models in evaluation practice.

Using ChatGPT to Enhance Program Evaluation's Storytelling

Day/Time: Thursday 10:15 AM – 11:15 AM ET
Presenters: Crystal Luce, Antonio Olmos, Rachel Carlson
Room: Grand Ballroom 9

AI tools from OpenAI, such as ChatGPT, are starting to make their mark on the world around us. Stories about these tools have appeared in the news and on shows such as “Last Week Tonight with John Oliver” and “South Park.” Questions remain about their use in fields such as program evaluation and their ability to tell an accurate story. The process of program evaluation involves collecting, analyzing, and summarizing data to tell an effective story and make decisions based on the insights gained. ChatGPT, a language model developed by OpenAI, offers a range of features that can assist program evaluators in collecting, analyzing, and reporting data. The possibilities of such tools seem limitless. The potential areas of use include: help with statistical analysis (what test may be appropriate when, reminders of assumptions, and help with completing the analysis), help with writing code (in languages such as Python, SQL, and R), help with literature reviews (finding articles and summarizing them), help with finding and creating questions (tests, surveys, interviews, and focus groups), help with qualitative analysis (sentiment analysis and themes), and help with data summarizing (qualitative and quantitative). In this demonstration, we wish to highlight some of the potential uses of ChatGPT for evaluators and discuss some of the potential consequences. Tools such as ChatGPT can be useful for program evaluators and our storytelling abilities; the applications are infinite. However, left unchecked, this technology can have lasting negative ramifications, and we must be aware of the potential harm that could be created. In this demonstration, we hope to show both the benefits and account for the costs of such technology.

Introducing new information technologies to development partners: Stories from the field

Day/Time: Thursday 2:30 PM – 3:30 PM ET
Chair: Kecia Bertermann
Presenters: Kerry Bruce, Swapil Shekhar, Michael Bamberger
Room: 201

The panel will review experiences with the process of assisting development partners to implement new information technologies. Most AEA presentations on information technology have tended to focus on the methodology and the benefits of the new kinds of data collection and analysis, but have given relatively little attention to the challenges that arise during implementation. Introducing new technology can be quite disruptive (in both positive and negative ways), and the process can sometimes affect the originally intended outcomes once the technology is operating. Some new technologies can also introduce unanticipated biases due to issues of data quality and sample bias, or the socio-cultural and political orientations of the different groups involved. Three presenters will tell stories about their experiences with the implementation of different kinds of information technologies and will reflect on the lessons learned. It is hoped that the presentations and discussion will provide insights and reflections for the many agencies that are involved in supporting local partners with the transition to potentially beneficial new information technologies. This session is a collaboration between the International and Cross-Cultural Evaluation (ICCE) and Integrating Technology into Evaluation (ITE) TIGs. The presentation on machine learning was submitted by the ICCE, and the proposals on experiences strengthening the Indian government’s use of survey and administrative data and on using big data to strengthen humanitarian programs were submitted by the ITE.

Sources and consequences of bias in big data-based evaluations: Stories from the field (Presidential Strand)

Day/Time: Thursday 5:00 PM – 6:00 PM ET
Chair: Michael Bamberger
Presenters: Pete York, Linda Raftree, Randal Pinkett
Room: Grand Ballroom 7

The panel examines the causes and consequences of bias when evaluations are based on big data and data analytics. The session will begin with the presentation of a framework for identifying the main types of biases in big data. Three presenters will then tell stories about (a) the use of big data in different evaluation contexts; (b) using a data-driven Diversity, Equity and Inclusion (DEI) approach to navigating the bias terrain; and (c) MERL Tech’s Natural Language Processing Community of Practice (a group of academics, data scientists, and evaluators from NGOs, UN and bilateral organizations). Each presentation will describe the benefits of using big data, the kinds of bias that can be introduced, and the consequences of the bias in terms of under-representation of vulnerable groups, as well as the operational and methodological consequences. The focus on bias is important because some advocates claim that big data is more “objective” than conventional evaluation methods because sources of human bias are excluded. While big data and data analytics are powerful tools, the presentations will show that this claim of greater objectivity is overstated. Following the presentations, the panelists will discuss common themes and issues across sectors and ways to address each of the sources of bias.

Emerging AI in Evaluation: Implications and Opportunities for Practice & Business

Day/Time: Thursday 10:15 AM – 11:15 AM ET
Chair: Bianca Montrosse-Moorhead; Discussant: Sarah Mason
Presenters: Nina Sabarre, Kathleen Doll, Sahiti Bhaskara, Linda Raftree, Aileen Reid, Zach Tilton
Room: 103

Recent groundbreaking advances in artificial intelligence (AI), namely the November 2022 launch of ChatGPT, have ushered in a renewed wave of intrigue and discourse surrounding the individual, collective, and global implications of AI. To date, evaluations of and with artificial intelligence have largely been underdeveloped and under-explored. In this session, four sets of panelists will share their use of AI in their evaluation practice, offer implications of artificial intelligence for conducting and teaching evaluation, and invite discussion on questions such as: What role should artificial intelligence play in evaluation? What does it make possible for smaller evaluation firms? What practical, ethical, methodological, and philosophical challenges does emerging AI pose? Which parts of evaluation are inherently human? How much will evaluators and evaluation users trust the products of AI-generated or AI-assisted work? How do we train emerging evaluators to work in a world in which AI is prevalent?

Integrating technology into evaluation (TIG Multi-paper Session)

Day/Time: Friday 2:30 PM – 3:30 PM ET
Chair: John Baek
Room: Grand Ballroom 2

Machine Learning algorithms for predicting university student dropout

Presenters: Elvira Celardi, Antonio Picone

The Italian university system is characterized by a significant level of student dropout. This has adverse effects on several fronts: on the socio-economic side of the country, due to the lack of return on investments made in skills growth; on the university institutions, in terms of reduced national funding and revenue; and, finally, on students, who have to rethink and reorient their life paths. It is, therefore, important for universities to be able to make predictions about the outcomes of their students' academic careers. Indeed, knowing whether and how many students are at risk of not completing their studies allows the university to intervene with support and guidance, helping those students to continue and finish their studies on time. In recent years, an essential contribution in this regard has come from technologies related to Artificial Intelligence (AI) and, in particular, to Machine Learning (ML) algorithms. These algorithms can analyze a student’s characteristics, academic history, and learning style to predict the likelihood of success or failure and recommend personalized interventions tailored to their needs. This can help improve the effectiveness of interventions and support. This paper presents a project carried out at the University of Catania, Italy, where a novel ML algorithm capable of providing predictions of career outcomes for students in some of the University's departments was developed and tested. The model showed how helpful such tools can be in helping educators and counselors make more informed decisions about students' college careers by analyzing large amounts of data and identifying patterns and trends that may not be immediately apparent through traditional methods.

Project Monitoring and Evaluation: Application of Data Analytics for Interview Response Grading

Presenters: Zhi Li, Carl Westine

This study explores if an automatic text analysis tool, LIWC-22, can effectively assist evaluators in monitoring and formatively evaluating the qualitative data collection process using the example of a grant-funded research project. LIWC has been utilized in various fields. However, no existing study embeds LIWC into a formative evaluation to understand the features of what key stakeholders value in the qualitative data they collect. Following Robinson et al. (2013), LIWC is used to analyze three common questions from 75 interviews about the experiences of underrepresented mathematics doctoral students. LIWC output variables that are highly correlated with team members’ quality ratings of data transcripts will be used. A Principal Component Analysis (PCA) will be conducted to identify the factor structure of the ratings involving the significant LIWC variables. This can assist the evaluator in understanding what linguistic features are more commonly present in the most valued transcripts. Ultimately, the aim is to guide evaluators in identifying practical suggestions for research teams to modify protocols or retrain interviewers.

Revolutionizing Program Evaluation: Harnessing the Power of Artificial Intelligence tools like ChatGPT (Ignite Session)

Day/Time: Friday 2:30 PM – 3:30 PM ET
Presenter: Nishank Varshney
Room: 209

Program evaluation has traditionally been a time-consuming and resource-intensive process due to the nature of the work involved and the skills required. However, with the advent of artificial intelligence (AI) tools such as ChatGPT, the evaluation process has the potential to be revolutionized. Through this Ignite presentation, I will highlight 20 ways evaluators can utilize ChatGPT and similar free or inexpensive AI tools to conduct more efficient, accurate, and effective evaluations.
AI tools can help speed up the process of program evaluation by providing fast and efficient ways to analyze data. In addition to existing AI tools that can analyze massive samples of quantitative data, natural language processing tools such as ChatGPT can understand and analyze text data from surveys, interviews, and other sources. This can save time compared to manually coding and analyzing text data. AI tools like OpenAxis can help find higher-quality datasets and generate visualizations such as graphs and charts in a fraction of the time.
Tools like ChatGPT can also help evaluators review their evaluation questions and ensure that the questions align with the purpose of the evaluation and the indicators used. They can also help review data collection tools such as surveys, interview protocols, and focus group protocols. Additionally, AI tools like Microsoft Designer, Adobe Firefly, and Photosonic can help design quick presentations and art for evaluators.
Finally, I will also discuss the challenges, limitations, and ethical considerations around the use of AI in program evaluation.

Incorporating Feedback in Theories of Change Through Simulation

Day/Time: Friday 3:45 PM – 4:45 PM ET
Presenters: Cynthia Phillips, Lisa Gajary, Anand Desai
Room: 103

We focus on two technologies, machine learning and computational simulation modeling, to re-imagine the development of Theories of Change (ToC) for evaluating investments in research. Although individual program case studies have been an underutilized source of evidence in evaluations, over the last decade case studies have been introduced in several countries (for example, in the United Kingdom’s Research Excellence Framework) to supplement bibliometric and other information in the evaluation of higher education institutions. First, we describe how machine learning can be used to examine research impact case study narratives so that mental models that scaffold the structure and implementation of research programs can be described. Second, we elaborate on how systems-based approaches to developmental evaluation can aid (1) in identifying mental models that underlie these case studies and (2) in formalizing mental models as computational simulation models. Third, we use these mental models derived from the case studies to develop theories of change of the research programs. In a complex system, there can be multiple paths to reach a specific objective or to bring about change. We identify multiple decision points within a ToC where we can use simulation to examine the potential consequences of choosing different options. We also incorporate feedback and dynamics in a research program’s ToC and use simulations to examine, in silico, their effects on program implementation. We illustrate our approach with a discussion of scenarios where simulations are used to examine multiple ToCs that incorporate the program dynamics and feedback encountered in complex evaluations. We enrich and clarify our insights by weaving in illustrations from a subset of the 6,975 cases submitted to the United Kingdom’s Research Excellence Framework (2014).