Optimizing Data Ingestion Workflows: A Jobs-To-Be-Done Study for Data Engineers

Context

In the spring of 2024, the Data & AI user research team conducted a jobs-to-be-done (JTBD) study to inform the roadmap for watsonx.data. Using Anthony W. Ulwick’s JTBD framework, I partnered with another researcher to investigate what data engineers need in order to "ingest" data.

Due to the nature of this project, I have omitted detailed information and am focusing on the process.

Role

Researcher

Toolbox

Research, Surveying, Usability Testing

Timeline

October 2023 - December 2023

Watsonx.data, released in May 2023, is IBM's open, hybrid, and governed data store.

Upon release, the Product Management team needed a deeper understanding of data engineers’ complex needs to shape the product roadmap. To support this, UX research applied the Jobs-to-Be-Done (JTBD) framework, which focuses on understanding users’ underlying goals rather than specific solutions.

Partnering with another researcher, I defined the primary job and created a job map for data engineers performing data ingestion. Data ingestion is the first step in the data pipeline, where data from multiple sources is collected and moved to a storage system such as watsonx.data for analysis and use. This work helped the team identify key pain points and informed actionable recommendations for the product roadmap.

The first phase was to conduct desk research and internal interviews to draft a hypothesized main job, which laid the foundation for the job map.

Desk Research

I learned the JTBD framework through reading and reviewing previous Data & AI studies, and built knowledge of data stores using articles, podcasts, YouTube, and past research.

Internal SME Interviews

We conducted 17 interviews with internal SMEs for the IBM products set to integrate with watsonx.data. We met with PM, Design, Engineering, and Sales over the course of 2 weeks with the following objectives:

  • Outline the (preliminary) main job

  • Validate that the job executor is a data engineer

Synthesis

Based on the internal interviews, we created a hypothesized main job for data engineers doing data ingestion: to build solutions for moving data from a source to target storage system in preparation for use in business-relevant tasks.

Hypothesized Main Job

Key Findings

We also learned:

  • Ingestion is often understood as “Extract and Load,” with “Transform” being optional.

  • The data engineer’s challenges during data ingestion relate to data governance and intended use.

  • The maturity and size of a company define the roles and the boundaries of their responsibilities.
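
To make the first finding concrete, here is a minimal sketch of ingestion as "Extract and Load" with an optional "Transform" step. It is only an illustration under stated assumptions: the `ingest` function, the in-memory `source_rows` list, and the `target` list are hypothetical stand-ins for real source and target systems, and none of it reflects how watsonx.data or any participant's pipeline actually works.

```python
from typing import Callable, Iterable, Optional

Record = dict


def ingest(
    extract: Callable[[], Iterable[Record]],
    load: Callable[[Record], None],
    transform: Optional[Callable[[Record], Record]] = None,
) -> int:
    """Move records from a source to a target and return how many were loaded.

    The transform step runs only when one is supplied, mirroring the idea
    that ingestion is "Extract and Load" with "Transform" being optional.
    """
    count = 0
    for record in extract():
        if transform is not None:
            record = transform(record)
        load(record)
        count += 1
    return count


# Hypothetical in-memory source and target, standing in for real systems.
source_rows = [{"id": 1, "name": " Ada "}, {"id": 2, "name": "Grace"}]
target: list = []

# Extract and Load only: records arrive in the target unchanged.
loaded = ingest(lambda: source_rows, target.append)
```

Passing a `transform` callable, for example one that strips whitespace from `name`, turns the same call into Extract, Transform, and Load without changing the rest of the flow.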

The second phase was to validate the hypothesized main job and create a job map for data engineers doing data ingestion.

Screener

Other research squads were conducting JTBD studies for data engineers involved in data preparation and governance. At this point, the squads converged to create a shared screener, since data engineers have responsibilities that can overlap across ingestion, preparation, and governance. Because of this overlap, we allowed each participant to take part in a maximum of two interviews across squads.

External Interviews

The Ingestion squad conducted 9 interviews with people who identified skills that reflected our hypothesized main job of building solutions to move data from a source to a target storage system. We met with data engineers in addition to system engineers, data scientists, senior data analysts, and a head of technical architecture and IT operations.

Synthesis

Synthesis of the interviews followed the steps outlined by Ulwick. We outlined the job steps, social aspects, emotional aspects, needs, and circumstances of data engineers focused on data ingestion.

Synthesis template used to outline the job steps, social aspects, emotional aspects, needs, and circumstances of data engineers focused on data ingestion.

Revised Main Job

The hypothesized main job was revised to reflect what we learned in the interviews. The revised main job was to move data to a centralized repository for use by downstream stakeholders.

The hypothesized main job was revised based on our findings.

We created a job map and identified 48 outcome statements that reflected the revised main job.

Job Map

The next step was to create a job map that reflected the revised job statements and place the job steps into one of eight phases: define, locate, prepare, confirm, execute, monitor, modify, and conclude.

Job map for data engineers doing data ingestion that placed their job steps into one of eight phases: define, locate, prepare, confirm, execute, monitor, modify, and conclude.
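
As a rough sketch of how a job map like this could be represented as data, the snippet below groups job steps under the eight phases in order. The phase names come from the study; the `build_job_map` helper and the example steps, including which phase each one sits in, are hypothetical and are not the study's actual job steps.

```python
# The eight job map phases, in the order used by the study.
PHASES = (
    "define", "locate", "prepare", "confirm",
    "execute", "monitor", "modify", "conclude",
)


def build_job_map(steps):
    """Group (phase, step) pairs under the eight phases, keeping phase order."""
    job_map = {phase: [] for phase in PHASES}
    for phase, step in steps:
        if phase not in job_map:
            raise ValueError(f"unknown phase: {phase}")
        job_map[phase].append(step)
    return job_map


# Hypothetical job steps for illustration only.
example_steps = [
    ("define", "Understand the end-use goal for the data"),
    ("locate", "Identify the source systems"),
    ("monitor", "Recognize problems in the pipeline"),
]
job_map = build_job_map(example_steps)
```

Keeping every phase as a key, even when it has no steps yet, makes gaps in the map easy to spot during synthesis.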

Outcome Statements

We used the Outcome-Driven Innovation (ODI) strategy to identify and rank the specific outcomes that are critical for data engineers to successfully perform data ingestion. By focusing on these desired outcomes, the watsonx.data team could directly address data engineers' needs and maximize value within the ingestion phase.

We identified 48 outcome statements based on the needs and pains expressed by participants in association with the job steps. Outcome statements included measures such as the time it takes to understand the end-use goal for data and the time it takes to recognize problems in the pipeline.
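
For readers unfamiliar with ODI, the sketch below shows one common way outcomes are ranked: Ulwick's opportunity algorithm, where opportunity = importance + max(importance - satisfaction, 0). This is an illustration, not a record of our analysis. The `Outcome` class and `rank_outcomes` helper are hypothetical names, the importance and satisfaction values are invented for the example, and the statements are paraphrased from the two examples above.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    statement: str
    importance: float    # 0-10 scale; hypothetical survey value
    satisfaction: float  # 0-10 scale; hypothetical survey value

    @property
    def opportunity(self) -> float:
        # Ulwick's opportunity algorithm:
        # importance + max(importance - satisfaction, 0).
        # Over-served outcomes (satisfaction > importance) get no bonus.
        return self.importance + max(self.importance - self.satisfaction, 0.0)


def rank_outcomes(outcomes):
    """Return outcomes sorted from highest to lowest opportunity score."""
    return sorted(outcomes, key=lambda o: o.opportunity, reverse=True)


# Invented ratings, for illustration only.
outcomes = [
    Outcome("Minimize the time it takes to understand the end-use goal for data", 8.0, 6.0),
    Outcome("Minimize the time it takes to recognize problems in the pipeline", 9.0, 4.0),
]
ranked = rank_outcomes(outcomes)
```

With these made-up numbers, the pipeline-problems outcome scores 14 and the end-use-goal outcome scores 10, so the first would be treated as the larger opportunity.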

48 outcome statements for data engineers before, during, and after data ingestion.

Based on the ODI survey, we produced 6 key recommendations to help the AI product address data engineers' pain points.

The findings were presented to watsonx.data's leadership, and my final deliverable for this study was an internal w3 website that allows IBMers to view our findings.

We identified what we believe to be the most impactful outcomes for each phase.

Cross-user research collaboration

This was my first time teaming up with other user researchers on a project. Working with others who had similar skills was rewarding because I was able to learn how others conduct research.

Generative Research

My previous studies involved more testing and were at times more straightforward. Due to the generative nature of this study, I found myself confused or frustrated at times. During this process, I sharpened my question-asking skills and became more comfortable with ambiguity.

More Projects

Aligning Cross-Functional Teams: Product Discovery Workshop for Data Integration

Enhancing Onboarding: A Benchmark Study of New User Experiences

Contact

Have questions? Get in touch!

© 2025 Designed & Created by Lesedi Khabele-Stevens using Figma & Framer
