Enhancing Onboarding: A Benchmark Study of New User Experiences

Context

For my second project at IBM, I conducted baseline benchmarking to test a guided tour in Modeler Flows, a drag & drop Machine Learning (ML) tool.

Due to the nature of this project, I have omitted detailed information and am focusing on the process.

Role

Lead Researcher

Toolbox

Research, Surveying, Usability Testing

Timeline

December 2021 - January 2022

Previous research found that Modeler Flows was hard to learn.

Because of this challenge, the Modeler Design team built a Guided Tour that introduced first-time users to the tool by guiding them through the process of building a classification model.

Since the request for research came from the design team, I started by doing stakeholder interviews to learn more about Modeler and the Guided Tour.

After conducting brainstorming sessions with Design and PM, I delivered an official Research Kick-Off presentation that outlined the reasoning behind a benchmark study, the research plan, outstanding questions, and next steps.


Slides from research kick-off presentation that established research objectives and guidelines.

Objectives

  • Test Guided Tour to determine how effective it is at guiding users to build a classification model.

  • Establish baseline UX metrics to measure against in future iterations.

Methodology

  • Benchmark Study - Baseline

  • Moderated, between-groups tests: Modeler without Guided Tour & Modeler with Guided Tour

  • Collect performance measures via task-based testing with follow-up questions

  • Collect participant perceptions via surveys

Target Persona

  • Data Scientist with no prior experience with IBM Watson Studio

Number of Participants

  • Total: 16 (8 - Guided Tour & 8 - Without Guided Tour)

Behavioral Metrics

Metrics –––> Targets (Industry Standards)

Time on task –––> Time on task < 3x expert task time

Task completion –––> Unassisted task completion > 100%

Errors –––> Errors per task < 3

Assists –––> Assists per task < 3
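
As a rough illustration of how the time-on-task target can be checked, the sketch below compares an observed time against three times an expert's time. The function name and the sample times are hypothetical and are not figures from the study.

```python
def meets_time_target(observed_min: float, expert_min: float,
                      multiplier: float = 3.0) -> bool:
    """Return True when the observed time is under multiplier x expert time."""
    if expert_min <= 0:
        raise ValueError("expert_min must be positive")
    return observed_min < multiplier * expert_min


# Hypothetical values for illustration only (not study data).
expert_time_min = 5.0
within_target = meets_time_target(12.0, expert_time_min)   # 12 < 15
outside_target = meets_time_target(20.0, expert_time_min)  # 20 >= 15
print(within_target, outside_target)
```

The comparison is strict, so a participant who takes exactly three times the expert time does not meet the target.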

Attitudinal Metrics

Metrics –––> Targets (Industry Standards)

Single Ease Question (SEQ) - Pre & Post –––> SEQ score ≥ 5

System Usability Scale (SUS) –––> SUS score > 66

Net Promoter Score (NPS) –––> NPS score > 22
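
For readers unfamiliar with how SUS and NPS are scored, here is a minimal sketch of the standard scoring rules. It is not the analysis script from this study; the function names and the sample responses are made up for illustration.

```python
from typing import Sequence


def sus_score(responses: Sequence[int]) -> float:
    """Score one participant's 10 SUS answers (each 1-5) on the 0-100 scale."""
    if len(responses) != 10 or any(r < 1 or r > 5 for r in responses):
        raise ValueError("SUS needs 10 responses, each from 1 to 5")
    total = 0
    for index, response in enumerate(responses):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded: contribute response - 1.
            total += response - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded: contribute 5 - response.
            total += 5 - response
    return total * 2.5


def nps(ratings: Sequence[int]) -> float:
    """NPS = percent promoters (9-10) minus percent detractors (0-6)."""
    if not ratings:
        raise ValueError("at least one rating is required")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100.0 * (promoters - detractors) / len(ratings)


# Hypothetical responses for illustration only (not study data).
example_sus = sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2])
example_nps = nps([9, 10, 7, 8, 3, 5, 6, 2])
print(example_sus, example_nps)
```

With only 8 participants per group, each person moves the NPS by 12.5 points, so small groups can produce large swings in the score.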

I created a screener targeting data scientists and posted it to 2 recruiting platforms.

Although the study targeted data scientists, I allowed participants with different job titles when they met the required data science skill criteria.

In the moderated tests, participants completed 2 tasks, filled out a survey, and answered post-task questions.

To evaluate the effectiveness of the tour, we conducted an A/B test in which one group of participants completed the tasks with the tour and the other group completed the same tasks without it. Both groups were asked to 1) sign up for a new account and create a modeler flow, and 2) build a classification model to predict which drug to administer to patients. This setup allowed me to capture performance metrics, observe usability challenges, and gather user perceptions, providing actionable insights to inform future iterations of the experience.

Task 1: Create a New Modeler Flow

Task 2: Build a classification model to predict which drug to give patients

Over two weeks, I synthesized insights and developed a note-taking template.

Initially, I focused on aligning the notes with the recordings to ensure no details were missed. As I filled in gaps, I also organized and affinitized the data in Mural, clustering observations to reveal patterns and key themes.

Virtual whiteboarding to answer research objectives

Because the benchmark study captured both quantitative and qualitative data, I needed a systematic way to track metrics alongside observations. To do this, I created a note-taking template that guided my documentation of each participant’s user path as they completed the tasks, ensuring consistency and enabling clear analysis across sessions.

Note-taking template I created that outlined the Golden Paths to complete Tasks 1 & 2.
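
A minimal sketch of how per-session notes like these could be rolled up into the behavioral metrics, assuming one record per participant. The `SessionRecord` fields, the `summarize` helper, and the sample sessions are hypothetical and do not reproduce the actual template.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class SessionRecord:
    """One participant's task session (hypothetical fields)."""
    participant_id: str
    used_tour: bool
    minutes_on_task: float
    completed_unassisted: bool
    errors: int
    assists: int


def summarize(records: List[SessionRecord]) -> Dict[str, float]:
    """Average the behavioral metrics across a group of sessions."""
    if not records:
        raise ValueError("at least one session record is required")
    completed = sum(1 for r in records if r.completed_unassisted)
    return {
        "mean_minutes": mean(r.minutes_on_task for r in records),
        "completion_rate_pct": 100.0 * completed / len(records),
        "mean_errors": mean(r.errors for r in records),
        "mean_assists": mean(r.assists for r in records),
    }


# Hypothetical sessions for illustration only (not study data).
sessions = [
    SessionRecord("P1", True, 12.0, True, 1, 0),
    SessionRecord("P2", True, 14.0, False, 3, 1),
]
summary = summarize(sessions)
print(summary)
```

Keeping one structured record per session makes it straightforward to split the records by `used_tour` and compare the two groups with the same function.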

The tour helped participants complete tasks more efficiently, though observed attitudes and perceptions were comparable across groups.

Behavioral Metrics

On average, participants using the Tour performed Task 2 at a higher success rate than those without.


            Time on Task    Task Completion    Errors    Assists
w/o Tour    18 min.         0%                 2         2
w/ Tour     13 min.         25%                6         0.3

Attitudinal Metrics

Although the tour supported task completion, attitudinal feedback indicated it needed improvement to better help new users become familiar with Modeler.


            SEQ    SUS    NPS    Issues
w/o Tour    3.3    F      -25    9
w/ Tour     4.6    F      -88    10

Synthesizing usability issues into key themes revealed the critical gaps in the Guided Tour experience.

Using an issue severity matrix, I identified that the workflow violated users’ mental models, resulting in “map shock.” I also found that the Guided Tour’s UI was inconsistent and offered only a surface-level introduction to Modeler Flows. Finally, new users perceived Modeler Flows as a beginner-only tool, even though it was designed for users of all skill levels.

5 Issue Themes

I presented the findings and 7 recommendations to my cross-functional partners.

First-time use improvements

  • Consider looking at the workflow of other IBM products

  • Work with content designers to craft more user-friendly descriptions for Type node

Update Guided Tour UI

  • Update Guided Tour UI to consistently use Next or X buttons on all tour pop-ups

  • Update Guided Tour panel UI to number steps in tour

Update tour options and complexity

  • Multiple tours based on experience level

  • After tour: Provide documentation & extra resources

  • Consider end-to-end integrated tour across Watson Studio

I ran a follow-up unmoderated A/B usability study to address the problems users had understanding the Type node.

The study’s insights led to the implementation of two new features that improved new users’ comprehension of the Type node, an essential part of Modeler’s workflow.

Behavioral Note-taking Template

The most tedious part of this research was note-taking and mapping every user’s exact path to complete the task. I wish I had created the note-taking template sooner in the process, but it was all part of my journey to completing my first official user research project.

Qualitative Research

I spent a lot of time with the qualitative data, even though I decided not to share it with my cross-functional partners. It was challenging to learn RStudio, but I had a lot of fun.

More Projects

Design Challenge Iteration: Updating a Health Tracking App

Optimizing Data Ingestion Workflows: Jobs-To-Be-Done Study for Data Engineers

Contact

Have questions? Get in touch!

© 2025 Designed & Created by Lesedi Khabele-Stevens using Figma & Framer
