Data Studio

Data Studio is Datasaur’s first product, focused on providing the best data labeling tool for natural language processing (NLP) projects. It started with text files and now handles various use cases, including audio transcription and OCR.

Timeline

2019 - 2025

Company

Datasaur

How I helped

  • Established the design foundation of Data Studio

  • Led the development team as Product Manager (2021 - 2023)

  • Oversaw ongoing design development and feature improvements

  • Hands-on contribution

Tools

Figma, FigJam, Notion, GitHub

5,000+

projects created

70%

faster labeling time

100%

highest retention rate

01

Overview

I am the first full-time designer at Datasaur, a role that I’ve held for over five years. During this time, I’ve had the opportunity to define and shape the company’s design direction from the ground up. I was tasked not only with building more features for Data Studio but also with establishing the design process and systems that would scale with the company.

In addition to my design responsibilities, I also took on product management duties for two years, collaborating closely with engineering and business teams to ensure the product aligned with our company’s goals and met user needs.

02

Challenges

Shinkansen-level iteration

When I joined Datasaur, the team consisted of only five engineers, one part-time designer, and the CEO. Data Studio was still in beta, with its first launch planned for March 2020. It had only two main screens: a table listing all projects and the labeling interface.

As the only designer, I had to iterate on new features rapidly while building the design culture bit by bit. In the first two years, we shipped probably more features than we could count, testing the waters to see which ones resonated most with users.

Dual role

In mid-2021, I stepped up to take on product management responsibilities to support the growing team and feature set, while continuing my role as product designer. I dedicated half of my time to the development team and the rest to the design team.

Juggling both roles was challenging, as I learned the ropes of being a product manager while managing my design team. Along the way, my path to growth felt ambiguous since I was the only one taking this dual path. I realized I had to choose which role to prioritize. In late 2023, I decided to dedicate myself fully to design and officially assumed the role of design lead.

03

Process

Research

Establish a baseline benchmark

In the first few weeks after joining, I conducted contextual observations to watch how labelers performed manual annotation in Excel. I measured the time-on-task for each annotation to quantify their workflow efficiency. This research identified the primary pain point, that labeling took too long, and established a baseline benchmark showing that Data Studio at that time could improve labeling speed by up to 70%. The benchmark later guided the design and prioritization of features aimed at streamlining the labeling process.
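As a rough illustration of how a time-on-task benchmark like this can be computed, here is a minimal Python sketch. It assumes the improvement is read as the reduction in mean time per annotation; the function name and every sample value are hypothetical, chosen only to show the arithmetic, and are not the measurements from the original study.

```python
from statistics import mean


def labeling_time_reduction(baseline_seconds, tool_seconds):
    """Return the percentage reduction in mean time-on-task.

    baseline_seconds: per-annotation times for the manual workflow
    tool_seconds: per-annotation times for the same task in the tool
    """
    baseline_avg = mean(baseline_seconds)
    tool_avg = mean(tool_seconds)
    return (baseline_avg - tool_avg) / baseline_avg * 100


# Hypothetical samples in seconds per annotation (not real study data).
excel_times = [50, 60, 70]
data_studio_times = [15, 18, 21]

reduction = labeling_time_reduction(excel_times, data_studio_times)
print(f"Mean labeling time reduced by {reduction:.0f}%")
```

Comparing means keeps the sketch simple; with real observations, the median or a per-labeler comparison may be more robust to outliers.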

Know your competitors

The data labeling space was largely unexplored from a design perspective—there were no established patterns or best practices. While designing creatively, I also monitored competitor offerings and maintained an internal comparison table to track feature gaps and opportunities for Data Studio. Since Data Studio is closely tied to machine learning development and technical jargon, I also reviewed research papers to better understand certain features (looking at you, Inter-Annotator Agreement).

Development

Design culture

As Datasaur’s first designer, I shaped the design culture while closely coordinating with engineers, leveraging my CS background to align on expectations. We established design review sessions, evolving from quick live reviews to dedicated slots every two days.

The design process quickly evolved to keep pace with growing features and team size, resulting in key outputs such as the Design Library and Content Guidelines.

Product development cycle

One of our primary goals is to create the most user-friendly data labeling tool in the industry. From day one, we involved all roles in building the product and shaping the product development cycle into what it is today. I was among the first non-engineers to review pull requests, checking both functionality and pixel-perfect implementation. I handled this task alone as a design lead for years before expanding the practice to other designers.

04

Current state

Over the past six years, Data Studio has evolved into a robust platform that helps teams efficiently manage a wide range of text-based labeling tasks, speeding up dataset creation while improving accuracy. Supported use cases include:

  • Named entity recognition

  • Part-of-speech tagging

  • Sentiment analysis

  • Classification

  • Audio transcription

  • Optical character recognition (OCR)

  • Conversational

Beyond labeling, Data Studio provides powerful workflow management features that help enterprise teams maintain quality and scale operations. These features include:

  • Peer-review consensus and dynamic review workflows for reliable annotations

  • Analytics, inter-annotator agreement, and project reports to track team productivity

  • Custom script support for flexible file processing and dataset customization

The structure

The IA diagram was created by mapping Data Studio’s core workflows and features into functional groups, then arranging them hierarchically. This approach illustrates the relationships between labeling, team management, analytics, and customization, helping stakeholders and team members quickly understand the platform’s structure.

05

Results

Impacts and outcome

Data Studio has contributed the most revenue to our company and continues to serve as its backbone to this day. We have partnered with data labeling services such as CloudFactory and iMerit, and our customers include major organizations like Google, the FBI, and Spotify.

Business metrics

Since we focus on B2B, we track customer retention every six months, with the highest rate being 100% between January and June 2025.
Our average resolution time is 4 days, and customers have noted our responsiveness and service.

06

Reflection

Ensure the implementation matches the design

Back in our early days, we had to deliver features rapidly without proper QA. This came back to bite us a year later, when the production version started to differ from the design and tech debt began to grow. As is common with tech debt, small issues were often overlooked in favor of higher-priority items, and these minor mishaps continued to pile up.

When we expanded the team, the engineering, product, and design teams agreed to involve designers as PR reviewers. This extra step ensures that UI quirks are identified and addressed before reaching the QA phase. Involving designers in PRs also encourages engineers to pay more attention to detail, so the extra step does not slow development with back-and-forth reviews.

Starting in 2024, the design team began tracking this as a design consistency metric. To date, our development team has achieved approximately 90% design consistency, with a high of 95% between January and June 2025.

Always update the master design files

A growing team brings more requests from customers. Because we are expected to deliver results rapidly, our Figma files became bloated with unused iterations and already implemented handoffs. The files took longer to load, and because the master files were outdated, designers found it difficult to confirm which screen to use for the next iteration.

Because our design team is centralized, we assign each designer as the person in charge (PIC) of a set of features. Each feature has its own playground and master file, and the designer responsible for that feature keeps these files up to date. This resolves the issue of outdated Figma files and makes it easier for other designers, who are not the PIC, to work on issues outside their assigned scope.

All rights reserved © 2025.