
Data Studio
Datasaur’s first product, focused on providing the best data labeling tool for natural language processing (NLP) projects. It started with text files; today, Data Studio handles various use cases, including audio transcription and OCR.
Timeline
2019 - 2025
Company
Datasaur
How I helped
Established the design foundation of Data Studio
Led the development team as Product Manager (2021 - 2023)
Oversaw ongoing design development and feature improvements
Hands-on contribution
Tools
Figma, FigJam, Notion, GitHub
5,000+
projects created
70%
faster labeling time
100%
highest retention rate
01
Overview
I am the first full-time designer at Datasaur, a role that I’ve held for over five years. During this time, I’ve had the opportunity to define and shape the company’s design direction from the ground up. I was tasked not only with building more features for Data Studio but also establishing the design process and systems that would scale with the company.
In addition to my design responsibilities, I also took on product management duties for two years, collaborating closely with engineering and business teams to ensure the product aligned with our company’s goals and met user needs.
02
Challenges
Shinkansen-level iteration
When I joined Datasaur, the team consisted of only five engineers, one part-time designer, and the CEO. Data Studio was still in beta, with its first launch planned for March 2020. It had only two main screens: a table listing all projects and the labeling interface.

As the only designer, I had to iterate on new features rapidly while building the design culture bit by bit. In the first two years, we shipped probably more features than we could count, testing the waters to see which ones resonated most with users.
Dual role
In mid-2021, I stepped up to take on product management responsibilities to support the growing team and feature set, while continuing my role as product designer. I dedicated half of my time to the development team and the rest to the design team.
Juggling both roles was challenging, as I learned the ropes of being a product manager while managing my design team. Along the way, my path to growth felt ambiguous since I was the only one taking this dual path. I realized I had to choose which role to prioritize. In late 2023, I decided to dedicate myself fully to design and officially assumed the role of design lead.
03
Process
Research
Establish a baseline benchmark
In the first few weeks after joining, I conducted contextual observations to watch how labelers performed manual annotation in Excel. I measured the time-on-task for each annotation to quantify their workflow efficiency. This research identified the primary pain point (labeling took too long) and established a baseline benchmark, which showed that Data Studio, even at that stage, could improve labeling speed by up to 70%. The benchmark later guided the design and prioritization of features aimed at streamlining the labeling process.
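To illustrate how a time-on-task benchmark like this can be turned into a single percentage, here is a minimal sketch. It assumes the figure is read as a relative reduction in mean time per annotation; the function name and the timings are hypothetical and are not the data from the original study.

```python
from statistics import mean


def time_reduction(baseline_secs, tool_secs):
    """Relative reduction in mean time-on-task, as a fraction of the baseline."""
    baseline = mean(baseline_secs)
    tool = mean(tool_secs)
    return (baseline - tool) / baseline


# Hypothetical timings in seconds per annotation (illustrative only).
excel_times = [50, 60, 70]   # manual annotation in a spreadsheet
tool_times = [15, 18, 21]    # same task in a labeling tool

print(f"{time_reduction(excel_times, tool_times):.0%}")  # prints 70%
```

Comparing means keeps the calculation simple; with real observations, medians may be a safer choice when a few annotations take unusually long.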
Know your competitors
The data labeling space was largely unexplored from a design perspective—there were no established patterns or best practices. While designing creatively, I also monitored competitor offerings and maintained an internal comparison table to track feature gaps and opportunities for Data Studio. Since Data Studio is closely tied to machine learning development and technical jargon, I also reviewed research papers to better understand certain features (looking at you, Inter-Annotator Agreement).
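For readers unfamiliar with inter-annotator agreement, the sketch below computes Cohen's kappa, one common agreement measure for two annotators. It is shown only as an illustration of the concept; it is not a description of how Data Studio calculates agreement, and the labels are made up.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Share of items on which both annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical entity labels from two annotators (illustrative only).
annotator_1 = ["PER", "ORG", "PER", "LOC"]
annotator_2 = ["PER", "ORG", "LOC", "LOC"]
print(round(cohen_kappa(annotator_1, annotator_2), 3))  # prints 0.636
```

A kappa of 1 means perfect agreement, while 0 means the annotators agree no more often than chance would predict.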

Development
Design culture
As Datasaur’s first designer, I shaped the design culture while closely coordinating with engineers, leveraging my CS background to align on expectations. We established design review sessions, evolving from quick live reviews to dedicated slots every two days.
The design process quickly evolved to keep pace with growing features and team size, resulting in key outputs such as the Design Library and Content Guidelines.
Product development cycle
One of our primary goals is to create the most user-friendly data labeling tool in the industry. From day one, we involved all roles in building the product and shaping the product development cycle into what it is today. I was among the first non-engineers to review pull requests. As design lead, I handled this task alone for years before extending the practice to other designers, ensuring both functionality and pixel-perfect implementation.
04
Current state
Over the past six years, Data Studio has evolved into a robust platform that helps teams efficiently manage a wide range of text-based labeling tasks, speeding up dataset creation while improving accuracy. Supported use cases include:
Named entity recognition
Part-of-speech tagging
Sentiment analysis
Classification
Audio transcription
Optical character recognition (OCR)
Conversational

Beyond labeling, Data Studio provides powerful workflow management features that help enterprise teams maintain quality and scale operations. These features include:
Peer-review consensus and dynamic review workflows for reliable annotations
Analytics, inter-annotator agreement, and project reports to track team productivity
Custom script support for flexible file processing and dataset customization

The structure
The information architecture (IA) diagram was created by mapping Data Studio’s core workflows and features into functional groups, then arranging them hierarchically. This approach illustrates the relationships between labeling, team management, analytics, and customization, helping stakeholders and team members quickly understand the platform’s structure.

05
Results
Impacts and outcome
Data Studio has contributed the most revenue to our company and continues to serve as its backbone to this day. We have partnered with data labeling services such as CloudFactory and iMerit, and our customers include major organizations like Google, the FBI, and Spotify.
Business metrics
Since we focus on B2B, we track the retention rate of our customers each semester, with the highest being 100% between January and June 2025.
Our average resolution time is 4 days, and customers have noted our responsive support and service.
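As a rough illustration of the half-yearly retention rate mentioned above, the sketch below uses a common definition: the share of customers active at the start of a period who are still active at its end. The exact definition used internally is an assumption here, and the customer names are hypothetical.

```python
def retention_rate(start_customers, end_customers):
    """Share of period-start customers who are still customers at period end."""
    start = set(start_customers)
    if not start:
        raise ValueError("need at least one customer at the start of the period")
    retained = start & set(end_customers)
    return len(retained) / len(start)


# Hypothetical customer lists for one half-year period (illustrative only).
january = ["acme", "globex", "initech"]
june = ["acme", "globex", "initech", "umbrella"]  # one new customer joined

print(f"{retention_rate(january, june):.0%}")  # prints 100%
```

Customers acquired during the period do not count toward retention under this definition, which is why the new customer in the example leaves the rate unchanged.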
06
Reflection
Ensure the implementation matches the design
Back in our early days, we had to deliver features rapidly without proper QA, which came back to bite us a year later when the production version started to differ from the design, resulting in growing tech debt. As with common tech debt, small issues were often overlooked in favor of higher-priority items, and these minor mishaps continued to pile up.
When we expanded the team, the engineering, product, and design teams agreed to involve designers as PR reviewers. This extra step ensures that UI quirks are identified and addressed before reaching the QA phase. Involving designers in PRs also encourages engineers to pay closer attention to detail, which keeps this extra step from slowing development with back-and-forth reviews.
Starting in 2024, the design team began tracking this as a design consistency metric. To date, our development team has achieved approximately 90% design consistency, with a high of 95% between January and June 2025.
Always update the master design files
Growing the team meant handling more requests from customers. Because we were expected to deliver results rapidly, our Figma files became bloated with unused iterations and already-implemented handoffs. The files took longer to load, and because the master files were outdated, designers struggled to confirm which screen to use for the next iteration.
Because our design team is centralized, we assign each designer as the person in charge (PIC) of a set of features. Each feature has its own playground and master file, and the designer responsible for that feature keeps these files up to date. This resolves the issue of outdated Figma files and makes it easier for other designers, who are not the PIC, to work on issues outside their assigned scope.