Disclaimer: Excluding the TL;DR section, this article was written without any assistance from LLMs such as ChatGPT. While ChatGPT would probably write a better article than I can, my motivation is to express some candid thoughts and share my experiences. Writing this took way longer than it should have, but it’s always a nice feeling to pen down my reflections and have a finished product to show.

TL;DR (with the help of ChatGPT 🤖)

Responding to a Medium article that I chanced upon, I discuss the evolving role of dashboards in data teams. While dashboards are not dying, the focus has shifted towards delivering information in the best way possible. Drawing from my personal experiences interning at fintech and e-commerce companies, I share insights on the varying needs for data presentation and the challenges faced in managing dashboards. I also explore the potential impact of AI on data teams and suggest that embracing AI can enhance workflows and create new opportunities for data practitioners. I conclude by expressing optimism about the future of technology roles and the value we can find in working with AI.

Summary of the original Medium article by Taylor Brownlow

Taylor Brownlow, a data advocate, argues that dashboards are dying (not dead), not because of anything wrong with dashboards, but rather with everything around them, such as relationships, communication, processes, and people. With a rapidly growing offering of data-provisioning alternatives, data teams have been moving away from “how to make this dashboard great” towards “what’s the best way to deliver this information?” Brownlow shares three persistent problems data teams need to solve to make that leap forward.

First, data teams have been quick to integrate modern data pipelines and environments into the technical stack, but they have yet to figure out how to leverage that innovation to deliver greater value to stakeholders. Next, data teams must learn to build trust (internally and externally) not just through reliability and accuracy but also through communication and creating a safe space for mistakes. Finally, data teams should consider adopting tools that prioritize collaboration, data transparency, and experimental flexibility.

Overall, Brownlow is optimistic about the nascent changes happening in the data industry.

Drawing Parallels with Personal Experience

I’ve had the privilege of interning at two exceptional data teams operating in different spaces, one in fintech (Funding Societies) and the other in e-commerce (Shopee). Each team was solving a different set of problems using different methods, but both had a common goal of delivering reliable data and quality insights to drive decision-making.

My First Foray into Data

I did my very first internship at Funding Societies (FS), a regional fintech startup, in the summer of 2021. It was an amazing experience learning how the data team (comprising engineers, scientists, and analysts) functioned as one cohesive unit. Unlike regular internships where you are usually assigned an intern project, I wore many hats at FS. I dug through thousands of lines of SQL to resolve production bugs, collaborated with business teams based in Singapore, Jakarta, and Kuala Lumpur on auditing and product releases, and built/refactored dozens of dashboards, among other tasks.

I soon realized that not everyone needs a dashboard. Some stakeholders simply needed an Excel dump to run their analysis, while others wanted actionable insights and were flexible with the presentation style. In the long run, the data team wanted stakeholders to self-serve data for ad-hoc requests and to gradually hand data ownership over to external team leads. For that, we looked to Atlan, which provided a data catalog covering the hundreds of table columns in our Snowflake data warehouse, as well as an interface to run simple queries. I worked closely with my manager to populate our catalog, onboard new users, and gather feedback.

I left FS before a decision was made on whether to fully integrate Atlan. We had doubts about the reliability and usability of the service (Atlan was only two years old then), so I’m not sure if we ever did figure something out. Looking back, tools like Atlan would have helped push our agenda toward collaboration, data transparency, and experimental flexibility (as advocated by Brownlow). Even though the data team at FS was lean and young, we embodied a forward-looking approach to managing data.

New Team, Similar Role

I joined Shopee in the summer of 2022, shortly after it had overhauled its Business Intelligence division into separate data teams assigned to various core groups. For context, Shopee is a mature listed company with a ~$19B valuation as of writing. In the Search and Recommendation (SnR) team, analysts worked closely with product managers who were either driving new features or exploring user segmentation. Because each feature release required extensive [A/B testing](https://www.optimizely.com/optimization-glossary/ab-testing/) and validation, most of our work consisted of analyzing and drawing conclusions from user behavior. Surprisingly, dashboards were not the main medium of data presentation. Instead, we mostly relied on Google Sheets for the job, which allowed us to iterate quickly and match the pace of the product development lifecycle.
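To make the A/B analysis step concrete, here is a minimal sketch of the kind of significance check an analyst might run on a feature release. The numbers and the helper function are purely illustrative, not Shopee’s actual methodology or tooling:

```python
from statistics import NormalDist

def two_proportion_z(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Two-sided two-proportion z-test comparing conversion rates.

    Returns (z, p_value) for control (a) vs. treatment (b).
    """
    # Pooled conversion rate under the null hypothesis of no difference
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    # Standard error of the difference in proportions
    se = (p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b)) ** 0.5
    z = (conv_b / n_b - conv_a / n_a) / se
    # Two-sided p-value from the standard normal distribution
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical numbers: control saw 100 conversions out of 1,000 users,
# treatment saw 125 conversions out of 1,000 users.
z, p = two_proportion_z(100, 1000, 125, 1000)
```

With these made-up numbers, the lift looks promising but sits just above the conventional 0.05 significance threshold — exactly the kind of borderline result that kept analysts iterating in Sheets rather than waiting on a dashboard refresh.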

Dashboards Saved The Day

Two things stood out to me during my time at Shopee.

First, analysts sometimes faced long query times (between 30 minutes and an hour). Granted, we were querying against terabytes (sometimes petabytes) of data, but our data infrastructure should have been robust enough to handle it. We had implemented comprehensive [data marts](https://aws.amazon.com/what-is/data-mart/) and adopted tools like Presto and Spark to power big data analytics. So, why were processing times still unbearably long? The root cause, I later learned, was that our ML engineers were also running their experiments. If you know anything about training models, you know that it eats up a ton of compute resources. And those resources were shared among the entire SnR team.

Okay, so why not just scale horizontally by buying more machines? Recall that Q3 2022 was a hot mess for tech companies, and Shopee wasn’t an exception. The long wait times also meant that if I ran a query in the web UI, I couldn’t lock my computer or put it to sleep; otherwise, I risked losing all the progress the query had made. The workaround? Deploy queries as jobs on the cloud and pipe the output into Google Sheets.

For less important (but still large) queries, it didn’t make sense to go through all the trouble of deployment. There were “prime” periods throughout the day to run queries, and I knew this… you guessed it: through a dashboard. It was embedded in our internal server-monitoring tool. When the load was low, queries that would have run in 10 minutes finished in just a few seconds. Those real-time metrics were also crucial for validating performance improvements when I experimented with optimizing Spark parameters for batch jobs. Feel free to read my article about tuning Spark if you are interested.
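For a flavor of what that Spark tuning involved, the knobs mostly live at submit time. The values below are placeholders for illustration, not the settings we actually shipped:

```shell
# Illustrative spark-submit with common tuning knobs;
# numbers are placeholders, not production settings.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 20 \
  --executor-cores 4 \
  --executor-memory 8g \
  --conf spark.sql.shuffle.partitions=400 \
  --conf spark.dynamicAllocation.enabled=false \
  batch_job.py
```

Tweaking executor counts, memory, and shuffle partitions against the real-time load metrics is how you verify that a change actually moved the needle rather than just riding a quiet period on the cluster.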