I recently stumbled across Symbolic Connection, a data podcast run by experienced data practitioners Thu Ya Kyaw and Koo Ping Shung. This podcast caught my attention as the guests are mainly from Singapore, one of the burgeoning tech hubs of Southeast Asia (and also where I am based).
As a data analyst in a tech company, I am always curious about the work of analysts in other tech firms. So I listened to the podcast featuring Cliff Chew, a senior data analyst at Grab.
In this podcast, Cliff shared about his role as a data analyst at Grab, the lessons he has learnt as an analyst and his advice for aspiring data practitioners.
I found the podcast rather enlightening, so I summarized it in this blog post to share with the data community at large.
About Cliff Chew
Cliff is an NUS Economics degree holder who has worked as a data analyst in a few tech companies over the past few years. He has also worked as a civil servant and a researcher, and has worked with economists, sociologists, geographers, statisticians, architects, digital marketers, designers, fraud analysts, data engineers, data scientists and fellow data analysts.
Most recently, he was a Digital Marketing Analyst at Carousell before moving on to Grab in 2019 as a senior data analyst in the trust, identity and safety team, which detects fraud.
That’s enough talk. Let’s get to the interview!
What are the tools that you use at work?
Some of the tools that he uses as an analyst are:
- Google Sheets for collaboration with stakeholders
- Tableau for data visualization
He also emphasized the importance of SQL as a tool for data extraction at scale. In fact, in his opinion, it is more important to learn SQL than Python.
He then compared R and Python. While R is friendlier to non-technical folks than Python, it is harder to productionize and integrate with existing systems.
Overall, he pointed out that it doesn’t matter exactly which tool you use:
“I wouldn’t say you have to learn python or R, because tools always change. But being able to execute data processing procedures in a scalable way is necessary in the current day and age of data economy.”
What is the most important thing you have learnt in your career?
Thinking about problems in terms of probability.
When Cliff was looking to jump into the data science field from economics, he evaluated his chances in terms of probability by looking at market trends and his existing skill set.
It was evident that anyone interested in becoming a data analyst would most probably need to know how to code. So he picked up programming.
How did you learn programming?
He first built up his programming basics in Python, starting from fundamental concepts like numbers and string manipulation.
These building blocks then helped him build projects to consolidate his learning. Combining his interest in basketball with his knowledge of web scraping, he cleaned data from the official NBA website and ran it through models like logistic regression, SVM and random forest. Eventually, he was able to build a pipeline to scrape data and make daily predictions. The end-to-end flow took three years.
Speaking about a lesson he learnt from this project, he lamented how dirty the data was, even from an official source (the NBA), and advised:
Don’t trust your data source and always check your data.
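To make this advice concrete, here is a minimal sketch of the kind of sanity checks one might run on freshly scraped records before modelling. It is not Cliff's actual pipeline: the `check_rows` helper, the field names and the sample game records are all hypothetical and only illustrate the idea of checking for missing values and duplicate keys.

```python
from collections import Counter


def check_rows(rows, required_fields, key_field):
    """Return basic data-quality findings for a list of dict records.

    Hypothetical helper for illustration only.
    """
    # Indices of rows where any required field is absent or empty.
    rows_with_missing = [
        i
        for i, row in enumerate(rows)
        if any(row.get(field) in (None, "") for field in required_fields)
    ]
    # Keys that appear more than once (possible duplicate scrapes).
    key_counts = Counter(row.get(key_field) for row in rows)
    duplicate_keys = sorted(
        key for key, count in key_counts.items() if key is not None and count > 1
    )
    return {"rows_with_missing": rows_with_missing, "duplicate_keys": duplicate_keys}


# Made-up records, not real NBA data.
games = [
    {"game_id": "g1", "home": "A", "away": "B", "home_pts": 101},
    {"game_id": "g2", "home": "C", "away": "", "home_pts": 95},
    {"game_id": "g1", "home": "A", "away": "B", "home_pts": 101},
]

report = check_rows(games, ["home", "away", "home_pts"], "game_id")
print(report)
```

Here the second record has an empty `away` field and `g1` appears twice, so both would be flagged for a closer look before the data goes anywhere near a model.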
What are the sexy and non-sexy parts of your current role as an analyst?
Whether a role is sexy or otherwise depends on a person’s personality, Cliff quipped. Regardless, he shared some highlights from his roles.
Interacting with stakeholders
As part of the digital marketing team at Carousell, he interacted with multiple stakeholders, technical or otherwise. As an analyst, he was the connector between technical and non-technical people, creating the pipeline from the engineers to the end stakeholders.
He also spent time educating non-technical stakeholders on the definitions of technical terms like “real-time” or “A/B testing” and conveyed the limitations of existing data. The key to such communication lies in conveying technical information in a non-technical manner, or as Cliff put it:
“You have to explain to them in a way that makes them understand instead of simply throw them technical terms.”
Dealing with Uncertainty
Uncertainty in Cliff’s role comes in different forms. One form is uncertainty over the outcome of statistical tests.
“I get asked: why are all your AB tests failing? [The answer is] because they are tests; we are not proposing the truth, but simply hypotheses.”
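To illustrate why a test can “fail” without anything being wrong, here is a minimal sketch of one common way to evaluate an A/B test on conversion rates: a two-sided two-proportion z-test. The podcast does not say which tests Cliff's team uses, so the method, the `two_proportion_z_test` function and the numbers below are assumptions for illustration only.

```python
from math import erf, sqrt


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates.

    conv_a / conv_b are conversion counts, n_a / n_b are sample sizes.
    Returns (z statistic, p-value).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / standard_error
    # Standard normal CDF expressed through the error function.
    normal_cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    p_value = 2 * (1 - normal_cdf)
    return z, p_value


# Made-up numbers: variant B converts at 11% versus 10% for variant A.
z, p_value = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=110, n_b=1000)
print(f"z = {z:.2f}, p-value = {p_value:.2f}")
```

With these made-up numbers the p-value comes out well above the usual 0.05 threshold, so the apparent one-point lift could easily be noise. That is a perfectly valid result for a hypothesis test, which echoes Cliff's point.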
Another form is uncertainty over the cleanliness of the data. Using multiple sources of data, say Facebook, Google and internal data, gives rise to discrepancies. Thus, it is important to know how to explain the limitations of the data to stakeholders when such discrepancies arise.
In an age where companies want to move fast and break things, it is understandable that documentation is swept under the rug. However, documentation is important for at least two reasons.
Firstly, documentation reminds you of what you have done. Cliff organizes his work in folders and pulls them up when he needs a reminder.
Secondly, it allows others to build on your work quickly.
What qualities would you look for if you were to hire someone for your role?
The ideal candidate would be:
- Humble and coachable, since there is a lot to learn and unlearn.
- Intellectually curious, in order to understand concepts from statistics and programming.
- Good with people, technical or otherwise.
- Able to work with uncertainty and incomplete information.
Do you have any advice for those looking to switch careers into data?
Learn SQL (Structured Query Language)
Cliff mentioned that SQL is the most important thing to learn, as it allows you to extract data in an optimized manner.
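For readers who have never seen SQL, here is a small sketch of what extraction looks like, using Python's built-in `sqlite3` module so it runs without any setup. The `rides` table and its rows are made up for illustration and have nothing to do with Grab's actual data; the point is simply that the database does the grouping and averaging, so only the summary comes back.

```python
import sqlite3

# In-memory database with a hypothetical table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rides (city TEXT, fare REAL)")
conn.executemany(
    "INSERT INTO rides (city, fare) VALUES (?, ?)",
    [("Singapore", 12.0), ("Singapore", 8.0), ("Jakarta", 5.0)],
)

# Aggregate inside the database instead of pulling every row into Python.
rows = conn.execute(
    "SELECT city, COUNT(*) AS n_rides, AVG(fare) AS avg_fare "
    "FROM rides GROUP BY city ORDER BY city"
).fetchall()
conn.close()

for city, n_rides, avg_fare in rows:
    print(city, n_rides, avg_fare)
```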
Learn Python and/or R
In deciding whether to learn Python or R, Cliff encouraged listeners to research which tools are preferred in their respective industries.
For instance, the finance industry might require analysts to have some proficiency in VBA, while other industries might prefer analysts who know SPSS.
Know your comparative advantage
It is important to know what your advantages over other candidates are and how to position yourself. For instance, someone from finance has a higher probability of breaking into data in the fintech space than others because of their prior domain knowledge.
When Cliff was looking for a role, he positioned himself as having a mix of domain and technical knowledge:
“I know more domain knowledge than a computer scientist and more statistics than a business person.”
Value opportunities over big names
When looking for a role, one should peruse the job description and seek out growth opportunities rather than chase big names. This is especially true for fresh graduates.
One of my biggest takeaways from the podcast is the importance of effective communication as a skill for a data analyst. Data analysts are the middlemen between technical and non-technical stakeholders, and are essential in preventing a communication breakdown!
If you’d like to keep in touch with Cliff, please feel free to connect with him on his LinkedIn.
Want to listen to the podcast that I just summarized? Listen to it here.
If you enjoyed this article, do connect with me on LinkedIn! I write content on learning data science, data science tips and tricks, and technology in general.
Special thanks to Cliff Chew, Koo Ping Shung and Thu Ya Kyaw for their permission to publish this article.