πŸ‘‹ I'm Travis Tang. I write about data science here.

Latest

Use ARRAY_AGG to flatten columns to lists.

You need to convert a long table of values into a list? This is the most convenient function. πŸ“Œ Syntax: πš‚π™΄π™»π™΄π™²πšƒ π™°πšπšπ™°πšˆ_𝙰𝙢𝙢(πšŒπš˜πš•πšžπš–πš—) π™΅πšπ™Ύπ™Ό πšπšŠπš‹πš•πšŽ There you go, the column is flattened into a list separated by comma. ❌ To do so, I used to copy the entire column into a spreadsheet tool, transpose it, and use

Pivot Table (From Long to Wide)

Every data scientist, scientist, and engineer should know how to create a pivot table. π™²π™°πš‚π™΄ πš†π™·π™΄π™½ is the best way to do so in SQL. πŸš€ Here are THREE ways to pivot a table for monthly sales. 1️⃣ We find the πš‚πš„π™Ό of all salesperson each month. Here, use πš‚πš„π™Ό(π™²π™°πš‚π™΄ πš†π™·π™΄π™½). 2️⃣ We find the average

Find duplicates with RANK

Let's clean out duplicates! Here's how. 1️⃣ Check if duplicates exists. Use π™Άπšπ™Ύπš„π™Ώ π™±πšˆ + π™²π™Ύπš„π™½πšƒ(*) to find duplicates. See example 1 in image below. 2️⃣ Identify exact rows that are duplicated. Use πšπ™°π™½π™Ί π™Ύπš…π™΄πš(π™Ώπ™°πšπšƒπ™Έπšƒπ™Έπ™Ύπ™½ π™±πšˆ πšŒπš˜πš•πšœ). See example 2. What does πšπ™°π™½π™Ί π™Ύπš…π™΄πš(π™Ώπ™°πšπšƒπ™Έπšƒπ™Έπ™Ύπ™½ π™±πšˆ πšŒπš˜πš•πšœ) do? πŸ‘‰ divides the table into multiple partitions, one for each unique value

Find Year-on-Year Growth Rate

This is a common interview question that you MUST nail! The formula is (value - last_year_value) / last_year_value. You can calculate last_year_value as πŸ“Œ 𝙻𝙰𝙢(πšŸπšŠπš•πšžπšŽ, 𝟷) πš˜πšŸπšŽπš› (πš˜πš›πšπšŽπš› πš‹πš’ πšŸπšŠπš•πšžπšŽ) 𝙻𝙰𝙢(πšŸπšŠπš•πšžπšŽ,𝟷) gets the value from the previous row. 𝙻𝙰𝙢(πšŸπšŠπš•πšžπšŽ,𝟸) gets the value from the row before the previous row. STEPS: 1️⃣ Use 𝙻𝙰𝙢 to get

Use EXISTS to check if a row is present

Need to find if records from table A can also be found in table B? EXISTS is here to help. πŸ“Œ Here's the syntax. πš‚π™΄π™»π™΄π™²πšƒ * π™΅πšπ™Ύπ™Ό πšπšŠπš‹πš•πšŽ_𝚊 πš†π™·π™΄πšπ™΄ π™΄πš‡π™Έπš‚πšƒπš‚ ( πš‚π™΄π™»π™΄π™²πšƒ * π™΅πšπ™Ύπ™Ό πšπšŠπš‹πš•πšŽ_πš‹ πš†π™·π™΄πšπ™΄ πšπšŠπš‹πš•πšŽ_𝚊.πš”πšŽπš’ = πšπšŠπš‹πš•πšŽ_πš‹.πš”πšŽπš’ ) πš’πš—πš—πšŽπš›_πššπšžπšŽπš›πš’ Let's walk through this. 1️⃣ the inner query retrieves all the records from table_b that has a common key with table_a.

Use SAFE_DIVIDE to handle division by zero.

π™Žπ˜Όπ™π™€_π˜Ώπ™„π™‘π™„π˜Ώπ™€ divides two numbers while handling the case where the denominator is zero. πŸ“Œ Here's the syntax. πš‚π™΄π™»π™΄π™²πšƒ πš‚π™°π™΅π™΄_π™³π™Έπš…π™Έπ™³π™΄(πš—πšžπš–πšŽπš›πšŠπšπš˜πš›, πšπšŽπš—πš˜πš–πš’πš—πšŠπšπš˜πš›) π™΅πšπ™Ύπ™Ό πš—πšžπš–πš‹πšŽπš›πšœ If the denominator is zero for any row, it returns NULL instead of causing your query to fail. π™Žπ˜Όπ™π™€_π˜Ώπ™„π™‘π™„π˜Ώπ™€ is not part of the standard SQL language, but it is often included as a