State of data jobs, build a chatbot using SQL, blockchain analytics, and a free data engineering book
Welcome to the second issue of 5-Bullet Data. The goal of this newsletter is to educate, entertain, or inform data professionals.
Some updates from me. Despite the optimism of the stock market and the recent run-ups in stocks of your favorite SaaS tools, cloud cost optimization is still alive.
I was recently involved in a project to reduce the cost of reporting and analytics infrastructure. The goal was to save at least 50%. We had multiple reporting applications and databases, and we consolidated them. The savings are upwards of $500k, and this was at a small company. Imagine the savings opportunities in large corporations: certainly in the tens of millions.
State of Data Jobs
I have been maintaining a database of different job opportunities in various data tools and technologies. If you are looking for data jobs, LinkedIn is your best bet as it continues to gain market share from Indeed. As for job postings, there are plenty of jobs that require skills in SQL and Python. However, the number of job postings for other tools and databases has been declining steadily. January and September see a significant bump in job postings. If you are looking for a new job, be prepared to apply in January 2024!
Build a simple AI chatbot on your dataset using Postgres and SQL
Imagine building a chatbot using your proprietary data. You want to build something quickly. In this case, vector embeddings can help. A vector embedding represents the meaning of a piece of text as a list of numbers (a vector), so that texts with similar meaning end up with similar vectors. Using the OpenAI embeddings API, you can pass it a document, and out come the vector embeddings. @Denis Magda has created a very easy-to-follow tutorial to create a chatbot. If you are a data professional, this is very approachable, as it uses Postgres to store the vector embeddings and then queries them for similarity. The tutorial is available as a YouTube video.
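To make the similarity step concrete, here is a minimal sketch in plain Python. It is not the code from the tutorial: the three-number vectors and document titles are made up for illustration (real embeddings have hundreds or thousands of dimensions), and the helper names `cosine_similarity` and `top_matches` are hypothetical. In a Postgres setup, the database would likely do this ranking for you, but the idea is the same: score every stored vector against the question's vector and keep the closest ones.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction, so treat it as unrelated.
        return 0.0
    return dot / (norm_a * norm_b)


def top_matches(query_vec, documents, k=2):
    """Rank (text, vector) pairs by similarity to query_vec, best first."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy data: illustrative vectors only, not output from a real embeddings API.
DOCUMENTS = [
    ("Refund policy", [0.9, 0.1, 0.0]),
    ("Shipping times", [0.1, 0.9, 0.1]),
    ("Return an item", [0.8, 0.2, 0.1]),
]

if __name__ == "__main__":
    question_vec = [1.0, 0.0, 0.0]  # stand-in for the embedded user question
    print(top_matches(question_vec, DOCUMENTS))
```

The chatbot would then pass the best-matching documents to the language model as context for its answer.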
Analyze Blockchain using SQL
Crypto Data Bytes (@cryptodatabytes) is a collection of tutorials and SQL queries you can use to do blockchain analytics. This is a great way to learn how popular blockchains like Bitcoin and Ethereum work.
Power BI has now overtaken Tableau as the most popular BI tool
According to @Weng, Power BI continues to take market share from Tableau and is now mentioned more often in job postings. Tableau was acquired by Salesforce in 2019. The company has had leadership changes, and in the Salesforce layoffs in early 2023, Tableau employees were significantly affected.
Microsoft, on the other hand, continues to bring features to Power BI and uses its distribution muscle to get it into the hands of as many people as possible.
Data Engineering Design Patterns - A free book by Simon Späti
Simon is building this book in public.
This book is a fresh perspective from an experienced data professional. Some of the topics covered include history and evolution, common challenges, and best practices.