📍 Delta Lake is an open-source storage layer that brings ACID (atomicity, consistency, isolation, and durability) transactions to Apache Spark and big data workloads.
📍 Using delta lake in a well-organized data lake can aid the integrity and performance of both batches and streaming data solutions.
📍 The common misconception made with Delta lake is that people think it is a data platform service. It’s not. It is simply a format that is defined when you create a DataFrame against the storage layer, and this is what brings the ACID capabilities to the data held in the DataFrames.
📍 We organize our data into layers or folders as defined as bronze, silver, and gold as follows:
➠ Bronze – Tables contain raw data ingested from various sources (JSON files, RDBMS data, IoT data, etc.).
➠ Silver – Tables will provide a more refined view of our data.
➠ Gold – Tables provide business-level aggregates often used for reporting and dashboarding.
More comprehensive knowledge on this topic in this blog: k21academy.com/dp20317
As a bonus to read this message, register for FREE LIVE CLASS at k21academy.com/dp20302 to see why everyone working on Data (Data Architects, Data Analysts, Data Admin, Database Administrator or even Data Scientists) should go for this certification.
FREE Telegram group >> https://t.me/k21microsoftazure
Oracle ACE, Author, Speaker and Founder of K21 Technologies & K21 Academy : Specialising in Design, Implement, and Trainings.