In fast-changing environments, retaining historical accuracy is crucial but often overlooked. How do you ensure reliable, consistent reporting in a dynamic data landscape? This session is designed for data professionals who want to maintain historical accuracy and improve the efficiency of data lake management. When you try to reproduce a report or analysis exactly as it ran at some point in the past, a typical database or data lake will have changed in the meantime, and your old queries and analyses break.

Slowly Changing Dimensions give your data lake a way to preserve the evolving nature of your schema and keep a record of historical truths. By adding three columns that track when each record was valid, and by never deleting a record, you can replicate your analysis as of any moment in the past. With FME, we build up this history, and we have a special trick that uses hashing to efficiently detect which records and schemas have changed! This presentation will teach you about the medallion data lake architecture combined with Slowly Changing Dimensions, and how widely available hashing techniques in FME can speed up the insertion and updating of records.
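To make the idea concrete, here is a minimal sketch of Slowly Changing Dimensions (Type 2) with hash-based change detection, written in Python/pandas rather than as an FME workspace. It is an illustration under stated assumptions, not the presenters' actual implementation: the column names (valid_from, valid_to, is_current, row_hash) and the scd2_merge helper are hypothetical stand-ins for the three-column approach described above.

```python
import hashlib
from datetime import datetime, timezone

import pandas as pd


def row_hash(row: pd.Series, business_cols: list[str]) -> str:
    """Hash a row's business attributes so changes can be detected cheaply."""
    payload = "|".join(str(row[c]) for c in business_cols)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def scd2_merge(dim: pd.DataFrame, incoming: pd.DataFrame,
               key: str, business_cols: list[str]) -> pd.DataFrame:
    """Apply incoming records to an SCD Type 2 dimension table.

    Unchanged rows (same hash) are skipped entirely; changed or brand-new
    rows get a fresh version, and superseded versions are closed out with
    a timestamp -- never deleted.
    """
    now = datetime.now(timezone.utc)
    dim = dim.copy()
    incoming = incoming.copy()
    incoming["row_hash"] = incoming.apply(row_hash, axis=1,
                                          business_cols=business_cols)

    # Compare incoming hashes against the currently valid versions only.
    current = dim[dim["is_current"]]
    merged = incoming.merge(current[[key, "row_hash"]], on=key,
                            how="left", suffixes=("", "_old"))

    # Rows whose hash differs (or whose key is new) are the only work to do.
    changed = (merged[merged["row_hash"] != merged["row_hash_old"]]
               .drop(columns=["row_hash_old"]))

    # Close out superseded current versions instead of deleting them.
    to_close = dim["is_current"] & dim[key].isin(changed[key])
    dim.loc[to_close, ["valid_to", "is_current"]] = [now, False]

    # Insert the new versions with open-ended validity.
    changed["valid_from"] = now
    changed["valid_to"] = pd.NaT
    changed["is_current"] = True
    return pd.concat([dim, changed], ignore_index=True)
```

Because unchanged rows hash to the same value, the merge touches only genuinely new or modified records, which is what makes repeated loads into the dimension table fast.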
This session will equip you with the knowledge to transform your data swamp into a robust and beautiful data lake, ready to handle historical analyses with ease. Let FME be your guide in making your data lake future-proof!