Data Preparation: Data Cleansing

By Jan-Willem Middelburg
Uncover the secrets of effective data preparation and learn the art of data cleansing. Understand how to enhance the quality of your data, ensuring that your analytics and insights are built on a robust foundation.
Data preparation is a critical phase in the data lifecycle, encompassing the cleaning, standardizing, and enriching of raw data. It is the foundation for effective analytics and data science, ensuring that the insights derived are both accurate and reliable. A significant portion of a data professional’s time, often exceeding 80%, is spent finding, cleansing, and organizing data. This underscores the pivotal role that data preparation, and specifically data cleansing, plays in the overall data management process.
At the core of data preparation lies data cleansing (also called data cleaning): the process of identifying corrupt or inaccurate entries in a record set, table, or database and then correcting or, in some cases, removing them. As the linchpin of data quality assurance, data cleansing is essential for mitigating errors and ensuring the integrity of your datasets.
In this webinar, Jan-Willem will explain the key objectives of data cleansing and the purpose of the operations typically used in a data cleansing process.
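As a rough illustration of what such operations can look like in practice, the Python sketch below standardizes, corrects, removes, and deduplicates a handful of records. It is not taken from the webinar: the field names (name, email, age), the validity rules, and the helper functions clean_record and clean_records are all assumptions chosen for the example.

```python
from typing import Optional


def clean_record(record: dict) -> Optional[dict]:
    """Return a corrected copy of the record, or None if it cannot be repaired.

    Field names and validity rules are hypothetical, for illustration only.
    """
    # Standardize: trim whitespace and normalize the case of the email address.
    name = str(record.get("name") or "").strip()
    email = str(record.get("email") or "").strip().lower()

    # Rectify: the age may arrive as a string such as " 36 ".
    try:
        age = int(str(record.get("age")).strip())
    except ValueError:
        return None  # unreadable value, so the entry is dropped

    # Remove: entries that are still inaccurate after correction.
    if not name or "@" not in email or not 0 <= age <= 120:
        return None
    return {"name": name, "email": email, "age": age}


def clean_records(records: list) -> list:
    """Clean every record and drop duplicates (same email after standardizing)."""
    cleaned, seen = [], set()
    for record in records:
        fixed = clean_record(record)
        if fixed is None or fixed["email"] in seen:
            continue
        seen.add(fixed["email"])
        cleaned.append(fixed)
    return cleaned


raw = [
    {"name": " Ada ", "email": "ADA@example.com", "age": "36"},
    {"name": "Ada", "email": "ada@example.com ", "age": 36},    # duplicate
    {"name": "Bob", "email": "bob-at-example.com", "age": 41},  # invalid email
    {"name": "Eve", "email": "eve@example.com", "age": "n/a"},  # unreadable age
]
cleaned = clean_records(raw)
print(cleaned)
```

Of the four sample records, only the first survives: the second is a duplicate once its email is standardized, and the other two fail validation. Whether to drop a bad entry or flag it for manual review is a design choice that depends on the dataset; this sketch simply drops it.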
About the Speaker
Jan-Willem Middelburg is the author of the Enterprise Big Data Framework publications.
Find more articles, case studies, and webinars in the Big Data Knowledge Base.