This is part of the GDPR journey that every organization needs to take, which includes the following steps:
Data discovery and identification
Automate the discovery of data assets and the cataloging of their metadata. Personal and customer data tends to be spread across relational databases, archived records in a data lake or warehouse, mainframes, and distributed file system stores (e.g., Hadoop).
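As a rough illustration of automated discovery, the sketch below samples rows from a data store and flags columns whose values mostly look like personal data. The pattern set, the `classify_columns` name, and the 0.8 threshold are assumptions made for this example; real discovery tools rely on much richer rules and on the metadata of each store.

```python
import re

# Hypothetical patterns for illustration only; not an exhaustive PII rule set.
PII_PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "phone": re.compile(r"^\+?[\d\s().-]{7,20}$"),
}

def classify_columns(rows, threshold=0.8):
    """Return {column: pii_type} for columns whose sampled values mostly match a pattern."""
    if not rows:
        return {}
    findings = {}
    for column in rows[0]:
        # Ignore missing and empty values so sparse columns are judged on real data.
        values = [str(r[column]) for r in rows if r.get(column) not in (None, "")]
        if not values:
            continue
        for pii_type, pattern in PII_PATTERNS.items():
            hits = sum(1 for v in values if pattern.match(v))
            if hits / len(values) >= threshold:
                findings[column] = pii_type
                break
    return findings

# Hypothetical sample rows, as they might come back from a table scan.
sample = [
    {"id": 1, "contact": "ana@example.com", "notes": "called twice"},
    {"id": 2, "contact": "bo@example.org", "notes": "prefers email"},
]
print(classify_columns(sample))  # {'contact': 'email'}
```

The findings would then be written to the catalog so that later steps know which columns hold personal data.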
Catalog external data sources
Vendor and third-party data is often collected using different business processes than internal data, and may be used to enhance personal information through record matching and additional processing.
Automatically discover and document data flow and lineage
Documenting how the various Critical Data Elements (CDEs) flow within the enterprise, along with their lineage, is essential for ascertaining where customer data moves.
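One simple way to picture lineage is as a directed graph from source systems to the systems that receive their data. The sketch below is a minimal example under that assumption; the system names and the `downstream` helper are hypothetical, and a real lineage tool would derive the edges automatically from ETL jobs, queries, or logs.

```python
from collections import deque

# Hypothetical lineage edges: each system maps to the systems that receive its data.
LINEAGE = {
    "crm.customers": ["warehouse.dim_customer", "marketing.email_list"],
    "warehouse.dim_customer": ["reports.churn_dashboard"],
    "marketing.email_list": [],
    "reports.churn_dashboard": [],
}

def downstream(lineage, start):
    """Breadth-first walk returning every system reachable from `start`, in discovery order."""
    seen = []
    queue = deque(lineage.get(start, []))
    while queue:
        node = queue.popleft()
        # Skip nodes already visited (and the start itself) so cycles terminate.
        if node in seen or node == start:
            continue
        seen.append(node)
        queue.extend(lineage.get(node, []))
    return seen

print(downstream(LINEAGE, "crm.customers"))
```

Walking the graph this way answers the question "where does this customer data end up?" for any starting system.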
Implement a metadata repository and layer
Customer and prospect data may be spread across many data stores. A metadata layer abstracts the different data sets and makes it possible to apply the proper restrictions to those that contain personally identifiable data.
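The idea of a metadata layer can be sketched as a small catalog that tags columns across stores and is consulted before data is returned. Everything named here (`ColumnMeta`, `CATALOG`, the `"pii"` tag, `mask_record`) is an assumption made for illustration, not the interface of any particular product.

```python
from dataclasses import dataclass, field

@dataclass
class ColumnMeta:
    store: str
    table: str
    column: str
    tags: set = field(default_factory=set)

# Hypothetical catalog entries spanning two different data stores.
CATALOG = [
    ColumnMeta("postgres", "customers", "email", {"pii"}),
    ColumnMeta("postgres", "customers", "signup_date"),
    ColumnMeta("hdfs", "prospects", "phone", {"pii"}),
]

def restricted_columns(catalog, table):
    """Names of the columns in `table` that the catalog tags as personal data."""
    return {c.column for c in catalog if c.table == table and "pii" in c.tags}

def mask_record(catalog, table, record, allow_pii=False):
    """Return a copy of `record` with tagged columns masked unless the caller is allowed to see them."""
    if allow_pii:
        return dict(record)
    hidden = restricted_columns(catalog, table)
    return {k: ("***" if k in hidden else v) for k, v in record.items()}

row = {"email": "ana@example.com", "signup_date": "2018-05-25"}
print(mask_record(CATALOG, "customers", row))
```

Because the restriction is driven by tags in the catalog rather than by each store, the same rule applies whether the column lives in a relational database or in a distributed file system.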