Customer Data Management (CDM) ensures that all customer data in a data warehouse, data lake or data mart are available, accurate and fit for use. In view of data’s importance to a modern bank, many banks have appointed a senior business executive to be responsible for it.
CDM involves processes that audit data quality and undertake data cleansing activity to maintain and improve its integrity so that it is always available for use.
CDM should always start with creating quality data in the source systems that customers and employees use on a day-to-day basis, rather than trying to correct data in data warehouses, data lakes and data marts, where it may already have been modified. Having a data quality culture, where employees understand the importance of complete and accurate data, is crucial.
Customer data wasn’t always regarded as important, let alone crucial, to a bank’s reputation or economic success. Only the customer’s name and address were of any real importance, especially if they were borrowing money.
When processes were manual, data completeness or integrity wasn’t the main concern. The data stayed on a piece of paper that was filed and rarely examined until it was shredded. In the early years of computerisation, few systems used any sort of real-time validation; any checking was manual, and the prime concern was that all the fields were filled – with something.
As a result, where the employee didn’t want to ask the customer’s age, they might input ‘1 January 1900’ to bypass the question. Left unchecked, this can result in customers over 80 years old being recorded, according to the bank’s data, as still living with their parents. Possible, but unlikely.
Or the employee might omit the customer’s second name when transcribing the product application form into the computer, because that was how the customer was known to them. As systems were product-based, with no real-time customer matching, the system opened a new product for a seemingly new customer. When the bank later created a customer-based platform, it ended up with two profiles for the same customer. When that customer signed up for Internet banking, they’d complain that they couldn’t see all their products.
Computers sometimes had difficulty with unusual spellings. Doctor De’Ath, a not uncommon name, whose family originated from the village of Ath in Belgium, could become Doctor Death, which is embarrassing for all concerned.
Some disgruntled employees changed customers’ names to Disney cartoon characters or something more derogatory if the employee thought the customer complained a lot. Unaware of how data were used, customers were sometimes very annoyed to receive direct marketing that insulted them. It usually resulted in them closing their account.
Some of these mistakes arose simply because employees didn’t use computers at home or didn’t understand that the bank was using the data they input for automated direct marketing. Naively, banks assumed data would be transcribed accurately and honestly, and often found out through bad publicity that this wasn’t the case.
Banks instituted regular data completeness and cleansing exercises, but these proved unpopular and difficult to manage. Eventually they implemented ‘fuzzy logic’ based checking and cleaning software and real-time validation, with a dictionary of forbidden words.
Online account opening has mitigated some of these issues, as it is now the customer that gets their own information wrong!
Customer Data Management also involves ensuring data quality by undertaking data audits and cleansing activity. These should be automated but can be manual or semi-manual and should focus on correcting data in source systems, rather than data warehouses, data lakes or marts that use the data.
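As an illustration only, a minimal automated audit might scan customer records in a source system for the kinds of problems described earlier: missing mandatory fields and sentinel values such as a default date of birth. The field names, records and rules below are hypothetical, not a prescribed implementation.

```python
from datetime import date

# Hypothetical customer records pulled from a source system.
customers = [
    {"id": 1, "first_name": "Ann", "surname": "Smith",
     "date_of_birth": date(1985, 4, 12), "postcode": "SW1A 1AA"},
    {"id": 2, "first_name": "Bob", "surname": "",
     "date_of_birth": date(1900, 1, 1), "postcode": ""},
]

SENTINEL_DOB = date(1900, 1, 1)  # the classic 'fill something in' default
MANDATORY_FIELDS = ["first_name", "surname", "date_of_birth", "postcode"]

def audit(record):
    """Return a list of data quality issues found in a single record."""
    issues = []
    for field in MANDATORY_FIELDS:
        if not record.get(field):
            issues.append(f"missing {field}")
    if record.get("date_of_birth") == SENTINEL_DOB:
        issues.append("suspicious default date of birth")
    return issues

for customer in customers:
    problems = audit(customer)
    if problems:
        print(customer["id"], problems)  # flag record for correction at source
```

An audit like this would run against the source system, so that any corrections flow downstream to the warehouse, lake or mart rather than being patched there.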
The process involves removing typographical errors or correcting information against known good values, or alternatively using ‘fuzzy logic’ or ‘approximate string matching’ to correct records that partially match existing verified records. This may include rejecting any post code or zip code that does not match the most current post office list.
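A minimal sketch of approximate string matching is shown below, using Python’s standard difflib to correct a field against a list of verified values. The reference list and similarity threshold are illustrative assumptions, not a specific product or algorithm the text prescribes.

```python
import difflib

# Hypothetical list of verified street names from a known-good source.
VERIFIED_STREETS = ["High Street", "Station Road", "Church Lane", "Victoria Avenue"]

def correct_street(raw_value, threshold=0.8):
    """Return the closest verified street name, or None if nothing is close enough."""
    matches = difflib.get_close_matches(raw_value, VERIFIED_STREETS, n=1, cutoff=threshold)
    return matches[0] if matches else None

print(correct_street("High Stret"))       # -> 'High Street'
print(correct_street("Hgih Street"))      # -> 'High Street'
print(correct_street("Nonexistent Way"))  # -> None: flag for manual review
```

In practice the threshold matters: too low and unrelated records are silently ‘corrected’, too high and genuine near-matches are left for manual review.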
Data harmonisation or normalisation means bringing together varying file formats or naming conventions, for example, transforming abbreviations such as ‘St’ or ‘Rd’ into ‘Street’ or ‘Road’ respectively, so that the data are held in a common, consistent form.
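As a simple sketch of the abbreviation example above, the mapping below expands common street-name abbreviations into one agreed form; the mapping itself is an assumption for illustration, and a real implementation would need to handle cases such as ‘St’ meaning ‘Saint’.

```python
# Illustrative mapping of common abbreviations to an agreed standard form.
ABBREVIATIONS = {
    "St": "Street",
    "Rd": "Road",
    "Ave": "Avenue",
}

def harmonise_address(address):
    """Expand known abbreviations so addresses are held in one common form."""
    words = address.replace(".", "").split()
    expanded = [ABBREVIATIONS.get(word, word) for word in words]
    return " ".join(expanded)

print(harmonise_address("12 High St"))     # -> '12 High Street'
print(harmonise_address("4 Station Rd."))  # -> '4 Station Road'
```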