It began last January with a column I wrote titled “The little dirty data secret,” which shone a light on the issue of data quality — or lack thereof — within many organizations. I called it the dirty little secret because of a reluctance on the part of many to even acknowledge they had a problem.
I spoke about it with Greg Brown at data quality solution provider Melissa, who said that for many organizations, poor data quality is simply “the cost of doing business.” We discussed the issue in their California offices, and from that conversation the SD Times Data Quality Project was born.
After our initial meeting, Brown worked with us to create a survey asking developers how responsible they were for data quality and what role they played in ensuring the data going into their applications was of high quality. In short, more than half of the 202 respondents said they were involved in data quality input, data quality management, choosing validation APIs or API data quality solutions, and data integration.
To help us better understand the issue, Brown described six defined dimensions of data quality. The standard measures of data are accuracy, timeliness, consistency, validity, uniqueness and completeness.
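To make those dimensions concrete, here is a minimal sketch of how several of them might be expressed as record-level checks. This is not drawn from Melissa's products or the survey; the field names, rules, and thresholds are hypothetical, chosen purely for illustration.

```python
import re
from datetime import datetime, timedelta, timezone

def check_record(record, seen_ids, required=("id", "email", "updated_at")):
    """Evaluate one record against several of the six data quality
    dimensions. Field names and rules are hypothetical examples."""
    problems = []
    # Completeness: every required field is present and non-empty.
    for field in required:
        if not record.get(field):
            problems.append(f"incomplete: missing {field}")
    # Validity: the email field matches a simple well-formed pattern.
    email = record.get("email", "")
    if email and not re.match(r"^[^@\s]+@[^@\s]+\.[^@\s]+$", email):
        problems.append("invalid: malformed email")
    # Uniqueness: the id has not appeared in an earlier record.
    rid = record.get("id")
    if rid in seen_ids:
        problems.append(f"duplicate: id {rid}")
    elif rid:
        seen_ids.add(rid)
    # Timeliness: the record was updated within the last 90 days
    # (an arbitrary window for this sketch).
    updated = record.get("updated_at")
    if updated and datetime.now(timezone.utc) - updated > timedelta(days=90):
        problems.append("stale: not updated in 90 days")
    return problems

seen = set()
clean = check_record(
    {"id": 1, "email": "a@example.com",
     "updated_at": datetime.now(timezone.utc)}, seen)
dirty = check_record(
    {"id": 1, "email": "not-an-email",
     "updated_at": datetime.now(timezone.utc)}, seen)
print(clean)  # []
print(dirty)  # ['invalid: malformed email', 'duplicate: id 1']
```

Accuracy and consistency are harder to check in isolation — they typically require comparison against a trusted reference source or across systems, which is part of why dedicated validation APIs exist.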
We asked people who took the survey if they would share their stories, and several agreed. Those stories will appear on sdtimes.com in the coming weeks. We’ll hear where their data problems exist, what they’re doing to remediate them, and where they stand now. Among the issues we’ll cover are data integrity, poor documentation, a lack of training for dealing with data, cleaning and optimization, and data management.
In a world where the amount of data organizations handle has exploded, maintaining quality takes more time and effort than ever before. We’re looking forward to bringing you stories from the field of how organizations today are managing. Join us for the journey.