Data Harmony: Clean & Organize
In the music industry, we are creating data faster than ever before. We’re also attempting to use it across many areas, from insights to attribution and accurate royalty disbursement. It’s well known that the data is often incomplete and inconsistent, or riddled with typos and duplications, leading to inefficiencies and inaccurate analysis. Ultimately, this becomes a huge drain on time and resources.
I often hear talk about data cleaning, but these initiatives tend to fail when they aren’t paired with a strategy for keeping new, incoming data high quality as well.
What is data cleaning?
Data cleaning is the act of correcting inaccurate, incomplete, or inconsistent data. It’s preferable that the data set is cleaned in the source database, but it’s also important to evaluate all downstream systems for impact and synchronization.
The goal is for data to be:
Complete: Not missing any essential components (for example: release date, writer name, etc.)
Consistent: The data contained within the set is formatted the same way (for example: date format)
Correct: Free of outliers (for example: typos and duplications)
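As a rough illustration, the three goals above can be expressed as simple checks. This is a minimal sketch, not a definitive implementation: the field names (title, writer_name, release_date), the choice of YYYY-MM-DD as the agreed date format, and the function names are all assumptions made for the example. The "correct" check here only catches duplicates; typos usually need reference data or a human eye.

```python
import re
from datetime import datetime

# Hypothetical field names; substitute whatever your catalog requires.
REQUIRED_FIELDS = ("title", "writer_name", "release_date")


def normalize(text):
    """Lowercase and collapse whitespace so near-identical entries compare equal."""
    return " ".join(str(text or "").lower().split())


def is_complete(record):
    """Complete: every required field is present and non-blank."""
    return all(str(record.get(field) or "").strip() for field in REQUIRED_FIELDS)


def is_consistent(record):
    """Consistent: the release date is a real date in one agreed format (YYYY-MM-DD)."""
    value = str(record.get("release_date") or "")
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
        return False
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except ValueError:
        return False


def find_duplicates(records):
    """Correct: return (first_index, duplicate_index) pairs for matching title and writer."""
    seen = {}
    duplicates = []
    for index, record in enumerate(records):
        key = (normalize(record.get("title")), normalize(record.get("writer_name")))
        if key in seen:
            duplicates.append((seen[key], index))
        else:
            seen[key] = index
    return duplicates
```

Running checks like these over a data set gives a quick count of how many records miss each goal, which helps size the cleanup before anyone starts fixing things by hand.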
It is often assumed that data cleaning can be done solely through automations and machine learning. To an extent, technology can help, but I find it most useful for volume reduction. Using tech to clean a data set helps you separate the great data from the bad, bringing focus to the ambiguous data in the middle that requires human review.
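One way to picture that volume reduction is a three-way triage: accept what clearly matches, set aside what clearly does not, and send only the middle to a person. The sketch below is an assumption-laden example, not a recommended tool. It compares an incoming value against a hypothetical list of trusted values using Python's standard difflib, and the two thresholds are placeholders you would tune on your own data.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse whitespace before comparing."""
    return " ".join(text.lower().split())


def triage(value, trusted_values, accept_at=0.95, reject_below=0.60):
    """Return (bucket, best_match, score), where bucket is accept, review, or reject.

    The thresholds are illustrative assumptions, not industry standards.
    """
    best_match, best_score = None, 0.0
    for candidate in trusted_values:
        score = SequenceMatcher(None, normalize(value), normalize(candidate)).ratio()
        if score > best_score:
            best_match, best_score = candidate, score

    if best_score >= accept_at:
        return "accept", best_match, best_score
    if best_score < reject_below:
        return "reject", best_match, best_score
    # Everything in between is ambiguous and goes to a human reviewer.
    return "review", best_match, best_score
```

With a trusted list of ["Blue Sky", "Red River"], an entry of "blue sky" is accepted, "Blue Sku" lands in review with "Blue Sky" as the suggested match, and an unrelated string is rejected. Only the review bucket needs a person's time.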
Data cleaning initiatives aren’t easy, especially in music, where very little is standardized. They can be slow, cumbersome, and resource intensive. A member of my former team once projected that cleansing a particular data set was going to take 107 years, based on existing resources, pace, and the level of research needed. It was a number so crazy that we couldn’t help but laugh, until we discovered that the C-suite had gotten wind of it!
I’ve led many data cleaning initiatives over the years, including confirming dates, revising taxonomy, migrating and ingesting catalogs, fixing typos, and filling in missing details. In my observation, a cleaning initiative is only worthwhile when it is accompanied by an organized data strategy. Otherwise, you are likely not cleaning as fast as new data arrives, and you won’t make the desired impact.
How can we make progress?
Imagine you were trying to organize your closet, emptying drawers and assessing what to keep versus discard, removing stained clothes or things that don’t fit. Meanwhile, somebody keeps throwing more clothes on top of the pile you’re looking through. You’d probably tell them to stop bringing things so you could sort through what you have first.
Obviously, it’s unrealistic that we would stop releasing music so that we can clean up the data. That doesn’t mean one can’t take a proactive approach to organizing inbound data so that it stops adding to the pile and impeding progress.
Let’s go back to the pile of clothes. Maybe the person adding to the pile could bring you the clothes pre-sorted by size and type of item (shirt, pants, shoes, etc.). Maybe they’re even able to arrange it by color and confirm whether a tag is attached before bringing it to you. It would be a lot easier, and quicker, to make sure it gets organized into the right place without a lot of additional review.
Data cleaning may mean reviewing what’s already there, but you’re unlikely to see the true benefit without also implementing a strategy to ensure new items are of good quality and immediately put in their proper place.
Data culture matters
Dirty data often starts at the earliest moments in the music process—it’s a management team providing incomplete label copy, the intern entering a title with a typo, the migrated catalog from 30 years ago, or the independent artist who doesn’t know what data they’re supposed to provide for release (to name just a few).
I’m not suggesting that everybody in the industry will be knowledgeable and agree on all data elements, but we could be doing more.
To foster a data culture within an organization:
Train and empower employees (at all levels) with the skills and tools to use data
Ensure transparency and dialogue around data, where data issues can be highlighted without judgement
Agree upon, and document, a cohesive set of data rules for all employees to follow (bonus if you can build safeguards into the systems you are using)
Build data-forward workflows to maintain the integrity of data as it moves throughout your business
Highlight the value of data and the role that each employee plays in maintaining it
In the independent sector, there are still ways to help creatives understand critical data components through the tools that they use, such as building data checks into the self-distribution process and expanding efforts such as the ASCAP Data Health Check. Equalizer also offers plenty of support in this area.
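To make the idea of documented rules and built-in data checks concrete, here is a minimal sketch of an intake safeguard. It assumes a hypothetical rule set and field names (title, release_date); the point is only that each rule is written once in plain language and paired with a check, so the person submitting data sees exactly which rule was missed before the record enters the system.

```python
import re

# Hypothetical rule set: each documented rule is paired with the check that enforces it.
DATA_RULES = [
    (
        "Title is required",
        lambda record: bool(str(record.get("title") or "").strip()),
    ),
    (
        "Release date uses YYYY-MM-DD",
        lambda record: bool(
            re.fullmatch(r"\d{4}-\d{2}-\d{2}", str(record.get("release_date") or ""))
        ),
    ),
    (
        "Title has no leading or trailing spaces",
        lambda record: str(record.get("title") or "")
        == str(record.get("title") or "").strip(),
    ),
]


def violations(record):
    """Return the plain-language rules this record breaks, in rule order."""
    return [description for description, check in DATA_RULES if not check(record)]
```

An empty list means the record can proceed. Anything else can be shown back to the submitter at the point of entry, which is far cheaper than finding the same problem during a cleanup years later.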
The music tech addition
New music companies, whether in the music tech space or not, could really make a difference here. They have a rare opportunity to start with a blank slate, defining their internal data standards, methods to monitor quality and compliance, and ways to identify and quickly verify outliers in the data sets. This is also a great time to think about whether the planned processes for implementation are supporting good data practices from the start. Otherwise, they may find themselves a few years in and back in the never-ending, and overwhelming, cycle of data cleaning without sufficient forward progress.
After all, a brand new closet would ideally be designed to properly house what already exists and is known, while leaving space for new things to be added.
Look both ways
Data cleaning is more than a chore; it’s a strategic advantage. Combined with a well-structured organization strategy, it can set up your music company, or music career, for long-term success.
Ready to take your data to the next level? Whether you’re just starting to tackle a messy catalog or looking to build better workflows, Equalizer Consulting can help. Let’s work together to build a strong foundation for your data. Contact us today to learn more.