What should your nonprofit do about "bad" data? | The Data Coach | Data Management series

Transcript:

Hi there. My name is Lindsay, I'm The Data Coach, and today we're going to talk about what to do about bad quality data at nonprofit organizations. So normally when folks talk about bad data, they could be talking about a few different things.

They could be talking about old or outdated information. They could be talking about missing information. Sometimes the concern is duplicate records in a system, which screws up all the counting. And then other times we're talking about mistakes or inconsistencies in data entry, which also leads to problems with reporting.

But no matter the root cause, when nonprofits come to me to talk about bad data, what I hear is, "We can't trust the data we have to give us the information that we need." And nobody wants to use untrustworthy data to make decisions about operations, about programs, and especially not to try [00:01:00] to demonstrate organizational impact.

So what are some things that you can do to start building trust in your data again?

If your top concern is old or outdated information, you have a few different options. So, a lot of the time, concern about old information has to do with contact information, right? Because our programs and our fundraising campaigns and our events can't really be successful if we are sending out information to all of the wrong places.

So if you are concerned about address information, one thing your organization can do is get access to the National Change of Address database that is run by the U.S. Postal Service. And I will leave a link in the description box below if you're interested in learning more about that.

Another thing that you can do is if you have multiple systems at your organization, and there's some overlap in the [00:02:00] people whose information is living in those systems, you can cross check records and make updates based on which record has the most up-to-date information. And then finally, take advantage of opportunities throughout the year to invite people to verify and update their information.

That can be through program signups, that can be through event signups, it can be through donor forms. And depending on how confident you feel in your email and phone data, you can even send out quick surveys throughout the year just asking people to confirm what you have and update as needed.
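For teams comfortable with a little scripting, the cross-checking idea above can be sketched in Python. This is a minimal sketch, not a prescribed method: it assumes each system can export records with an `email` to match on and an `updated` date, and the field names, the `merge_latest` helper, and the sample records are all made up for illustration.

```python
from datetime import date

def merge_latest(*systems):
    """For each email address, keep the record with the most recent `updated` date."""
    latest = {}
    for records in systems:
        for rec in records:
            # Normalize the matching key so "Bonnie@..." and "bonnie@..." line up.
            key = rec["email"].strip().lower()
            if key not in latest or rec["updated"] > latest[key]["updated"]:
                latest[key] = rec
    return latest

# Hypothetical exports from two systems that both hold the same person.
crm = [{"email": "bonnie@example.org", "address": "12 Oak St", "updated": date(2021, 3, 1)}]
events = [{"email": "Bonnie@example.org", "address": "98 Pine Ave", "updated": date(2023, 9, 15)}]

merged = merge_latest(crm, events)
print(merged["bonnie@example.org"]["address"])
```

Matching on email is only one possible choice. If your systems share a constituent ID, that is usually a safer key.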

If you are struggling with duplicate records, and you are using a CRM in your organization, it's very likely that that CRM already has a duplicate management system built in for you to use. If you're not sure how to use that system or where it lives in your database, best thing to do is to contact your help desk to get more [00:03:00] detailed instructions on how to start using that feature.

If you are worried about duplicate records and you are working in an Excel spreadsheet, you have a couple of different options. Let me show you in the demo.

Okay. This is like the 15th time I tried to do this. Um, if it doesn't work this time, then we're just going to, we're going to take a nap. I don't know. Okay. So what you're looking at here is some very, very fake data. And as I looked through my very fake data, I noticed that I might have some duplicate records in here.

First I'm going to try using Excel's de-dupe function. We're going to highlight all of our data. We're going to head on over to the Data tab. Go to "Remove duplicates" and then I get this pop up. This pop up is basically Excel asking me, "What column should I use to try to find your duplicate records?"

So let's say I think that they're all in the first name. I click okay. [00:04:00] And I get this pop up. Three duplicate values were removed, nine unique values remain. Alright, well let me just double check on that to make sure. I go back over and I notice that well, I still have a Lochan Stark and a Lachlan Stark, and those two records look really similar to each other.

So, maybe I didn't do something right. Another alternative is to use conditional formatting. Again, we're going to highlight our data. This time, we're going to go to the Home tab, head on over to Conditional Formatting, select Highlight Cells Rules, and go all the way to the bottom, Duplicate Values.

So this pop up is Excel telling me something. And it's telling me that every record that I might think is a duplicate, I'm going to highlight in this red color. And, you know, you've got a few different options if you want to pick yellow or green. I'll stick with the red for now. I'll click OK. And now you'll notice I have way more potential duplicates.

[00:05:00] So let's take a look at them. First, I'll sort my first name data from A to Z, and I can already see problems. I have two Bonnies here: Brewer with an A, Brewer with an E. I know E is the correct entry, so I'm going to take out this duplicate. I can do the same thing for my friend Bruno here. I have Hoffman with two F's and Hoffman with one F, so I'm going to take out the one with one F.

Stark has been highlighted, that error that I caught before. I know that person's name is Lochlan, so I'll take this one out of here. And then for Tyrese, I have another last name problem. Coombs with two O's and then Combs with one O. I know this last one is right, so I will take this one out. And now all of my duplicate records have been removed.
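If the data lives outside Excel, a similar "flag it, then let a person decide" pass can be sketched in Python. This is a rough sketch under stated assumptions: it uses the standard library's `difflib.SequenceMatcher` to score how similar two full names are, the `0.85` threshold is an arbitrary starting point rather than a recommended value, and the names are made-up records modeled on the demo. It only flags likely duplicates. Someone who knows the correct spelling still has to choose which record to keep.

```python
from difflib import SequenceMatcher
from itertools import combinations

def similarity(a, b):
    """Return a 0-to-1 score for how alike two names are, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def likely_duplicates(names, threshold=0.85):
    """Return every pair of names whose similarity meets the threshold."""
    return [
        (a, b)
        for a, b in combinations(names, 2)
        if similarity(a, b) >= threshold
    ]

# Made-up names with near-duplicate spellings.
names = [
    "Bonnie Brewer", "Bonnie Brewar",
    "Bruno Hoffman", "Bruno Hofman",
    "Tyrese Coombs", "Tyrese Combs",
    "Ada Park",
]

pairs = likely_duplicates(names)
for a, b in pairs:
    print(f"Check these two: {a} / {b}")
```

An exact-match de-dupe treats these spellings as different values, which is why near-duplicates like these can survive it.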

Finally, if your big concern is inconsistencies in data entry or mistakes in the [00:06:00] system, gather your team together to figure out where in the database those mistakes are most common, decide how you're going to fix them, and then be sure to document somewhere how the data should be entered moving forward.

Let me give you an example. So say there's an organization that's tracking the number of events people attend throughout the year, and you notice that some people are entering that information as a numeral, like "2," and then other people are entering it as a word, like "five." So you get together and you decide that to make counts easier from this point forward, you're going to enter that kind of information as a numeral, like your "2."

You might assign somebody to go back into the system and fix the errors that are there. And then write up, somewhere where everybody can access it, what you have decided to do for data entry going forward so that it's available for easy reference.
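The clean-up step in this example could also be scripted. Here is a minimal Python sketch, assuming the event counts have been exported as text and that small whole numbers are all you need to handle. The `WORD_TO_NUMBER` table and the `to_numeral` helper are hypothetical names made up for illustration. Anything the script does not recognize is returned as `None` so a person can review it instead of the script guessing.

```python
# Hypothetical lookup table; extend it to cover the values your team actually sees.
WORD_TO_NUMBER = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
}

def to_numeral(value):
    """Convert an entry like '2' or 'five' to an int, or None if unrecognized."""
    text = str(value).strip().lower()
    if text.isdecimal():
        return int(text)
    # Unknown words fall through to None and get flagged for manual review.
    return WORD_TO_NUMBER.get(text)

entries = ["2", "five", " 3 ", "Seven", "a few"]
cleaned = [to_numeral(e) for e in entries]
print(cleaned)  # [2, 5, 3, 7, None]
```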

[00:07:00]

Once you start your short term fixes, then you can start to think about the root causes of the data quality problems that you're experiencing. One thing that I see all the time is that nonprofits rarely have written guidelines for staff, for volunteers to show them the correct way to do data entry and reporting in their systems.

Now, written guidelines are a really great reference tool, particularly for new staff who are coming in, who are not going to immediately memorize all the processes and procedures right up front. If you don't already have written guidelines, you're going to gather your team together, make some decisions about how you want data to be formatted, create shared meanings around the definitions of fields in the database, and create instructions for what to do and who to ask if there are any entry and reporting problems in the [00:08:00] future. You're going to write all that down and make sure that everybody knows where to find the guidelines for easy reference.

Another issue that I see really frequently is a lack of formal training on the system. So, back in the day when I was first getting started in the nonprofit sector, my database training consisted of standing over somebody's shoulder for like an hour, watching how they enter data into the system, and then hoping that I would remember what to do when it was my turn.

And that's not really a great way to train new people on what to do with all this incredible data that you have. So once you have your guidelines and procedures written up somewhere, be sure to design a more structured, formal training program for new and existing staff members to explain what those guidelines are and provide people with an opportunity to ask questions before they dive into the [00:09:00] system.

The third root cause that I see for data quality problems is data collection tools. Now, this can mean anything from event signups to donor forms to program signups. If you notice, for instance, that you're hosting a lot of events at your nonprofit, but you don't have the email addresses for those people who sign up for that event, you're probably going to want to check your event signup form.

First, you're going to want to see if email address is there in the first place. If it's not, make sure you put that in. And then you're going to want to check if email address is required. Because if it's not, people are more likely to skip that, not provide you with that information. So you can go ahead and make your changes as needed.
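One way to notice this kind of gap early is to count how many signups came in without the field you care about. The following is a small Python sketch, assuming signups can be exported as a list of records. The `missing_field_report` helper, the field names, and the sample records are made up for illustration.

```python
def missing_field_report(records, field):
    """Return (number of records missing the field, total number of records)."""
    missing = [
        r for r in records
        # Treat absent, empty, and whitespace-only values as missing.
        if not (r.get(field) or "").strip()
    ]
    return len(missing), len(records)

# Hypothetical event signups.
signups = [
    {"name": "Bonnie Brewer", "email": "bonnie@example.org"},
    {"name": "Bruno Hoffman", "email": ""},
    {"name": "Tyrese Combs"},
    {"name": "Lochlan Stark", "email": "  "},
]

missing, total = missing_field_report(signups, "email")
print(f"{missing} of {total} signups have no email address")
```

A high share of missing values is a hint that the form either does not ask for the field or does not require it.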

And then finally, remember that data quality checks are not a one off thing. To ensure good data quality in the long term, create a data maintenance schedule so you can do [00:10:00] checks and fix errors on a regular basis. You can use the reports and processes that you use for the short term fixes to find and catch and correct those errors.

And you can do this on a monthly, a bimonthly, quarterly basis, whatever works for you, your team, and your organization.

Now, if you have gotten to this point and you think, "Wow, that's a lot of work," or "We do not have the people power for this," let me encourage you to think about it this way. Do you really want to wait until that grant report is due to find and fix all of the errors in your database?

Do you really want to submit that board report knowing full well that you did not have time to correct the data issues before you submitted it? Or do you really want to be in a situation where a community member or a key stakeholder calls out problems in your annual or advocacy report, because no one took the time to check the data before sharing it with [00:11:00] the public?

Probably not.

Even though it can be challenging, I think it's really helpful to think about these kinds of data management practices as trust building exercises with your communities and your stakeholders. You're showing them that you're taking the time to make sure the data is as accurate and clean as possible.

Because you want to ensure that they can trust you with the information that you share with them. And building trust should always be a priority for any organization.

And again, if your organization is small and you're concerned about people power, remember that you can do this at your own pace. You may decide that you want to start with de-duping, and then later fix some of the missing data issues that you see. Or you decide, let's get our data maintenance calendar together, start implementing that, and then we'll embark on the longer term project of coming up with our written guidelines. Whatever works for [00:12:00] you.

And as always at The Data Coach, we are here to support you with every stage of the data management process. If you would like to schedule a free consult call with me, I will leave my Calendly link in the description. You can pick a day and time that works for you. Also check out the other videos under our data management playlist here on YouTube. And you can also check out our website, follow us on the various socials to get little tips and tricks to help you along the way.

Please like and subscribe if you enjoy this video. It lets me know what content is most helpful for you. Also, we're pretty new, so if you wouldn't mind sharing with your nonprofit friends and colleagues, we'd really appreciate it. If you have any questions or any insights you want to share, feel free to leave them in the comments below and they could become the topic of our next video.

And as always, thank you so much for watching and we'll see you next time. Bye!
