Finding available data can be difficult, but our data services librarian can help with the process. To get started, you can find some available health datasets linked on our guide.
Once you find a dataset, how do you know whether it is credible? Keep the following steps in mind to make sure what you find is useful for your project.
- Look for supporting documentation outlining what the data is, how it was collected, and how to interpret the data.
  - Tip: Look for readme files, data dictionaries or codebooks, and a description of the collection methodology
- Make sure you can open all files associated with the data.
- Ensure that all files are clearly labeled and contain the information or data that the file name indicates.
- Within the data files, check for the following:
  - Variables are clearly labeled with standard naming conventions.
    - Example: First names are labeled as FirstName and last names are labeled as LastName
  - Units of measurement for different variables are explicitly stated.
    - Example: You can tell if measurements are given in centimeters (cm) or inches (in)
  - Each variable contains a discrete unit of information.
    - Example: Blood pressure and zip code are stored in separate columns
  - Variables follow data standards and have consistent formatting.
    - Example: All dates are in yyyy-mm-dd format
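
If you are comfortable with a little code, some of the checks listed above can be automated. The following is a minimal Python sketch rather than a definitive tool. The sample data, the column names VisitDate and HeightCm, the capitalized naming pattern modeled on FirstName, and the helper names is_iso_date and check_table are all assumptions made for illustration; adjust them to match the dataset you are evaluating.

```python
import csv
import io
import re
from datetime import datetime

# Hypothetical sample data; in practice you would read your own CSV file.
SAMPLE_CSV = """FirstName,LastName,VisitDate,HeightCm
Ana,Lopez,2023-04-01,162
Ben,Okafor,04/15/2023,175
"""

# Assumed naming convention: names like FirstName (no spaces or symbols).
NAME_PATTERN = re.compile(r"^[A-Z][A-Za-z0-9]*$")
# Four-digit year, two-digit month, two-digit day.
DATE_PATTERN = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def is_iso_date(value):
    """Return True if value is a real calendar date written as yyyy-mm-dd."""
    if not DATE_PATTERN.match(value):
        return False
    try:
        datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        return False
    return True


def check_table(text, date_columns):
    """Return a list of readable problems found in the CSV text."""
    problems = []
    reader = csv.DictReader(io.StringIO(text))
    for name in reader.fieldnames or []:
        if not NAME_PATTERN.match(name):
            problems.append(
                f"column name '{name}' does not follow the naming convention"
            )
    # Row 1 is the header, so data rows start at 2.
    for row_number, row in enumerate(reader, start=2):
        for column in date_columns:
            value = row.get(column) or ""
            if not is_iso_date(value):
                problems.append(
                    f"row {row_number}: {column} value '{value}' is not yyyy-mm-dd"
                )
    return problems


problems = check_table(SAMPLE_CSV, date_columns=["VisitDate"])
for problem in problems:
    print(problem)
```

In this sample, the second data row stores its date as 04/15/2023, so the script reports it as inconsistent with the yyyy-mm-dd format. A script like this only flags formatting problems; you still need the supporting documentation to judge how the data was collected and what it means.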
Still need help finding and evaluating data? Connect with our data services librarian, who can help you find the data you need.