Over the years I have worked with many people who do the field work to collect data. They have their process and it works for them. However, their work rarely extends beyond data collection. This often means data is collected in Excel. If an observation meets a specific criterion, it is colored yellow; if it meets another criterion, it is colored blue. If there was a problem with the observation, it is set in bold. This is a very easy approach. It takes very little preparation and one can sling data around like it’s nobody’s business.
When analyzing the data, dealing with this on a small scale usually isn’t too difficult. However, on a data-warehouse-sized dataset it becomes an extremely tedious task, if not an impossible one. The data scrubbing and preparation can be extremely difficult. Take, for example, a simple yes or no question. Suppose four people are entering data about whether soil samples contain a specific contaminant. One person enters ‘Y’/‘N’, another enters ‘Yes’/‘No’, and yet another enters ‘y’/‘n’. Worse yet, the fourth person enters 1 for yes and 2 for no. All of a sudden there are a total of eight codes when there should really only be two.
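To make the cleanup cost concrete, here is a minimal Python sketch of what scrubbing that yes/no column might look like after the fact. The mapping table and the function name `normalize_yes_no` are my own illustration, not part of any particular tool, and the sketch assumes you already know that the fourth person meant 1 for yes and 2 for no. Without that knowledge the numeric codes cannot be recovered safely.

```python
# Hypothetical lookup table: every code seen in the data, mapped to one
# canonical value. Keys are lowercase because input is lowercased first.
YES_NO_CODES = {
    "y": True, "yes": True, "1": True,
    "n": False, "no": False, "2": False,
}


def normalize_yes_no(raw):
    """Return True or False for a known code, or None if it is unrecognized."""
    key = str(raw).strip().lower()
    return YES_NO_CODES.get(key)


# The eight codes from the four data-entry styles, plus one stray value.
entries = ["Y", "N", "Yes", "No", "y", "n", 1, 2, "maybe"]
cleaned = [normalize_yes_no(e) for e in entries]

# Anything that came back as None needs a human to look at it.
unrecognized = [e for e, c in zip(entries, cleaned) if c is None]
```

Even in this tiny case someone has to discover every variant, decide what each one means, and follow up on the leftovers. Multiply that by every column in a large dataset and the effort adds up quickly.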
So do some extra work in advance. It’s well worth the time to prepare for data collection and have the data entered cleanly rather than trying to make sense of messy data after the fact.
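As a rough illustration of what "entered cleanly" can mean, the sketch below rejects anything outside an agreed code list at the moment of entry instead of repairing it later. The names `ALLOWED_ANSWERS` and `validate_answer` are hypothetical, and the choice of ‘Y’/‘N’ as the agreed codes is only an assumption for the example.

```python
# Assumed convention for this example: the team agreed on 'Y' and 'N' only.
ALLOWED_ANSWERS = ("Y", "N")


def validate_answer(raw):
    """Return the canonical code, or raise ValueError so the entry is refused."""
    value = str(raw).strip().upper()
    if value not in ALLOWED_ANSWERS:
        raise ValueError(f"expected one of {ALLOWED_ANSWERS}, got {raw!r}")
    return value


# A lowercase 'y' is tolerated and stored as 'Y'; 'Yes' or 1 would be refused.
stored = validate_answer("y")
```

A data entry form that applies a check like this forces the question of which codes are valid to be settled once, before collection starts.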
There are a number of solutions that will work depending on the needs of those entering the data. I mention a few here:
- Custom developed web application. It takes some work and potentially additional development costs, but it can be tailored to the project. I have often taken this approach because it gives me an application built specifically for the job.
- Open source client software. There are many open source software packages that can be used for data entry. Examples include EpiData and EpiInfo.
- Commercial software like Qualtrics. It’s a little pricier, but these software packages offer a lot of features.
- Worst case: use SurveyMonkey