I am always amazed by how many bugs can be found with the simplest of setups. For our (physics simulation) software, I would estimate that half of the bugs I find are on a simple cube without doing anything fancy. However, when I think of testing as more than just finding bugs, as something that can also help drive quality in a holistic way, the challenge of finding good (interesting) input data becomes more important. For this kind of testing the important thing is variety and variation (as well as getting as close to customer data as possible). So how do I find this kind of input data?
There are a few approaches I take to this. First, I can always just create it, and over the last nine years of testing I have of course created many different kinds of input data that I can use.
This naturally leads to the second approach, which is finding data. One of the challenges of having a lot of different files and sets of input is being able to find the one(s) you need in your current situation. This is a search problem, and I have tried a lot of different tools and strategies over the years to address it. I have written custom searching tools that dig into the various data sets I have and provide searchable metadata. I have also tried indexing my machine with Google search and other third-party tools, but over the years I’ve found the simplest solutions are often the best. We now have a simple database that searches for files of certain types, opens them and saves an image of them. This image, along with the file location and a very limited set of metadata about the file, is displayed on an internal web page. There is some very limited text searching available, but the main power of this database is to give a rapid visual idea of what you might expect a particular data set to be good for.
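To make the shape of such a catalog concrete, here is a minimal sketch in Python. It is not our actual tool: the function names, the table layout and the `.png` thumbnail path are all assumptions made for illustration, and the step that opens a file in the application and saves an image of it is only a placeholder comment, since that part depends entirely on the software under test.

```python
import os
import sqlite3
from pathlib import Path


def build_catalog(root, extensions, db_path=":memory:"):
    """Walk `root` and record every file whose extension is in `extensions`.

    Hypothetical helper: the schema and names are illustrative only.
    """
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "path TEXT PRIMARY KEY, name TEXT, size_bytes INTEGER, thumbnail TEXT)"
    )
    wanted = {ext.lower() for ext in extensions}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full = Path(dirpath) / filename
            if full.suffix.lower() not in wanted:
                continue
            # Placeholder: a real tool would open the file in the application
            # here and save a screenshot to this (assumed) thumbnail path.
            thumbnail = str(full.with_suffix(".png"))
            conn.execute(
                "INSERT OR REPLACE INTO files VALUES (?, ?, ?, ?)",
                (str(full), filename, full.stat().st_size, thumbnail),
            )
    conn.commit()
    return conn


def search_catalog(conn, text):
    """Very limited text search: substring match on the file name."""
    rows = conn.execute(
        "SELECT path, thumbnail FROM files WHERE name LIKE ? ORDER BY path",
        (f"%{text}%",),
    )
    return rows.fetchall()
```

A web page would then only need to loop over the rows returned by `search_catalog` and show each thumbnail next to its file location.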
It is quick, dirty and simple, but therein lies its power. It’s leveraging something the human brain is great at (rapid processing of images) to help us testers accomplish something. Sometimes I spend too much time trying to come up with elegant, complete solutions when a quick and simple solution will give most of the value in a fraction of the time. I was struck by this again yesterday when trying to find some customer input data. The first thing I wanted to do was to construct some complex queries to find stuff in the support system, but instead I decided to whip together a quick and dirty script that would search through the shared drive for files with a particular set of keywords in their names. Will this miss a lot of potential matches? Of course! People often don’t name things as I would expect. But did I quickly find a bunch of data that I needed with a minimum investment of effort? Yup.
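For a sense of how little such a script needs, here is a rough sketch in Python. It is not the script I actually wrote; the function name is made up for illustration, and I am assuming that a file counts as a match when any one of the keywords appears in its name, ignoring case.

```python
import os


def find_files_by_keywords(root, keywords):
    """Return paths under `root` whose file name contains any keyword.

    Hypothetical helper: matching is a case-insensitive substring test
    on the file name only, so oddly named files will be missed.
    """
    lowered = [keyword.lower() for keyword in keywords]
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            name = filename.lower()
            if any(keyword in name for keyword in lowered):
                matches.append(os.path.join(dirpath, filename))
    return sorted(matches)
```

Pointing it at the shared drive is then a one-liner along the lines of `find_files_by_keywords(shared_drive_path, ["cube", "contact"])`, where both the path variable and the keywords are placeholders.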
So I guess the moral of the story is that although there may be times when the complex solution is the answer, you might just be able to get by with a simple approach or tool instead. Keep it simple. Be efficient.