I'm going to tell you a story. It's not a common story about research successes. This story is about a research project where I failed to find answers to questions, and why that's okay. From the start, my research partner and I knew the project was a big ask: an open-ended question about whether there was a cause for the jump in SQL Injection (SQLi) attacks over the previous 18 months.
Answering a question like this always involves a large-scale data exploration project. However, we didn't have any sort of project map to guide us. As junior researchers, we were still optimistic, despite the now seemingly obvious probability that we could fail. We were, as one manager put it, "trying to find a needle in a stack of needles."
Over the next few months, we spent time each week working with the data and diving deeper into our research.
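That weekly digging often came down to asking whether the jump we saw was actually anomalous against the preceding baseline. As a minimal sketch of that kind of check (the data and thresholds here are hypothetical, not the actual Akamai dataset), one might flag any week whose alert count exceeds a trailing baseline by a couple of standard deviations:

```python
from statistics import mean, stdev

def spike_weeks(counts, baseline=8, threshold=2.0):
    """Return indices of weeks whose alert count exceeds the
    trailing-baseline mean by `threshold` standard deviations."""
    flagged = []
    for i in range(baseline, len(counts)):
        window = counts[i - baseline:i]          # trailing baseline window
        mu, sigma = mean(window), stdev(window)  # sample mean / std dev
        if sigma and counts[i] > mu + threshold * sigma:
            flagged.append(i)
    return flagged

# Hypothetical weekly SQLi alert counts: flat traffic, then a jump.
weekly = [100, 104, 98, 101, 99, 103, 97, 102, 250, 260]
print(spike_weeks(weekly))  # → [8, 9]
```

A check like this tells you *that* a spike exists, which is the easy part; the open-ended question we faced was *why*, and no amount of thresholding answers that on its own.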
One of the first major lessons learned was to always start a project with an initial checkpoint and a timeframe, to keep yourself from endlessly chasing rabbit holes. The task of determining the cause of the SQLi attacks was so broad that it was painstaking to narrow down ideas and develop more focused questions. It wasn't that we didn't find anything after hours of researching and digging through the data; we did. However, nothing we found conclusively explained why the spike occurred, and none of our findings would support a data-driven, externally published blog post or report.
Another lesson we learned is that if you're getting burned out by an open-ended research project, and you're not actually close to a major breakthrough, perhaps it's not worth the energy and resources being spent. It's okay to call it a day and move on to the next project. Over time, you'll learn to recognize this reality. We were up to the task, but too focused on success, on the pursuit of answers, to remember that failure is always an option. Failure is a fact of life when it comes to data science.
To further reinforce these ideas, I had the pleasure of commiserating with two of Akamai's Senior Researchers, Larry Cashdollar and Kaan Onarlioglu, about research "failures." Larry's takeaway lesson was to set time constraints on projects, because time is a limited resource. He also emphasized the need to keep trying, as one has to fail in order to succeed.
Kaan spoke more at the philosophical level, as much of his background and experience comes from academia. He defines publishable research results in two distinct ways: research can take noisy data or information and find "truth" in it, or research can create knowledge out of nothing (which is harder).
Our industry tends toward the first, which involves shorter-term efforts and more immediate results. Of course, this makes the implicit assumption that there is a truth somewhere to be found. The second is more academic; it can be groundbreaking, but it takes a lot of time and resources.
Failure isn't simply a matter of not finding results versus finding engaging ones. Publishing doesn't necessarily correlate with the quality of the research conducted, so don't confuse the two; they involve separate skill sets. To put it another way, failing to publish does not equate to being a failure as a researcher. Moreover, failing to find concrete answers to a question doesn't mean you've failed either.
For exploratory research, the process tends to improve by doing and by gaining experience over time. With a lot of research there isn't a roadmap, and you need plenty of flexibility, so an agile approach doesn't always work; it isn't easy to define and control the project. There's always something to be found, but it's a matter of how much time you can put in. Research in general is iterative, so data exploration can be a preferred approach: find out what you can, share any findings, and then keep researching and building upon it.
This was, at least, my first major professional experience of research coming up dry; I've been told I'm lucky that's the case. Sometimes you really don't find anything, except that you've hopefully still learned some lessons or gained some insights in the process. In data exploration, there are no guarantees, and you don't know where it's going to take you.
Finally, one of the most important lessons repeatedly reinforced to a pair of disappointed researchers was that this was not a mistake we made, nor had we thoroughly "failed" at our task. We did our best to explore the data, so the time spent was not wasted. The findings simply weren't there this time... but maybe next time they will be.