written by Eric J. Ma on 2016-12-01
This is the computational biologist who knows how to design and execute wet lab experiments and analyze the data they generate. This is the wet lab scientist who knows how to develop software that crunches the massive amounts of data that her experiment generates. Does this person exist?
I'm going to make a bold statement, and then a bold prediction.
This person does not exist. And this person will be in huge demand in the biosciences in the future. And yet this person will turn out to be a team.
I think I'm qualified to make this statement. I've been on both sides of this dichotomy. I was a bench synthetic biologist, failed badly and went into computational evolutionary ecology, and then came back to the bench trying to marry high-throughput experimentation with deep learning. It's really, really tough to handle both at the same time.
Part of it is a matter of context switching. Computation really requires continuous stretches of iterating between coding and thinking. Experiments, by contrast, represent a continuous disruption to focus time. It’s impossible to juggle both in parallel. Moreover, both require month-long stretches in order to have momentum, during which one or the other gets lost. If I code, I lose momentum with my experiments. If I run experiments, I lose momentum with my code. There’s either a cognitive or energy cost to juggling both at a high performance level.
For me, it's a temperament issue. I learned during my three computation years developing algorithms that the quick turnover time of computation, going from hypothesis to falsifying results, absolutely suited my temperament. I absolutely dreaded, and had no patience for, the prospect of a failed hypothesis needing weeks to verify.
I also work best in "staggered sequential mode", not "parallelized mode". Staggered sequential means as the results of an experiment become easier to forecast, I start something new. Parallelized mode means juggling things together even when the results aren't easily forecastable. There's a subtle distinction there. Parallelized mode doesn't work for me, because I get very easily confused between project details.
With computation, that turnover time is often on the order of dozens of minutes at most. With experiments, it's weeks. With my temperament, I would much prefer quick-turnover computational hypothesis testing to painstaking and slow rounds of experimental hypothesis testing.
Yet, as a computation type, I face the same dilemma all the time: I have a hypothesis but no data to test it with. This means going back into the lab, which means confronting the very pace that my temperament would prefer not to face. Alternatively, I can find people with whom I can collaborate.
And that brings me to my point. I don't think there's such a thing as the renaissance researcher. Rather, I think there is such a thing as a renaissance research team. Here, hypothesis generation and testing are a shared activity, and the data generation and analysis are carried out by the respective specialists. It's iterative, and it's undoubtedly going to be slower (at least initially) compared to going solo. But the rewards, I think, are going to be much greater once the team is in a state of ‘flow’.
In some ways, it reminds me of the problem with finding the ideal "data scientist": one who could wrangle large amounts of data, make plots, infer conclusions, create interactive dashboards, and communicate the results to business executives. Nonsense. This unicorn doesn’t exist, and thankfully the industry quickly figured that out, employing data science teams composed of people with complementary skills.
@article{
ericmjl-2016-the-researcher,
author = {Eric J. Ma},
title = {The Renaissance Researcher},
year = {2016},
month = {12},
day = {01},
howpublished = {\url{https://ericmjl.github.io}},
journal = {Eric J. Ma's Blog},
url = {https://ericmjl.github.io/blog/2016/12/1/the-renaissance-researcher},
}