In a previous post, I wrote about the importance of storytelling in science. This is the other side of the story, the darker side of storytelling, where you decide the story in advance rather than letting your data guide it.

A few years ago, I travelled to DC for an NIH study section. As sometimes happens, the day I flew into DC I found myself with a free evening, and I decided to go see a play. Since I only had one evening, I didn’t have the luxury of choosing. I just bought a ticket to whatever play was on that evening, which was a completely unknown play by a completely unknown (to me) playwright. It turned out to be one of the most interesting theatrical experiences of my life.

The play was a murder mystery in three acts. The first act introduces the characters, and one of them gets killed. By the end of the second act, the detective work is in full swing and it turns out that all the main characters would have had some reason to kill the victim. Nothing unusual so far. But then, in the intermission between the second and third acts, somebody acting as the host polls the audience, asking who we think did it. And then, to my astonishment, the third act unfolds and shows us that the killer is exactly the person the audience voted the most likely culprit.


It turned out that the playwright had written several versions of the third act, one for each possible murderer, and the cast only needed to know which one the audience had chosen in order to play the matching version of the final act.

It was a fascinating and fun experience as a playgoer. Yet, whatever you do, make sure you do not end up doing this in your research. It is easy to end up in just this situation if you are not careful. Some bioinformatics platforms report tens or hundreds of “significant” findings, for instance biological processes. In those cases, the user has to scan through endless tables of biological processes and ends up focusing on the few that they think make sense. This is not good science.

If the bioinformatics analysis is done right, it—not you—should identify the biological processes that are truly related to the underlying biology. If you end up selecting the ones that “make sense” out of a longer list, you are not discovering new phenomena, you are just confirming what you know, or want to hear.
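To see why a long list of “significant” processes invites cherry-picking, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not the method of any particular platform: the 1,000 “biological processes” are hypothetical, their p-values are simulated as pure noise, and the Benjamini-Hochberg procedure stands in as one generic, well-known multiple-testing correction. The function and variable names are made up for this example.

```python
import random


def benjamini_hochberg(p_values, alpha=0.05):
    """Return the indices of hypotheses rejected by the Benjamini-Hochberg procedure."""
    n = len(p_values)
    # Rank the hypotheses from smallest to largest p-value.
    order = sorted(range(n), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value is under its rank-scaled threshold.
        if p_values[idx] <= rank / n * alpha:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


def simulate_null_p_values(n_processes, seed=0):
    """Hypothetical data: p-values for processes with no real effect (uniform noise)."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n_processes)]


p_values = simulate_null_p_values(1000)

# Uncorrected: every process with p <= 0.05 is reported as "significant".
raw_hits = [i for i, p in enumerate(p_values) if p <= 0.05]

# Corrected: only processes that survive the Benjamini-Hochberg procedure.
bh_hits = benjamini_hochberg(p_values, alpha=0.05)

print(f"'Significant' without correction: {len(raw_hits)}")
print(f"'Significant' after correction:   {len(bh_hits)}")
```

With no real signal at all, the uncorrected list still contains dozens of entries by chance alone (about 5% of 1,000), which is plenty of material from which to pick a story that “makes sense”. The corrected list can never be longer than the uncorrected one and, for pure noise, is typically empty or nearly so.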

Don’t let your intuition or your preconceived notions drive the results you choose. Don’t look into the mirror to decide what results are interesting. Let the data guide you, and use algorithms that take you straight to the culprit.

 

Want to learn more about how Advaita software can help you tell the story that’s in your data? Get in touch.


About the Author: Sorin Draghici

Dr. Draghici is a Professor in the Department of Computer Science and the head of the Intelligent Systems and Bioinformatics Laboratory at Wayne State University. He also holds a joint appointment in the Department of Obstetrics and Gynecology and is an Associate Dean in Wayne State University's College of Engineering. Dr. Draghici is a senior member of IEEE and an editor of IEEE/ACM Transactions on Computational Biology and Bioinformatics, Protocols in Bioinformatics, Discoveries Journals, Journal of Biomedicine and Biotechnology, and International Journal of Functional Informatics and Personalized Medicine. His publications include two books (“Data Analysis Tools for DNA Microarrays” and “Statistics and Data Analysis for Microarrays Using R and Bioconductor”), 8 book chapters, and over 150 peer-reviewed journal and conference publications, which have gathered over 12,000 citations to date.