We are a leader in the interpretation of high-throughput biomedical data, including variant interpretation, pathway analysis, disease subtype discovery, and integration of multiple data types. Currently, 9 of the top 10 pharma companies rely on Advaita’s state-of-the-art algorithms to solve complex problems. Advaita provides a suite of advanced analysis software to more than 13,000 registered users worldwide: iPathwayGuide, for functional interpretation of genes and proteins; iVariantGuide, for genetic variant analysis; and iBioGuide, a search engine revealing connections between genes, pathways, SNPs, drugs, and more.
Our mission is to bridge the gap between the ability to collect biological data and the ability to interpret it.
Advaita Bioinformatics develops bioinformatics data analysis tools with capabilities to integrate multi-‘omic’ data from a systems biology perspective. Our proprietary computational method, Impact Analysis, considers the type, function, and position of the genes in a pathway and the interactions between them, as opposed to other approaches that treat genes as independent.
Through our $2.2 million NIH STTR Phase II award, we have developed a software platform called Pathway-Guide for the analysis of data from high-throughput microarray and next-generation sequencing experiments. Our next-generation product, iPathwayGuide, is a web-based pathway analysis application based on our Impact Analysis method. We are the only company in the industry that includes other crucial factors about the genes in addition to classical statistics. iPathwayGuide includes unique capabilities such as modeling SNPs, miRNAs and drugs directly on the pathways, and predicting miRNAs that potentially regulate gene expression in your samples based only on mRNA data. Our vision is to create a platform for multi-omic integration and analysis at a systems biology level.
Advaita Bioinformatics is also working on developing innovative tools for subtyping of diseases and patients. These tools address the classic “responder” vs. “non-responder” problem facing 50% of drugs that reach Phase 3 clinical trials. With our technology, we can identify truly homogeneous populations and the mechanisms that distinguish them from other populations.
Our vision is to:
reduce healthcare costs by facilitating efficient decision making for drug development
advance personalized medicine by identifying patients with similar underlying disease mechanisms
empower researchers to make breakthrough discoveries by analyzing multi-omic data from a systems biology perspective
# of Citations
Ontological analysis of gene expression data: current tools, limitations, and open problems.
Bioinformatics 21 (18), 3587-3595
A systems biology approach for pathway level analysis.
Genome Research, 2007, Vol. 17 (10), pages 1537-1545.
Global functional profiling of gene expression.
Genomics 81 (2), 98-104
Reliability and reproducibility issues in DNA microarray measurements.
TRENDS in Genetics 22 (2), 101-109
Data analysis tools for DNA microarrays.
(Book) CRC Press
Profiling gene expression using onto-express.
Genomics 79 (2), 266-270
Onto-tools, the toolkit of the modern biologist: onto-express, onto-compare, onto-design and onto-translate.
Nucleic acids research 31 (13), 3775-3781
Use and misuse of the gene ontology annotations.
Nature Reviews Genetics 9 (7), 509-515
Onto-Tools: New Additions and Improvements in 2006.
Nucleic Acids Research, Vol. 35, pages W206-W211, July 2007.
Statistics and data analysis for microarrays using R and bioconductor.
(Book) CRC Press
Analysis and correction of crosstalk effects in pathway analysis.
Genome Research, 2013, Vol. 23 (9).
A system biology approach for the steady-state analysis of gene signaling networks.
In Proceedings of the Congress on pattern recognition 12th Iberoamerican conference on Progress in pattern recognition, image analysis and applications (CIARP’07).
Bober, P., Tomková, Z., Alexovič, M., Ropovik, I., & Sabo, J. (2019). The unfolded protein response controls endoplasmic reticulum stress-induced apoptosis of MCF-7 cells via a high dose of vitamin C treatment. Molecular biology reports, 1-10.
Renaud, L., Agarwal, N., Richards, D.J., Falcinelli, S., Hazard, E.S., Carnevali, O., Hyde, J. and Hardiman, G., 2019. Transcriptomic analysis of short-term 17α-ethynylestradiol exposure in two Californian sentinel fish species sardine (Sardinops sagax) and mackerel (Scomber japonicus). Environmental pollution, 244, pp.926-937.
Fuentes-González, A.M., Muñoz-Bello, J.O., Manzo-Merino, J., Contreras-Paredes, A., Pedroza-Torres, A., Fernández-Retana, J., Pérez-Plasencia, C. and Lizano, M., 2019. Intratype variants of the E2 protein from human papillomavirus type 18 induce different gene expression profiles associated with apoptosis and cell proliferation. Archives of virology, pp.1-14.
Chakraborty, P., Kuo, R., Vervelde, L., Dutia, B. M., Kaiser, P., & Smith, J. (2019). Macrophages from Susceptible and Resistant Chicken Lines have Different Transcriptomes following Marek’s Disease Virus Infection. Genes, 10(2), 74.
Schubert, M. F., Noah, A. C., Bedi, A., Gumucio, J. P., & Mendias, C. L. (2019). Reduced Myogenic and Increased Adipogenic Differentiation Capacity of Rotator Cuff Muscle Stem Cells. JBJS, 101(3), 228-238.
Bountali, A., Tonge, D. P., & Mourtada-Maarabouni, M. (2019). RNA sequencing reveals a key role for the long non-coding RNA MIAT in regulating neuroblastoma and glioblastoma cell fate. International journal of biological macromolecules.
Bradford, S. T., Ranghini, E. J., Grimley, E., Lee, P. H., & Dressler, G. R. (2019). High-throughput screens for agonists of bone morphogenetic protein (BMP) signaling identify potent benzoxazole compounds. Journal of Biological Chemistry, 294(9), 3125-3136.
Riemondy, K.A., Jansing, N.L., Jiang, P., Redente, E.F., Gillen, A.E., Fu, R., Miller, A.J., Spence, J.R., Gerber, A.N., Hesselberth, J.R. and Zemans, R.L., 2019. Single cell RNA sequencing identifies TGFβ as a key regenerative cue following LPS-induced lung injury. JCI insight.
Valianou, M., Filippidou, N., Johnson, D.L., Vogel, P., Zhang, E.Y., Liu, X., Lu, Y., Jane, J.Y., Bissler, J.J. and Astrinidis, A., 2019. Rapalog resistance is associated with mesenchymal-type changes in Tsc2-null cells. Scientific reports, 9(1), p.3015.
Mathis, N. J., Adaniya, E. N., Smith, L. M., Robling, A. G., Jepsen, K. J., & Schlecht, S. H. (2019). Differential changes in bone strength of two inbred mouse strains following administration of a sclerostin-neutralizing antibody during growth. PloS one, 14(4), e0214520.
Correll, R. N., Grimes, K. M., Prasad, V., Lynch, J. M., Khalil, H., & Molkentin, J. D. (2019). Overlapping and differential functions of ATF6α versus ATF6β in the mouse heart. Scientific reports, 9(1), 2059.
Hariharan, K., Stachelscheid, H., Rossbach, B., Oh, S.J., Mah, N., Schmidt-Ott, K., Kurtz, A. and Reinke, P., 2019. Parallel generation of easily selectable multiple nephronal cell types from human pluripotent stem cells. Cellular and Molecular Life Sciences, 76(1), pp.179-192.
Liu, Y., Lang, T., Jin, B., Chen, F., Zhang, Y., Beuerman, R.W., Zhou, L. and Zhang, Z., 2017. Luteolin inhibits colorectal cancer cell epithelial-to-mesenchymal transition by suppressing CREB1 expression revealed by comparative proteomics study. Journal of proteomics, 161, pp.1-10.
Han, K., Lang, T., Zhang, Z., Zhang, Y., Sun, Y., Shen, Z., Beuerman, R.W., Zhou, L. and Min, D., 2018. Luteolin attenuates Wnt signaling via upregulation of FZD6 to suppress prostate cancer stemness revealed by comparative proteomics. Scientific reports, 8(1), p.8537.
Cortes-Selva, D., Elvington, A. F., Ready, A., Rajwa, B., Pearce, E. J., Randolph, G. J., & Fairfax, K. C. (2018). Schistosoma mansoni infection-induced transcriptional changes in hepatic macrophage metabolism correlate with an athero-protective phenotype. Frontiers in Immunology, 9.
Tang, Q., Zhang, C., Wu, X., Duan, W., Weng, W., Feng, J., Mao, Q., Chen, S., Jiang, J. and Gao, G., 2018. Comprehensive proteomic profiling of patients’ tears identifies potential biomarkers for the traumatic vegetative state. Neuroscience bulletin, 34(4), pp.626-638.
Cardoso, T.F., Quintanilla, R., Tibau, J., Gil, M., Mármol-Sánchez, E., González-Rodríguez, O., González-Prendes, R. and Amills, M., 2017. Nutrient supply affects the mRNA expression profile of the porcine skeletal muscle. BMC genomics, 18(1), p.603.
Basu, A., Munir, S., Mulaw, M.A., Singh, K., Herold, B., Crisan, D., Sindrilaru, A., Treiber, N., Wlaschek, M., Huber-Lang, M. and Gebhard, F., 2018. A novel S100A8/A9 induced fingerprint of mesenchymal stem cells associated with enhanced wound healing. Scientific reports, 8(1), p.6205.
Ogura, Kohei, Kayo Okumura, Yukiko Shimizu, Teruo Kirikae, and Tohru Miyoshi-Akiyama. “Pathogenicity induced by invasive infection of Streptococcus dysgalactiae subsp. equisimilis in a mouse model of diabetes.” Frontiers in microbiology, 9 (2018).
Ortea, I., González-Fernández, M. J., Ramos-Bueno, R. P., & Guil-Guerrero, J. L. (2018). Proteomics study reveals that docosahexaenoic and arachidonic acids exert different in vitro anticancer activities in colorectal cancer cells. Journal of agricultural and food chemistry, 66(24), 6003-6012.
Shi, J., Wang, X., Lyu, L., Jiang, H., & Zhu, H. J. (2018). Comparison of protein expression between human livers and the hepatic cell lines HepG2, Hep3B, and Huh7 using SWATH and MRM-HR proteomics: Focusing on drug-metabolizing enzymes. Drug metabolism and pharmacokinetics, 33(2), 133-140.
Kadzielawa, K., Mathew, B., Stelman, C. R., Lei, A. Z., Torres, L., & Roth, S. (2018). Gene expression in retinal ischemic post-conditioning. Graefe’s Archive for Clinical and Experimental Ophthalmology, 256(5), 935-949.
Shan, S.W., Tse, D.Y.Y., Zuo, B., To, C.H., Liu, Q., McFadden, S.A., Chun, R.K.M., Bian, J., Li, K.K. and Lam, T.C., 2018. Integrated SWATH-based and targeted-based proteomics provide insights into the retinal emmetropization process in guinea pig. Journal of proteomics, 181, pp.1-15.
Jokinen, V., Sidorova, Y., Viisanen, H., Suleymanova, I., Tiilikainen, H., Li, Z., Lilius, T.O., Mätlik, K., Anttila, J.E., Airavaara, M. and Tian, L., 2018. Differential Spinal and supraspinal activation of glia in a rat model of morphine tolerance. Neuroscience, 375, pp.10-24.
Kaur, G., Helmer, R. A., Smith, L. A., Martinez-Zaguilan, R., Dufour, J. M., & Chilton, B. S. (2018). Alternative splicing of helicase-like transcription factor (Hltf): Intron retention-dependent activation of immune tolerance at the feto-maternal interface. PloS one, 13(7), e0200211.
Burger, L. L., Vanacker, C., Phumsatitpong, C., Wagenmaker, E. R., Wang, L., Olson, D. P., & Moenter, S. M. (2018). Identification of genes enriched in GnRH neurons by translating ribosome affinity purification and RNAseq in mice. Endocrinology, 159(4), 1922-1940.
Haenfler, J. M., Skariah, G., Rodriguez, C. M., Monteiro da Rocha, A., Parent, J. M., Smith, G. D., & Todd, P. K. (2018). Targeted reactivation of fmr1 transcription in fragile x syndrome embryonic stem cells. Frontiers in molecular neuroscience, 11, 282.
Wagner, S., Ball, G. R., Pockley, A. G., & Miles, A. K. (2018). Application of omic technologies in cancer research. Translational Medicine Reports, 2(1).
Li, J., Wang, X., Ackerman, W., Batty, A., Kirk, S., White, W., Wang, X., Anastasakis, D., Samavati, L., Buhimschi, I. and Nelin, L., 2018. Dysregulation of Lipid Metabolism in Mkp-1 Deficient Mice during Gram-Negative Sepsis. International journal of molecular sciences, 19(12), p.3904.
Parafati, M., Kirby, R. J., Khorasanizadeh, S., Rastinejad, F., & Malany, S. (2018). A nonalcoholic fatty liver disease model in human induced pluripotent stem cell-derived hepatocytes, created by endoplasmic reticulum stress-induced steatosis. Disease models & mechanisms, 11(9), dmm033530.
Avolio, R., Järvelin, A.I., Mohammed, S., Agliarulo, I., Condelli, V., Zoppoli, P., Calice, G., Sarnataro, D., Bechara, E., Tartaglia, G.G. and Landriscina, M., 2018. Protein Syndesmos is a novel RNA-binding protein that regulates primary cilia formation. Nucleic acids research, 46(22), pp.12067-12086.
Mrowczynski, O.D., Madhankumar, A.B., Sundstrom, J.M., Zhao, Y., Kawasawa, Y.I., Slagle-Webb, B., Mau, C., Payne, R.A., Rizk, E.B., Zacharia, B.E. and Connor, J.R., 2018. Exosomes impact survival to radiation exposure in cell line models of nervous system cancer. Oncotarget, 9(90), p.36083.
Zeinali, M., Murlidhar, V., Fouladdel, S., Shao, S., Zhao, L., Cameron, H., Bankhead III, A., Shi, J., Cuneo, K.C., Sahai, V. and Azizi, E., 2018. Profiling Heterogeneous Circulating Tumor Cells (CTC) Populations in Pancreatic Cancer Using a Serial Microfluidic CTC Carpet Chip. Advanced Biosystems, 2(12), p.1800228.
Rücker, F. G., Dolnik, A., Blätte, T. J., Teleanu, V., Ernst, A., Thol, F., … & Bullinger, L. (2018). Chromothripsis is linked to TP53 alteration, cell cycle impairment, and dismal outcome in acute myeloid leukemia with complex karyotype. haematologica, 103(1), e17-e20.
Zhao, X., Liao, Y., Morgan, S., Mathur, R., Feustel, P., Mazurkiewicz, J., Qian, J., Chang, J., Mathern, G.W., Adamo, M.A. and Ritaccio, A.L., 2018. Noninflammatory changes of microglia are sufficient to cause epilepsy. Cell reports, 22(8), pp.2080-2093.
Kowtharapu, B., Prakasam, R., Murín, R., Koczan, D., Stahnke, T., Wree, A., Jünemann, A. and Stachs, O., 2018. Role of Bone Morphogenetic Protein 7 (BMP7) in the Modulation of Corneal Stromal and Epithelial Cell Functions. International journal of molecular sciences, 19(5), p.1415.
Manigrasso, M. B., Friedman, R. A., Ramasamy, R., D’Agati, V., & Schmidt, A. M. (2018). Deletion of the formin Diaph1 protects from structural and functional abnormalities in the murine diabetic kidney. American Journal of Physiology-Renal Physiology, 315(6), F1601-F1612.
Lee, J., Arisi, I., Puxeddu, E., Mramba, L.K., Amicosante, M., Swaisgood, C.M., Pallante, M., Brantly, M.L., Sköld, C.M. and Saltini, C., 2018. Bronchoalveolar lavage (BAL) cells in idiopathic pulmonary fibrosis express a complex pro-inflammatory, pro-repair, angiogenic activation pattern, likely associated with macrophage iron accumulation. PloS one, 13(4), p.e0194803.
Gallotta, M., Assi, H., Degagné, É., Kannan, S. K., Coffman, R. L., & Guiducci, C. (2018). Inhaled TLR9 Agonist Renders Lung Tumors Permissive to PD-1 Blockade by Promoting Optimal CD4+ and CD8+ T-cell Interplay. Cancer research, 78(17), 4943-4956.
Creeth, H.D., McNamara, G.I., Tunster, S.J., Boque-Sastre, R., Allen, B., Sumption, L., Eddy, J.B., Isles, A.R. and John, R.M., 2018. Maternal care boosted by paternal imprinting in mammals. PLoS biology, 16(7), p.e2006599.
Tjitro, R., Campbell, L. A., Basova, L., Johnson, J., Najera, J. A., Lindsey, A., & Marcondes, M. C. G. (2018). Modeling the Function of TATA Box Binding Protein in Transcriptional Changes Induced by HIV-1 Tat in Innate Immune Cells and the Effect of Methamphetamine Exposure. Frontiers in immunology, 9.
Huang, Q., Sun, M. A., & Yan, P. (2018). Pathway and Network Analysis of Differentially Expressed Genes in Transcriptomes. In Transcriptome Data Analysis (pp. 35-55). Humana Press, New York, NY.
Flores, B. N., Li, X., Malik, A. M., Martinez, J., Beg, A. A., & Barmada, S. J. (2019). An Intramolecular Salt Bridge Linking TDP43 RNA Binding, Protein Stability, and TDP43-Dependent Neurodegeneration. Cell Reports, 27(4), 1133-1150.
Vishnoi, M., Boral, D., Liu, H., Sprouse, M.L., Yin, W., Goswami-Sewell, D., Tetzlaff, M.T., Davies, M.A., Oliva, I.C.G. and Marchetti, D., 2018. Targeting USP7 Identifies a Metastasis-Competent State within Bone Marrow–Resident Melanoma CTCs. Cancer research, 78(18), pp.5349-5362.
Araújo, T., Khayat, A., Quintana, L., Calcagno, D., Mourão, R., Modesto, A., Paiva, J., Lima, A., Moreira, F., Oliveira, E. and Souza, M., 2018. Piwi like RNA-mediated gene silencing 1 gene as a possible major player in gastric cancer. World journal of gastroenterology, 24(47), 5338.
Renaud, L., da Silveira, W.A., Glen, J., William, B., Hazard, E.S. and Hardiman, G., 2018. Interplay Between MicroRNAs and Targeted Genes in Cellular Homeostasis of Adult Zebrafish (Danio rerio). Current genomics, 19(7), pp.615-629.
Iosef, C., Liu, M., Ying, L., Rao, S.P., Concepcion, K.R., Chan, W.K., Oman, A. and Alvira, C.M., 2018. Distinct roles for IκB kinases alpha and beta in regulating pulmonary endothelial angiogenic function during late lung development. Journal of cellular and molecular medicine, 22(9), pp.4410-4422.
Ock, S., Ahn, J., Lee, S.H., Kim, H.M., Kang, H., Kim, Y.K., Kook, H., Park, W.J., Kim, S., Kimura, S. and Jung, C.K., 2018. Thyrocyte‐specific deletion of insulin and IGF‐1 receptors induces papillary thyroid carcinoma‐like lesions through EGFR pathway activation. International Journal of Cancer, 143(10), pp.2458-2469.
Renz, B.W., Tanaka, T., Sunagawa, M., Takahashi, R., Jiang, Z., Macchini, M., Dantes, Z., Valenti, G., White, R.A., Middelhoff, M.A. and Ilmer, M., 2018. Cholinergic Signaling via Muscarinic Receptors Directly and Indirectly Suppresses Pancreatic Tumorigenesis and Cancer Stemness. Cancer discovery, 8(11), pp.1458-1473.
Jiang, J., Shihan, M. H., Wang, Y., & Duncan, M. K. (2018). Lens Epithelial Cells Initiate an Inflammatory Response Following Cataract Surgery. Investigative ophthalmology & visual science, 59(12), 4986-4997.
Colacino, J.A., Azizi, E., Brooks, M.D., Harouaka, R., Fouladdel, S., McDermott, S.P., Lee, M., Hill, D., Madden, J., Boerner, J. and Cote, M.L., 2018. Heterogeneity of human breast stem and progenitor cells as revealed by transcriptional profiling. Stem cell reports, 10(5), pp.1596-1609.
Phillips, E. H., Lorch, A. H., Durkes, A. C., & Goergen, C. J. (2018). Early pathological characterization of murine dissecting abdominal aortic aneurysms. APL Bioengineering, 2(4), 046106.
Hernandez, C., Huebener, P., Pradere, J. P., Antoine, D. J., Friedman, R. A., & Schwabe, R. F. (2018). HMGB1 links chronic liver injury to progenitor responses and hepatocarcinogenesis. The Journal of clinical investigation, 128(6).
Jouan, Y., Patin, E. C., Hassane, M., Si-Tahar, M., Baranek, T., & Paget, C. (2018). Thymic program directing the functional development of γδT17 cells. Frontiers in immunology, 9.
Menon, R., Otto, E.A., Kokoruda, A., Zhou, J., Zhang, Z., Yoon, E., Chen, Y.C., Troyanskaya, O., Spence, J.R., Kretzler, M. and Cebrian, C., 2018. Single-cell analysis of progenitor cell dynamics and lineage specification in the human fetal kidney. Development, 145(16), p.dev164038.
Fu, X., Khalil, H., Kanisicak, O., Boyer, J.G., Vagnozzi, R.J., Maliken, B.D., Sargent, M.A., Prasad, V., Valiente-Alandi, I., Blaxall, B.C. and Molkentin, J.D., 2018. Specialized fibroblast differentiated states underlie scar formation in the infarcted mouse heart. The Journal of clinical investigation, 128(5).
Luo, M., Shang, L., Brooks, M.D., Jiagge, E., Zhu, Y., Buschhaus, J.M., Conley, S., Fath, M.A., Davis, A., Gheordunescu, E. and Wang, Y., 2018. Targeting breast cancer stem cell state equilibrium through modulation of redox signaling. Cell metabolism, 28(1), pp.69-86.
Foote, A.P., Keel, B.N., Zarek, C.M. and Lindholm-Perry, A.K., 2017. Beef steers with average dry matter intake and divergent average daily gain have altered gene expression in the jejunum. Journal of Animal Science.
Worthington, R., Ball, E., Wolf, B. and Takacs, G., 2017. Method to Identify Silent Codon Mutations That May Alter Peptide Elongation Kinetics and Co-translational Protein Folding. In Proteomics for Drug Discovery (pp. 237-243). Humana Press, New York, NY.
Kumar, A., Bicer, E.M., Pfeffer, P., Monopoli, M.P., Dawson, K.A., Eriksson, J., Edwards, K., Lynham, S., Arno, M., Behndig, A.F. and Blomberg, A., 2017. Differences in the coronal proteome acquired by particles depositing in the lungs of asthmatic versus healthy humans. Nanomedicine: Nanotechnology, Biology and Medicine.
Lin, C.K.E., Kaptein, J.S. and Sheikh, J., 2017. Differential expression of microRNAs and their possible roles in patients with chronic idiopathic urticaria and active hives. Allergy & Rhinology, 8(2), pp.e67-e80.
Kumar, A., Terakosolphan, W., Hassoun, M., Vandera, K.K., Novicky, A., Harvey, R., Royall, P.G., Bicer, E.M., Eriksson, J., Edwards, K. and Valkenborg, D., 2017. A Biocompatible Synthetic Lung Fluid Based on Human Respiratory Tract Lining Fluid Composition. Pharmaceutical Research, pp.1-12.
Schatton, D., Pla-Martin, D., Marx, M.C., Hansen, H., Mourier, A., Nemazanyy, I., Pessia, A., Zentis, P., Corona, T., Kondylis, V. and Barth, E., 2017. CLUH regulates mitochondrial metabolism by controlling translation and decay of target mRNAs. J Cell Biol, pp.jcb-201607019.
Todorova, K., Metodiev, M.V., Metodieva, G., Mincheff, M., Fernández, N. and Hayrabedyan, S., 2016. Micro-RNA-204 Participates in TMPRSS2/ERG Regulation and Androgen Receptor Reprogramming in Prostate Cancer. Hormones and Cancer, pp.1-21.
Wang, S., Campos, J., Gallotta, M., Gong, M., Crain, C., Naik, E., Coffman, R.L. and Guiducci, C., 2016. Intratumoral injection of a CpG oligonucleotide reverts resistance to PD-1 blockade by expanding multifunctional CD8+ T cells. Proceedings of the National Academy of Sciences, p.201608555.
Simonik, E.A., Cai, Y., Kimmelshue, K.N., Brantley-Sieders, D.M., Loomans, H.A., Andl, C.D., Westlake, G.M., Youngblood, V.M., Chen, J., Yarbrough, W.G. and Brown, B.T., 2016. LIM-Only Protein 4 (LMO4) and LIM Domain Binding Protein 1 (LDB1) Promote Growth and Metastasis of Human Head and Neck Cancer (LMO4 and LDB1 in Head and Neck Cancer). PloS one, 11(10), p.e0164804.
Wadhwa, R., Nigam, N., Bhargava, P., Dhanjal, J.K., Goyal, S., Grover, A., Sundar, D., Ishida, Y., Terao, K. and Kaul, S.C., 2016. Molecular Characterization and Enhancement of Anticancer Activity of Caffeic Acid Phenethyl Ester by γ Cyclodextrin. Journal of Cancer, 7(13), pp.1755-1771.
Zhou, H., Manthey, J., Lioutikova, E., Yang, W., Yoshigoe, K., Yang, M.Q. and Wang, H., 2016. The up-regulation of Myb may help mediate EGCG inhibition effect on mouse lung adenocarcinoma. Human Genomics, 10(2), p.103.
Klener, P., Fronkova, E., Berkova, A., Jaksa, R., Lhotska, H., Forsterova, K., Soukup, J., Kulvait, V., Vargova, J., Fiser, K. and Prukova, D., 2016. Mantle cell lymphoma‐variant Richter syndrome: Detailed molecular‐cytogenetic and backtracking analysis reveals slow evolution of a pre‐MCL clone in parallel with CLL over several years. International Journal of Cancer.
Colacino, J.A., McDermott, S.P., Sartor, M.A., Wicha, M.S. and Rozek, L.S., 2016. Transcriptomic profiling of curcumin-treated human breast stem cells identifies a role for stearoyl-coa desaturase in breast cancer prevention. Breast Cancer Research and Treatment, pp.1-13.
Kravchenko, D.S., Lezhnin, Y.N., Kravchenko, J.E., Chumakov, S.P. and Frolova, E.I., 2016. Study of Molecular Mechanisms of PDLIM4/RIL in Promotion of the Development of Breast Cancer. Biol Med (Aligarh), 8(2), p.2.
Mitt, M., Altraja, A. and Altraja, S., 2016. Altered Gene Expression Profiles In Human Bronchial Epithelial Cells Exposed To E-Cigarette Liquid: Results From A Genome-Wide Monitoring. In B58. BIG AND BIGGER (DATA): OMICS AND BIOMARKERS OF COPD AND OTHER CHRONIC LUNG DISEASES (pp. A4053-A4053). American Thoracic Society.
Na, Y., Kaul, S.C., Ryu, J., Lee, J.S., Ahn, H.M., Kaul, Z., Kalra, R.S., Li, L., Widodo, N., Yun, C.O. and Wadhwa, R., 2016. Stress chaperone mortalin contributes to epithelial-mesenchymal transition and cancer metastasis. Cancer research, pp.canres-2704.
Westphalen, C.B., Takemoto, Y., Tanaka, T., Macchini, M., Jiang, Z., Renz, B.W., Chen, X., Ormanns, S., Nagar, K., Tailor, Y. and May, R., 2016. Dclk1 Defines Quiescent Pancreatic Progenitors that Promote Injury-Induced Regeneration and Tumorigenesis. Cell Stem Cell, 18(4), pp.441-455.
Eddens, T., Campfield, B.T., Serody, K., Manni, M.L., Horne, W., Elsegeiny, W., McHugh, K.J., Pociask, D., Chen, K., Zheng, M. and Alcorn, J.F., 2016. A Novel CD4+ T-cell Dependent Murine Model of Pneumocystis Driven Asthma-like Pathology. American Journal of Respiratory And Critical Care Medicine, (ja).
Takeda, K., Sriram, S., Chan, X.H.D., Ong, W.K., Yeo, C.R., Tan, B., Lee, S.A., Kong, K.V., Hoon, S., Jiang, H. and Yuen, J.J., 2016. Retinoic Acid Mediates Visceral-specific Adipogenic Defects of Human Adipose-derived Stem Cells. Diabetes, p.db151315.
Williams, K.E., Lemieux, G.A., Hassis, M.E., Olshen, A.B., Fisher, S.J. and Werb, Z., 2016. Quantitative proteomic analyses of mammary organoids reveals distinct signatures after exposure to environmental chemicals. Proceedings of the National Academy of Sciences, p.201600645.
Ortea, I., Rodríguez-Ariza, A., Chicano-Gálvez, E., Vacas, M.A. and Gámez, B.J., 2016. Discovery of potential protein biomarkers of lung adenocarcinoma in bronchoalveolar lavage fluid by SWATH MS data-independent acquisition and targeted data extraction. Journal of Proteomics. 2016 Feb 18.
Lamontagne, J., Mell, J.C. and Bouchard, M.J., 2016. Transcriptome-Wide Analysis of Hepatitis B Virus-Mediated Changes to Normal Hepatocyte Gene Expression. PLoS Pathog, 12(2), p.e1005438.
Zhou, H., Manthey, J., Lioutikova, E., Yang, M.Q., Yang, W., Yoshigoe, K. and Wang, H., 2015, November. The upregulation of Myb and Peg3 may mediate EGCG inhibition effect on mouse lung adenocarcinoma. In Bioinformatics and Biomedicine (BIBM), 2015 IEEE International Conference on (pp. 1532-1535). IEEE.
Andres-Terre, M., McGuire, H.M., Pouliot, Y., Bongen, E., Sweeney, T.E., Tato, C.M. and Khatri, P., 2015. Integrated, Multi-cohort Analysis Identifies Conserved Transcriptional Signatures across Multiple Respiratory Viruses. Immunity, 43(6), pp.1199-1211.
Srivastava, A., Ritesh, K.C., Tsan, Y.C., Liao, R., Su, F., Cao, X., Hannibal, M.C., Keegan, C.E., Chinnaiyan, A.M., Martin, D.M. and Bielas, S.L., 2015. De novo Dominant ASXL3 Mutations Alter H2A Deubiquitination and Transcription in Bainbridge-Ropers Syndrome. Human molecular genetics, p.ddv499.
Lee, S.E., Son, G.W., Park, H.R., Jin, Y.H., Park, C.S. and Park, Y.S., 2015. Integrative analysis of miRNA and mRNA profiles in response to myricetin in human endothelial cells. BioChip Journal, 9(3), pp.239-246.
Sanford, T., Welty, C., Meng, M. and Porten, S., 2015. MP68-18 MOLECULAR ANALYSIS OF UROTHELIAL TUMORS IN PATIENTS WITH AND WITHOUT METASTASIS STRATIFIED BY T STAGE. The Journal of Urology, 193(4), p.e865.
Most existing pathway analysis methods focus on either the number of differentially expressed genes observed in a given pathway (enrichment analysis methods), or on the correlation between the pathway genes and the class of the samples (functional class scoring methods). Both approaches treat pathways as simple sets of genes, disregarding the complex gene interactions that these pathways are built to describe.
More recently, biological annotations have started to include descriptions of gene interactions in the form of gene signaling networks, such as KEGG (Ogata et al., 1999), BioCarta (www.biocarta.com) and Reactome (Joshi-Tope et al., 2005). This richer type of annotation has opened the possibility of an automatic analysis that aims to identify the gene signaling networks relevant in a given condition, and perhaps even the specific signals or signal perturbations involved. Treating pathways as simple gene sets ignores this information, so it is not well suited for a systems biology approach that aims to account for system-level dependencies and interactions, as well as to identify perturbations and modifications at the pathway or organism level (Stelling, 2004).
Advaita’s products are based on the Impact Analysis method, which leverages information about the type, function, and position of the genes in a given pathway and the interactions between them. Impact Analysis combines the evidence obtained from the classical enrichment analysis with a novel type of evidence, which measures the actual perturbation on a given pathway under a given condition. We illustrate the capabilities of the novel method on four real datasets. The results obtained on these data show that Impact Analysis has better specificity and sensitivity than several widely used pathway analysis methods.
Hi there. Advaita is dedicated to bringing you the most advanced, easiest-to-use Bioinformatics tools out there. And that includes educational materials designed to help you take advantage of all the powerful features we offer. Our last post about p-value correction factors was a bit confusing. This blog post explains how each method works, so you can decide when to use each one.
We are lucky to have a few bioinformaticians around the office, including Dr. Sorin Draghici, our CEO and founder. If you don’t have a bioinformatics expert in-house, you might want to pick up his book. It’s full of useful information, and I think the best part about it is how easy it is to read: he makes it fun! For now, if you want to know more about getting the most from your analyses, read on…
A p-value is the probability of observing, purely by random chance, a result at least as extreme as the one actually seen. For example, if 5 of the 100 differentially expressed (DE) genes in the dataset fall on pathway X, the over-enrichment p-value for pathway X is the probability that, of 100 genes selected at random from the dataset, 5 or more fall on pathway X. Significance is determined by setting a threshold, in many cases 0.05. If the p-value is less than 0.05, pathway X is considered significant because the chance of randomly observing a result at least this extreme is less than 5%.
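To make the pathway X example concrete, here is a minimal sketch of how such an over-enrichment p-value can be computed with a hypergeometric test. The function name and the dataset sizes (10,000 measured genes, 50 of them on pathway X) are assumptions made up for illustration; this is not necessarily how iPathwayGuide computes its p-values.

```python
from math import comb

def enrichment_p_value(total_genes, pathway_genes, de_genes, de_on_pathway):
    """Probability that de_on_pathway or more of de_genes randomly drawn genes
    (drawn without replacement) fall on a pathway containing pathway_genes genes."""
    upper = min(pathway_genes, de_genes)
    favorable = sum(
        comb(pathway_genes, k) * comb(total_genes - pathway_genes, de_genes - k)
        for k in range(de_on_pathway, upper + 1)
    )
    return favorable / comb(total_genes, de_genes)

# Hypothetical sizes: 10,000 measured genes, 50 of them on pathway X,
# 100 DE genes in the dataset, 5 of which fall on pathway X.
p = enrichment_p_value(total_genes=10_000, pathway_genes=50,
                       de_genes=100, de_on_pathway=5)
print(f"over-enrichment p-value for pathway X: {p:.2e}")
```

With these made-up sizes only about half a DE gene would be expected on pathway X by chance, so seeing 5 gives a p-value well below 0.05.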
This means that there is still a chance that the observation was in fact due to randomness and pathway X is not truly significant, which we would call a “false positive.” The chance of pathway X being a false positive is small, but when we perform this test multiple times, as we would for multiple pathways, the chance of reporting at least one false positive increases quickly. That is because every additional independent test adds its own chance of a false positive: for a handful of tests the overall chance is roughly the sum of the individual significance thresholds, and it approaches 100% as the number of tests grows. When this is done for hundreds of pathways, we are virtually guaranteed to have some pathways that appear to be significant just by chance. This is known as the “multiple comparisons problem,” and we tell you how to correct for it in the first section.
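A quick way to see how fast the problem grows is to compute the chance of at least one false positive directly. The sketch below assumes the tests are independent, that no pathway is truly significant, and that every test uses the same threshold; the function name is ours, chosen for illustration.

```python
def chance_of_any_false_positive(alpha, n_tests):
    """Probability of at least one false positive among n_tests independent
    tests when every null hypothesis is true and each test uses threshold alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests

for n_tests in (1, 10, 100, 300):
    chance = chance_of_any_false_positive(0.05, n_tests)
    print(f"{n_tests:>3} tests: {chance:.3f}")
```

At a 0.05 threshold the chance is already about 40% with 10 tests and above 99% with 100 tests.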
Enrichment tests are used in a number of settings, including pathway enrichment analysis and gene ontology (GO) enrichment analysis. However, the GO has an additional structure that includes a hierarchical organization of its terms, as well as a “true path rule” that allows genes to be associated with entire paths through the ontology, rather than single terms. Because of these additional properties, specific enrichment analysis methods (and associated multiple comparison strategies) have been developed for GO enrichment analysis. Two of these methods will be briefly discussed in the second section.
I. Methods of Correcting for Multiple Comparisons
General methods for multiple comparison corrections may be applied to any enrichment analysis. There are two strategies to limit the number of false positives across a large number of significance tests, and several methods have been developed for each strategy.
Strategy 1. Limit the probability of making a mistake (reporting a false positive) for each individual test
Strategy 2. Limit the rate of false positives, i.e. the proportion of false positive tests
In iPathwayGuide and iVariantGuide, we offer the most widely-cited method for each strategy. Furthermore, the methods we chose provide a range of stringency so that you can choose what is appropriate for your data. Try it out now!
Bonferroni
The Bonferroni correction is considered to be the most conservative method to correct for multiple comparisons, meaning that the fewest false positives are returned. The drawback is that some truly meaningful events may not be reported as significant. The correction multiplies each raw p-value by the number of tests performed (capping the result at 1), which is equivalent to dividing the significance threshold by the number of tests. This guarantees that the chance of reporting even one false positive across all the tests is no greater than the chosen significance threshold [3,4]. In other words, for a 5% significance threshold, the Bonferroni correction guarantees that the probability of generating at least one false positive is at most 5%. The more tests we run, the smaller the individual (raw) p-values must be for them to remain significant after the Bonferroni correction.
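As a small illustration, the Bonferroni correction can be sketched in a few lines. The raw p-values below are hypothetical values for four pathways, and the function name is ours.

```python
def bonferroni(p_values):
    """Multiply each raw p-value by the number of tests, capping the result at 1."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]

raw = [0.001, 0.012, 0.03, 0.2]   # hypothetical raw p-values for four pathways
adjusted = bonferroni(raw)
significant = [p < 0.05 for p in adjusted]
print(adjusted)
print(significant)
```

Here the pathway with a raw p-value of 0.03 would pass a 0.05 threshold on its own, but after correcting for four tests it no longer does.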
False Discovery Rate
In contrast to Bonferroni, FDR correction is one of the more lenient methods, allowing more true positives to be reported as significant with the drawback that some false positives may also be reported as such. Developed by Benjamini and Hochberg, FDR correction controls the expected proportion of false positives among the tests reported as significant, keeping it at or below the original significance threshold [5,6]. In other words, for a 5% significance threshold, we expect that on average no more than 5% of the tests reported as significant are false positives.
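The Benjamini-Hochberg step-up procedure can be sketched as follows. This is a minimal illustration under the usual assumptions of the method; the function name `benjamini_hochberg` and the raw p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return Benjamini-Hochberg adjusted p-values and significance flags."""
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never decrease as the raw p-values increase.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    significant = [p_adj <= alpha for p_adj in adjusted]
    return adjusted, significant


# Hypothetical raw p-values from five enrichment tests.
raw = [0.001, 0.008, 0.02, 0.03, 0.3]
adjusted, significant = benjamini_hochberg(raw)
for p, p_adj, sig in zip(raw, adjusted, significant):
    print(f"raw={p:.3f}  adjusted={p_adj:.4f}  significant={sig}")
```

For these five p-values, four tests remain significant at the 5% level, whereas a Bonferroni threshold of 0.05 / 5 = 0.01 would keep only the first two.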
II. Multiple Comparisons in GO enrichment analysis
Due to the true path rule, genes associated with a GO term are also associated with its parent terms (for more on this, see Chapter 22 of Dr. Draghici’s book [7]). This means that simply performing an enrichment analysis for each GO term will count each gene many times, which is a serious problem (see Draghici, Chapter 24). Furthermore, testing the enrichment of all GO terms is not necessary and, due to the unavoidable multiple comparison curse, will increase the number of false positives reported. Luckily, one can leverage the structure and additional properties of GO in order to limit the number of tests performed, and therefore the number of comparisons one must correct for. In 2006, Alexa et al. [8] proposed two methods to accomplish this: “Elim” and “Weight.”
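To see why the true path rule causes each gene to be counted many times, consider the following sketch. The three-term hierarchy is a simplified toy (the real ontology has intermediate terms), and the gene names and the helper functions `ancestors` and `propagate` are hypothetical.

```python
def ancestors(term, parents):
    """Collect every ancestor of a term by walking up the parent links."""
    found, stack = set(), list(parents.get(term, []))
    while stack:
        t = stack.pop()
        if t not in found:
            found.add(t)
            stack.extend(parents.get(t, []))
    return found


def propagate(direct, parents):
    """Apply the true path rule: a gene annotated to a term is also
    associated with all ancestors of that term."""
    full = {}
    for term, genes in direct.items():
        for t in {term} | ancestors(term, parents):
            full.setdefault(t, set()).update(genes)
    return full


# Toy hierarchy: child term -> list of parent terms.
parents = {
    "response to amphetamine": ["response to chemical"],
    "response to chemical": ["response to stimulus"],
}
# Direct annotations only (hypothetical genes).
direct = {
    "response to amphetamine": {"geneA", "geneB"},
    "response to chemical": {"geneC"},
}

full = propagate(direct, parents)
for term in sorted(full):
    print(term, sorted(full[term]))
```

After propagation, geneA and geneB appear under all three terms, so testing every term independently would use the same two genes three times.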
In iPathwayGuide and iVariantGuide we offer both methods, each of which follows the same outline:
1) Decouple GO terms from one another
2) Perform significance tests
3) Correct for multiple comparisons
The Elim method assesses the significance of GO terms starting with the most specific terms. When a term is found to be significant, its genes are removed from its ancestor terms before those ancestors are tested. The benefit of this approach is that it is easier to find specialized terms that are significant, e.g. “response to amphetamine” is more descriptive than “response to chemical.” This approach provides a very nice custom cut through the GO hierarchy that “magically” identifies the lowest level of abstraction that contains the significant GO terms in the given experiment.
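The bottom-up idea behind Elim can be sketched as below. This is a simplified illustration, not the published algorithm or the iPathwayGuide implementation: it uses a plain hypergeometric test, a two-term toy hierarchy, made-up gene identifiers, and an assumed cutoff `alpha=0.01`. All function names are hypothetical.

```python
from math import comb


def hypergeom_pvalue(k, K, n, N):
    """P(X >= k): chance of at least k term genes among n selected genes,
    when K of the N background genes belong to the term."""
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return tail / comb(N, n)


def ancestors(term, parents):
    """Collect every ancestor of a term by walking up the parent links."""
    found, stack = set(), list(parents.get(term, []))
    while stack:
        t = stack.pop()
        if t not in found:
            found.add(t)
            stack.extend(parents.get(t, []))
    return found


def elim_sketch(term_genes, parents, de_genes, background, alpha=0.01):
    """Test terms from most specific to most general; genes of a
    significant term are removed from all of its ancestors."""
    # A term always has more ancestors than any of its own ancestors,
    # so this ordering visits children before parents.
    order = sorted(term_genes, key=lambda t: len(ancestors(t, parents)), reverse=True)
    removed = {t: set() for t in term_genes}
    de = de_genes & background
    p_values = {}
    for term in order:
        genes = (term_genes[term] - removed[term]) & background
        p = hypergeom_pvalue(len(genes & de), len(genes), len(de), len(background))
        p_values[term] = p
        if p < alpha:
            for anc in ancestors(term, parents):
                if anc in removed:
                    removed[anc] |= genes
    return p_values


# Toy data: 100 measured genes, 10 of them differentially expressed.
background = {f"g{i}" for i in range(100)}
de_genes = {f"g{i}" for i in range(10)}
child = {f"g{i}" for i in (0, 1, 2, 3, 4, 5, 6, 7, 10, 11)}
parent = child | {f"g{i}" for i in range(12, 32)}  # true path rule applied
term_genes = {"response to amphetamine": child, "response to chemical": parent}
parents = {"response to amphetamine": ["response to chemical"]}

naive_parent = hypergeom_pvalue(len(parent & de_genes), len(parent), 10, 100)
p_values = elim_sketch(term_genes, parents, de_genes, background)
print("parent, tested on its own:", naive_parent)
print("after elimination:", p_values)
```

Tested on its own, the parent term looks significant only because it inherits the DE genes of its child. Once the child has been found significant and its genes are removed, the parent has no DE genes left and is no longer reported.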
Given a set of related GO terms, the Weight method is designed to identify the term that best represents the genes of interest, regardless of where the term falls in the hierarchy. This approach is less stringent than Elim, capturing more true positives with the drawback of including additional false positives.
iPathwayGuide and iVariantGuide are the only tools to provide these advanced correction factors to help you minimize false positives. Try them today for FREE and see what is truly significant in your data.
1. Khatri, P., Sirota, M., & Butte, A. J. (2012). Ten years of pathway analysis: current approaches and outstanding challenges. PLoS Comput Biol, 8(2), e1002375.
2. Rhee, S. Y., Wood, V., Dolinski, K., & Draghici, S. (2008). Use and misuse of the gene ontology annotations. Nature Reviews Genetics, 9(7), 509-515.
3. Dunn, O. J. (1959). Confidence intervals for the means of dependent, normally distributed variables. Journal of the American Statistical Association, 54(287), 613-621.
4. Dunn, O. J. (1961). Multiple comparisons among means. Journal of the American Statistical Association, 56(293), 52-64.
5. Benjamini, Y. & Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society. Series B (Methodological), 289-300.
6. Benjamini, Y. & Yekutieli, D. (2001). The control of the false discovery rate in multiple testing under dependency. Annals of Statistics, 1165-1188.
7. Drăghici, S. (2011). Statistics and data analysis for microarrays using R and Bioconductor. CRC Press.
8. Alexa, A., Rahnenführer, J., & Lengauer, T. (2006). Improved scoring of functional groups from gene expression data by decorrelating GO graph structure. Bioinformatics, 22(13), 1600-1607.
In the world of bioinformatics, we all need to be careful when analyzing our data. I receive countless questions about what background should be used when analyzing gene expression or protein expression data. In this blog post I will attempt to clarify this question.
We have all had the experience of getting a raffle ticket that says, “Must be present to win.” What the organizers are doing is establishing the pool of candidates from which to draw a winning ticket (and also trying to keep you there, but that’s beyond the scope of this discussion). Performing pathway analysis and other enrichment analyses is somewhat similar to that.
It is very intuitive that the size of the pool of candidates will dramatically affect the odds of winning. Using our raffle ticket example, let us say 1,000 tickets are given out and there will only be one winner. On the surface, we think our odds of winning are 1 in 1,000. But now let’s say the crowd of people that are actually present for the drawing is only 100 people. Because we “must be present to win,” our odds of winning are now actually 1 in 100. Furthermore, if the raffle organizers wanted to cheat, they could for instance have the raffle draw take place in a small room in which they invite only their friends and relatives. If this room hosts only 10 people at the time of the drawing, the odds of winning would now be 1 in 10. So the odds are really dependent on which background we choose. The same goes for pathway and other enrichment-based analyses.
Let us say you have 1,000 significant genes or proteins that were selected as differentially expressed (DE) in your condition. As I mentioned in my opening paragraph, the question becomes what background should be used when trying to understand what pathways or GO terms are significant. The p-values calculated during the analysis are just another way to tell you about the odds of a given pathway being significant just by chance. And, as we saw in the raffle experiment, the choice of the background can have a dramatic effect on the results (odds). Should we use all protein coding genes? How about all genes in NCBI or Ensembl?
The answer is we should always use the set of genes that were measured. This is akin to saying, “you must be present to win.” If the gene or protein was not measured, it should not be in the mix. So if you use an arbitrary set of genes for the background (e.g. all NCBI genes, or all Ensembl genes) your statistic will be heavily skewed. All enrichment programs that have you submit only DE genes or proteins do exactly this. Similarly, if you use only the set of DE genes as the background, and further select from there, you can also skew your results (this is like doing the drawing among your 10 friends and increasing your odds of success).
To exemplify this, I took the set of 1,172 significant genes (p < 0.05 and |log2FC| > 0.6) from a public dataset (GSE47363) and ran it through a simple enrichment analysis. In the first experiment, I used the set of genes that were measured as the background, about 20,000 genes. Then I ran the exact same set of DE genes, but this time I used “NCBI genes” as provided by another popular web-based pathway analysis application as the background (about 30,000 genes). See Figure 1 below.
Figure 1: Comparing the same set of DEGs, but with different backgrounds. On the left, we use the set of genes that were measured (20k). On the right we use 30k genes from NCBI as the background. Notice the dramatic difference in the number of significant pathways and the p-values.
While the top pathway is the same in both instances, you will notice little else is the same. In the first set of results, obtained with the appropriate background, we see a total of 64 significant pathways (FDR-corrected p-value < 0.05). In the second set of results, obtained with all NCBI genes as background, there are more than 150 significant pathways! Also, you will notice the p-values are much more significant when using NCBI as the background.
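A toy calculation shows why a larger background makes the p-values look more significant. The sketch below uses a plain hypergeometric test; the pathway size (100 genes) and the number of DE genes on it (12) are made-up numbers, and the pathway size is assumed to stay the same under both backgrounds for simplicity.

```python
from math import comb


def enrichment_pvalue(k, K, n, N):
    """P(X >= k): chance of at least k pathway genes among n DE genes,
    when K of the N background genes are on the pathway."""
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return tail / comb(N, n)


de_total = 1000        # DE genes submitted
pathway_size = 100     # hypothetical pathway
de_in_pathway = 12     # hypothetical overlap

# Same DE list and same pathway, two different backgrounds.
p_measured = enrichment_pvalue(de_in_pathway, pathway_size, de_total, 20000)
p_inflated = enrichment_pvalue(de_in_pathway, pathway_size, de_total, 30000)

print(f"background of 20,000 measured genes: p = {p_measured:.2e}")
print(f"background of 30,000 genes:          p = {p_inflated:.2e}")
```

With 20,000 genes in the background, about 5 of the 1,000 DE genes are expected on the pathway by chance; with 30,000, only about 3.3 are expected. The same 12 observed genes therefore look much more surprising against the larger background, even though nothing about the experiment has changed.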
You could say: “Well, but the first pathway is the same. So if a pathway is truly relevant, it will be on top no matter what the background is.” First, this is not true. The fact that the two sets of results have the same pathway at the very top is just a coincidence. Secondly, this is an incorrect way of thinking. The very purpose of the p-values is to provide us with the means to distinguish between pathways that may have some differentially expressed genes just by chance, and the pathways that may truly be involved in the phenotype.
All pathways with a p-value less than the significance threshold (e.g. 5%) should be carefully studied, not just the very top result, or the top three for that matter. If you have too many significant pathways and you cherry-pick only the ones that “look familiar”, your results will be severely biased.
A better way is to go back to the criteria you used to select your differentially expressed genes, use more stringent thresholds for p-values and/or fold changes, and redo your analysis. In most cases, using reasonable thresholds for your genes will give you a set of significant pathways that will actually offer you a good understanding of the underlying biological phenomenon. Assuming, of course, that you used a good pathway analysis method. But let us leave this for another posting.
To summarize, using the proper background set of genes or proteins can have a dramatic effect on the number of significant results and the number of false positives. You have to use the entire set of genes that were measured as the background when analyzing your data. Nothing more, nothing less! This is not a recommendation or a piece of advice. This is a must in order to ensure the scientific validity of your findings. This is why in iPathwayGuide, we ask you to submit your entire list of genes. If you ever use an application that only requires you to submit the significant genes, ask yourself, “What is the background being considered?”
For more on this topic, you can read:
Chapter 24 in Statistics and Data Analysis for Microarrays Using R and Bioconductor, Second Edition (Chapman & Hall/CRC Mathematical and Computational Biology)
The Advaita Bioinformatics team is built from a broad range of expertise and backgrounds. We pride ourselves in having the most capable team to deliver the best tools and knowledge for our customers.
Dr. Draghici currently holds the Robert J. Sokol MD Endowed Chair in Systems Biology in the Department of Obstetrics and Gynecology, and is a professor in the Department of Computer Science, as well as the head of the Intelligent Systems and Bioinformatics Laboratory at Wayne State University. He is also the chief of the System Biology Section in the Perinatology Research Branch of the National Institute for Child Health and Development. A senior member of IEEE, Dr. Draghici is an editor of IEEE/ACM Transactions on Computational Biology and Bioinformatics, Journal of Biomedicine and Biotechnology, and International Journal of Functional Informatics and Personalized Medicine. His publications include two books (“Data Analysis Tools for DNA Microarrays” and “Statistics and Data Analysis for Microarrays Using R and Bioconductor”, published by CRC Press in 2003 and 2012, respectively), 8 book chapters, and over 150 peer-reviewed journal and conference publications, which have gathered over 5,000 citations to date.
Dr. McEachin joins Advaita Bioinformatics in the dual role of Senior Scientist and COO. His efforts are devoted to leadership of the team of Advaita professionals, continued development of Advaita products, customer service, and day-to-day management of operations. He has more than 25 years of data analysis experience, including 15 years in bioinformatics and 10 years in operations research. Prior to joining Advaita, Dr. McEachin served for 4 years as Managing Director of the University of Michigan’s bioinformatics core. His education includes a PhD in Human Genetics and an MS in Biostatistics, both from the University of Michigan. In addition to his work at Advaita, Dr. McEachin serves as an Adjunct Assistant Professor of Biostatistics at the University of Michigan. He also volunteers for the Galactosemia Foundation and Buddy-to-Buddy.
Dr. Vanciu’s research interests are in system security, software maintenance and program comprehension. As part of his Ph.D. work at Wayne State University, he designed and developed an approach for finding security vulnerabilities in applications written in Java-like code, with an emphasis on Android applications. The approach focuses on finding vulnerabilities that are architectural flaws, such as misuse of cryptography. These are more difficult to find with existing fully automated approaches, which identify only localized coding bugs such as hard-coded passwords and unsafe system calls. Since joining the Advaita team, Dr. Vanciu has applied his expertise to improve the development process, lower the cost of code maintenance, and ensure and improve the quality of the web applications that support the analysis of gene-expression data. Such web applications need to be secure to ensure data confidentiality, and highly flexible in order to quickly and safely adapt to the continuous enhancements required by our customers.
As Chief Science Officer, Dr. Ziraldo leads product development and technical support for Advaita’s state-of-the-art bioinformatics platform. She is passionate about producing software that puts life scientists in the driver’s seat of their data analysis, enabling them to leverage the domain expertise that they uniquely possess instead of relying solely on bioinformaticians. Before joining Advaita, Dr. Ziraldo completed a postdoctoral fellowship at the University of Michigan, working collaboratively between the departments of Chemical Engineering and Microbiology & Immunology. She received her PhD in Computational Biology in 2013 from a joint program between Carnegie Mellon and the University of Pittsburgh.
Advaita is seeking a professional sales representative. The company offers attractive compensation and benefits. Please send resume to Dr. Rich McEachin firstname.lastname@example.org
Advaita’s goal is to enable meaningful multi-omic analysis for researchers and industry users. We collaborate with organizations with similar goals, opportunities, and complementary capabilities.