Big data startups that are rewriting the rule books of drug discovery

Big data startups are using analytics to cure cancer and ageing


The entry of big data startups into drug discovery was a critical and much-needed intervention. Thankfully, a number of such startups have emerged over the past couple of years. Most are led by eminent members of the scientific community and focus on using proprietary technology to accelerate drug discovery. Through innovative business models, they are not only helping other pharmaceutical players speed up drug discovery but are also investing much-needed funds and effort in largely neglected ailments. These ailments were neglected because R&D on them is generally considered capital-intensive, with no certainty of a favourable outcome. Even the big pharmaceutical companies, which answer to shareholders periodically for consistent returns, were reluctant to stay committed over a longer horizon. In today’s article, I look at a few startups that are rewriting the rule books of drug discovery.

Berg Pharmaceuticals
Framingham, MA-based Berg Pharma is a drug discovery business that has been in the news of late. Employing more than 200 people, Berg houses a drug discovery platform and runs programs in cancers, diabetes, arthritis and Parkinson’s disease. It was recently reported that Berg has entered into partnerships with several R&D institutions and labs, including Beth Israel Deaconess Medical Center, Harvard Medical School, and the Pancreatic Cancer Research Team. The collaborations are meant to develop the first-ever biomarker for pancreatic cancer. Just a couple of weeks ago, Berg presented clinical research from trials of its cancer drug BPM 31510, which is in phase 1b testing. The company is marketing it as one of the first cancer drugs whose development was guided by artificial intelligence. Using a framework developed in-house, the Berg Interrogative Biology platform, the startup claims to study molecular entries from diverse and unparalleled data sets, including genomics, proteomics, lipidomics, transcriptomics, and metabolomics, which are then subjected to machine-learning processing. Berg is funded by billionaire Carl Berg and has ambitious plans to cut the drug development timeline in half. With pharma companies focused on return-driven drug discovery investments, Berg is carving out a niche by building a standalone, parallel drug discovery ecosystem.

Numedii
Numedii is a Stanford University spinoff that came into the limelight in 2013 when it raised $3.5 million in Series A funding. Unlike Berg, Numedii hasn’t narrowed its focus to a few disease types. Using big data technology developed in-house, it runs predictive analyses of drug effectiveness on life sciences data, correlating disease information with drug data, and then translates the results into novel therapeutic candidates. So far, Numedii has forged two external partnerships. In 2012, it signed a product development deal with Aptalis Pharma, a developer of therapies for cystic fibrosis and gastrointestinal disorders. In 2013, it partnered with Thomson Reuters, intending to combine the latter’s data expertise in systems biology with its own technology and database to find FDA-approved drugs or discontinued development compounds suitable for repurposing. So far, we haven’t heard any product development news from Numedii.

Insilico Medicine
Insilico Medicine is a Baltimore, MD-based data informatics startup, founded in 2014, that combines graphics processing unit (GPU) technology with big data analytics for in silico drug discovery and drug repurposing, targeting aging and age-related diseases. So far, it has developed four products: OncoFinder, a drug discovery tool and personalized medicine platform; GeroScope, a drug repurposing and discovery platform for aging and age-related diseases; PharmAtlas, a comprehensive drug and toxicity database; and Pathway Cloud Intelligence, the company’s knowledge management system. It also provides customized services to pharmaceutical companies. It employs 33 bio-scientists and, in February this year, raised $800k as part of its planned external funding.

Calico
Another notable startup in the ageing-related domain is the San Francisco-based Calico (California Life Company), founded in 2013. The Google-funded company was introduced to the world as one meant “for curing death”. Focusing on neurodegeneration and cancer, it has forged strong R&D partnerships with AbbVie, University of Texas Southwestern Medical Center, 2M Companies, the Broad Institute of MIT and Harvard, the Buck Institute for Research on Aging, UCSF (Peter Walter Lab) and QB3. Calico has licensed a class of experimental drug compounds, the P7C3 analogues, which are reported to enhance the activity of NAMPT, an enzyme that plays a role in NAD biosynthesis and is relevant to neurodegeneration.

Cyclica
Cyclica is an Ontario, Canada-based startup, founded in 2010, that applies big data to accelerate in silico drug discovery. Using proprietary algorithms, it assesses interactions between a drug and all known proteins, providing deeper insight into the drug’s side-effect, toxicity, and therapeutic profiles. So far, it has raised $2.3 million in seed financing and has reportedly acquired royalty rights for two drug candidates. It has also built strategic partnerships with IBM (for access to its Blue Gene supercomputer) and the Yale Center for Molecular Discovery. According to reports, the firm plans to raise $5-8 million in Series A financing this year, and setting up a US-based office is also on its radar.

Data2Discovery
A 2012 academic spin-off from the Indiana University (IU) School of Informatics and Computing, Data2Discovery is planning to commercialize what it calls semantic link association prediction, or SEMAP, whose purpose is to find associations between drugs and gene targets with the help of semantics. To address this problem, Data2Discovery has created a comprehensive, integrated dataset that combines public and private information and makes it searchable in plain language. With the help of the Indiana Clinical and Translational Sciences Institute, the company is already working on a map of the molecular connections of type 2 diabetes to better predict what effects the disease will have.
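To give a rough feel for what "finding associations with the help of semantics" can mean, here is a minimal Python sketch of path-based link scoring on a tiny knowledge graph. It is not Data2Discovery's method: the triples, the entity names (`drug:A`, `gene:X` and so on) and the scoring rule (shorter connecting paths count for more) are all assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical toy knowledge graph as (subject, relation, object) triples.
# Every entity and relation here is invented for illustration only.
TRIPLES = [
    ("drug:A", "binds", "gene:X"),
    ("drug:A", "treats", "disease:D"),
    ("gene:Y", "associated_with", "disease:D"),
    ("drug:B", "treats", "disease:D"),
    ("gene:X", "participates_in", "pathway:P"),
    ("gene:Y", "participates_in", "pathway:P"),
]


def build_graph(triples):
    """Turn triples into an undirected adjacency map (relation labels ignored)."""
    graph = defaultdict(set)
    for subject, _relation, obj in triples:
        graph[subject].add(obj)
        graph[obj].add(subject)
    return graph


def simple_paths(graph, start, goal, max_hops):
    """Yield every path from start to goal that repeats no node
    and uses at most max_hops edges."""
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == goal:
            yield path
            continue
        if len(path) - 1 == max_hops:
            continue  # hop budget used up without reaching the goal
        for neighbour in graph[node]:
            if neighbour not in path:
                stack.append((neighbour, path + [neighbour]))


def association_score(graph, drug, gene, max_hops=3):
    """Assumed scoring rule: each connecting path contributes 1 / (number of hops).
    Expects drug and gene to be different nodes."""
    return sum(1.0 / (len(path) - 1)
               for path in simple_paths(graph, drug, gene, max_hops))


graph = build_graph(TRIPLES)
for drug in ("drug:A", "drug:B"):
    for gene in ("gene:X", "gene:Y"):
        print(drug, gene, round(association_score(graph, drug, gene), 3))
```

In this toy graph, `drug:A` has no direct link to `gene:Y`, yet it still gets a non-zero score because the two are connected through a shared disease and a shared pathway. A real system would weight relation types and test scores statistically rather than simply counting paths.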

Cypher Genomics
Cypher Genomics is a San Diego-based genome informatics startup, founded in 2011, working in the rapidly advancing field of genome sequencing technologies. In 2011, it struck a collaboration with the Scripps Translational Science Institute (STSI) that gave it access to the Wellderly database for use in clinical trials. Its flagship offering is its biomarker discovery service, for which it signed a partnership deal with Illumina last year. The deal gives Cypher access to Illumina’s large sales force and will help it offer its genome analysis platform to pharmaceutical companies. This year, it found a partner in Celgene to accelerate the latter’s drug discovery plans. It employs approximately 10 people.

Other startups we came across during our research include Knome, SolveBio, Appistry, LifeCode, Ingenuity, DNAnexus and Curoverse. Google-backed personal genetics startup 23andMe also entered this business last year.

Anubhav, a data scientist, writes about new developments and future trends in the machine learning and data analytics domain.
