ApacheCon NA 2013

Portland, Oregon

February 26th – 28th, 2013

Wednesday 2:30 p.m.–3:30 p.m.

Searching for cancer biomarkers with Apache OODT

Rishi Verma

Apache in Science

This talk explores the role of Apache Object Oriented Data Technology (OODT) in advancing cancer research for the Early Detection Research Network (EDRN). In particular, OODT is helping the EDRN manage the diverse analytical software used to process biomedical data. The talk introduces the science behind that process and examines how OODT is helping advance the network’s scientific goals.


Apache Object Oriented Data Technology (OODT) is a software framework that addresses the data management needs of increasingly data-intensive scientific projects. It is applied in domains as diverse as planetary data, earth science, spacecraft data archiving for NASA, and biomedicine. This talk investigates an exciting new direction within OODT’s biomedical domain, where OODT’s data processing technologies are supporting the automated discovery of cancer biomarkers for the National Cancer Institute’s Early Detection Research Network (EDRN). Using components like OODT Workflow Manager and Process Generation Executives (PGEs), the EDRN can manage very dissimilar analytical "software pipelines" for identifying cancer biomarkers and run them on identical experimental files. In other words, OODT is making it easier for researchers to apply multiple analysis techniques to the same cancer-specific biomedical data, so that reliable biomarkers can be identified more accurately.

The talk will begin with an introduction to the science of cancer biomarker discovery. It will then examine OODT’s data processing technologies in depth, including OODT Workflow Manager, PGEs, and OODT Resource Manager. Next, it will present a case study of an actual cancer biomarker analysis pipeline being designed for the EDRN. Finally, the talk will conclude with a recap of how OODT’s data processing technologies can be adopted quickly in new projects, and where this work is headed in the future.