Illumina's BaseSpace is a cloud-based Next Generation Sequencing (NGS) data analysis platform. It is directly integrated with HiSeq and MiSeq sequencing instruments, so that users can analyze the data produced by these machines using the various software Apps hosted on the site. To make software available in BaseSpace, developers need to program it to interact with BaseSpace features (APIs), such as user-level authentication, in-flight data encryption and other routines, before data can be accessed for analysis. Additionally, custom workflows need to be created around each App to streamline data access from Illumina's sequencing platforms, so that application programs can analyze and interpret the data according to user needs (Figure 1).

Figure 1. Data from Illumina HiSeq and MiSeq machines are stored in Illumina BaseSpace. Software developers host their Apps on Illumina BaseSpace, where users can analyze their sequencing data with the data analysis Apps of their choice. Each App interacts with the Illumina BaseSpace framework (thin colored arrows) to access data from BaseSpace (thick blue arrows).

Next generation sequencing produces large amounts of data; for example, nearly 1 TB of data is generated from each HiSeq run. The raw data is deconvoluted and analyzed in great depth to detect disease-specific variations against a background of common variations present in any human genome. As an example, ~3 million common variants (polymorphisms) are present in any human genome, of which a few thousand may be specific to any given disease. Discovering disease-specific variants is the beginning of a long road to discovering disease-causing variants (for example, driver mutations in cancer or actionable alterations for therapy), which are the keys to unlocking the molecular architecture of any genetic disorder. NGS has revolutionized the application of genomics to the diagnosis, treatment and management of human diseases by identifying disease-causing alterations in the human genome.

To bridge the gap between clinical diagnosis and disease management from tumor sequencing data, SciGenom has built a proprietary database of cancer-specific variants by curating papers published in peer-reviewed journals; the database is updated regularly with newly published information. Using the database as a repository of knowledge, a suite of software tools and an advanced user interface were created to interpret and visualize the data (Figure 2). The resulting package, OncoMD, allows users to combine knowledge of protein function and cancer-related pathways with approved drugs and open clinical trials to make sense of cancer-specific mutations and generate a clinical report. Built into this package are powerful search algorithms that allow users to run various queries to extract information from the database. The most powerful of these, and the most relevant to Illumina's BaseSpace, is the ability to upload a list of cancer-specific variants as a Variant Call Format (VCF) file produced by Illumina's HiSeq and MiSeq platforms and obtain relevant biological information for each variant. A summary report linking each cancer mutation to approved drugs and open clinical trials is also provided for immediate action. The OncoMD App on BaseSpace is a hosted solution for making sense of tumor sequencing data.

Figure 2. OncoMD is a curated database of somatic and germline mutations with advanced data visualization software that allows users to interpret tumor sequencing data.
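
The VCF-based lookup described above can be illustrated with a minimal Python sketch. It is not OncoMD's implementation: the database is proprietary and its schema is not described here, so the CURATED dictionary, the Annotation fields, and every gene, drug and trial name below are hypothetical placeholders. The sketch only assumes the standard tab-separated VCF columns (CHROM, POS, ID, REF, ALT) and shows how variants read from a VCF might be matched against a curated table to build a summary.

```python
from typing import Dict, Iterable, Iterator, List, NamedTuple, Tuple

# A variant is identified here by (chromosome, position, reference, alternate).
VariantKey = Tuple[str, int, str, str]


class Annotation(NamedTuple):
    """Hypothetical record; the real database schema is not public."""
    gene: str
    drugs: List[str]
    trials: List[str]


# Toy stand-in for a curated variant database (placeholder values only).
CURATED: Dict[VariantKey, Annotation] = {
    ("chr1", 1000, "A", "T"): Annotation("GENE_A", ["DRUG_X"], ["TRIAL_001"]),
}


def parse_vcf(lines: Iterable[str]) -> Iterator[VariantKey]:
    """Yield one key per alternate allele from VCF text lines."""
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines, "##" metadata lines and the "#CHROM" column header.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 5:
            continue  # not a complete variant record
        chrom, pos, _id, ref, alt = fields[:5]
        # ALT may hold several comma-separated alleles; "." means none.
        for allele in alt.split(","):
            if allele != ".":
                yield (chrom, int(pos), ref, allele)


def annotate(
    lines: Iterable[str],
    curated: Dict[VariantKey, Annotation] = CURATED,
) -> List[Tuple[VariantKey, Annotation]]:
    """Return the variants found in the curated table, with their annotations."""
    report = []
    for key in parse_vcf(lines):
        hit = curated.get(key)
        if hit is not None:
            report.append((key, hit))
    return report
```

In practice the lines would come from a file, for example `annotate(open("sample.vcf"))`, where the file name is again only an illustration. A production tool would also need to handle compressed input, variant normalization and differences between genome builds, none of which this sketch attempts.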

