Women of the African Diaspora experience a disproportionate burden of aggressive, early-onset breast cancer that lacks expression of the estrogen receptor (ER), progesterone receptor (PR) and HER-2/neu. This burden is higher than in women of all other races, for reasons that remain unknown and, importantly, understudied. The higher frequency of early-onset and aggressive breast cancer in women of African ancestry is at least in part due to differences in the distribution of genetic risk factors. The advent of massively parallel sequencing, often referred to as next-generation sequencing (NGS), has brought an unprecedented opportunity to screen multiple genes simultaneously.
BROCA, a cancer risk panel that captures genomic regions including BRCA1, BRCA2, TP53, ATM, BARD1, CHEK2, PALB2, PTEN and CDH1, among others, was developed for the evaluation of patients with a suspected hereditary cancer predisposition. Because the most effective strategy to examine the contribution of inherited predisposition to breast cancer is systematic screening for pathogenic variants in breast cancer cases as compared to controls, the Olopade Lab aims to conduct BROCA sequencing on 1,000 breast cancer patients and 1,000 controls ascertained in Ibadan, southwestern Nigeria. This will allow us to estimate the true burden of disease caused by inherited mutations in a large cohort of women of African ancestry. We have been building a variant calling and filtration system on Globus Genomics (globus.org/genomics), which allows biomedical researchers to analyze large NGS datasets quickly and in a fully automated manner once the required analytic tools are installed. Currently, we employ four variant callers, GATK v2.8 HaplotypeCaller and UnifiedGenotyper, Atlas2, and FreeBayes, to call both single nucleotide variants (SNVs) and insertions and deletions (indels). SNVs and indels called with high concordance among the four callers can be filtered by a Consensus Genotyper, which combines the output of multiple variant calling algorithms to improve the performance of the classifier. Identified variants can be annotated using multiple annotation databases in ANNOVAR for further analyses. To date, we have successfully sequenced 400 cases and 400 controls, and the variants are being processed on the pipeline. We will present the mutation spectrum and other findings at the ASHG 2014 conference.
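The idea of concordance among callers can be illustrated with a minimal sketch. This is not the Consensus Genotyper itself, which combines caller outputs through a classifier; it is a plain vote count, assuming each caller's output has already been reduced to a set of (chromosome, position, reference allele, alternate allele) tuples. The function name `concordant_variants`, the `min_callers` threshold, and the toy coordinates are hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

# A variant is identified here by (chrom, pos, ref, alt); this key is an assumption.
Variant = Tuple[str, int, str, str]


def concordant_variants(
    calls: Dict[str, Set[Variant]], min_callers: int
) -> List[Variant]:
    """Return variants reported by at least `min_callers` of the callers."""
    # Each caller contributes at most one vote per variant because its calls are a set.
    counts = Counter(v for variants in calls.values() for v in variants)
    return sorted(v for v, n in counts.items() if n >= min_callers)


# Toy call sets with made-up coordinates, one per caller named in the text.
calls = {
    "HaplotypeCaller": {("chr17", 100, "A", "G"), ("chr13", 200, "C", "T")},
    "UnifiedGenotyper": {("chr17", 100, "A", "G"), ("chr13", 200, "C", "T")},
    "Atlas2": {("chr17", 100, "A", "G")},
    "FreeBayes": {
        ("chr17", 100, "A", "G"),
        ("chr13", 200, "C", "T"),
        ("chr17", 300, "G", "GA"),
    },
}

# Keep variants supported by at least three of the four callers.
consensus = concordant_variants(calls, min_callers=3)
print(consensus)
```

Raising `min_callers` to 4 would keep only variants on which all four callers agree; a lower threshold trades specificity for sensitivity.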