Implementation of GATK4 for variant calling in WES of human cancer samples without reference normals
Dear GATK community,
I would like to ask a very specific question about using the GATK toolkit for exome sequencing data. In detail, for 3 patients I have whole exome sequencing data (genomic DNA captured using Agilent in-solution enrichment methodology, paired-end 75-base massively parallel sequencing on Illumina HiSeq 4000) from CTCs (circulating tumor cells), and also exome sequencing data from biopsies of the same patients. Both the biopsies and the circulating tumor cells were collected at the same timepoint, at diagnosis, when the tumor had already spread due to its "specific nature", so the material is not definitely primary tumor in either case. I have both FASTQ files and BAM files for each patient.
The main goal is to identify whether there are any "common mutational patterns" (i.e. SNPs) between the circulating tumor cells and the biopsies within the same patient, which would be vital mainly for validating the CTC isolation protocol (and also for the crucial time of diagnosis of this specific cancer, the relevant biological mechanisms, etc.). However, a major issue is that there is no reference normal tissue, which probably limits the identification of somatic variants, and the number of patients is small (6 cancer samples in total).
In your opinion, could I still use GATK for germline SNP/indel analysis and try to focus on "rare" germline variants, and perhaps on any common variants of this type in specific genes that are shared by both types of biological material? I would be grateful for any other ideas or suggestions.
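To make the comparison concrete, here is a minimal sketch of the kind of per-patient overlap I have in mind. It assumes each sample has already been called into a plain-text VCF; the function names (variant_keys, shared_fraction) and the toy records are hypothetical and only for illustration, not part of GATK.

```python
def variant_keys(vcf_text):
    """Return the set of (chrom, pos, ref, alt) keys found in plain VCF text."""
    keys = set()
    for line in vcf_text.splitlines():
        # Skip blank lines and header lines (## meta lines and the #CHROM line).
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        # Multi-allelic sites list several ALT alleles separated by commas.
        for alt in alts.split(","):
            if alt not in (".", "*"):
                keys.add((chrom, pos, ref, alt))
    return keys


def shared_fraction(ctc_keys, biopsy_keys):
    """Jaccard index: shared variants divided by all variants seen in either sample."""
    union = ctc_keys | biopsy_keys
    return len(ctc_keys & biopsy_keys) / len(union) if union else 0.0


# Toy records for one hypothetical patient (not real data).
HEADER = "##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
ctc_vcf = HEADER + "\n".join([
    "chr1\t100\t.\tA\tG\t50\tPASS\t.",
    "chr2\t200\t.\tC\tT\t50\tPASS\t.",
    "chr3\t300\t.\tG\tA,C\t50\tPASS\t.",
])
biopsy_vcf = HEADER + "\n".join([
    "chr1\t100\t.\tA\tG\t50\tPASS\t.",
    "chr3\t300\t.\tG\tC\t50\tPASS\t.",
    "chr4\t400\t.\tT\tTA\t50\tPASS\t.",
])

ctc = variant_keys(ctc_vcf)
biopsy = variant_keys(biopsy_vcf)
shared = ctc & biopsy

print(sorted(shared))
print(shared_fraction(ctc, biopsy))
```

With real VCFs, the records would presumably need to be normalized first (indels left-aligned, multi-allelic sites split) before the keys can be compared this way.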
Please excuse any naive questions on this matter, as this is my first time analyzing WES data!
Thank you in advance,