Yes, it most certainly can @claire1011. The GATK4 CNV workflows work efficiently on both exomes and WGS data.
Hi @claire1011, you will need the target capture regions from your WES kit provider. This will likely be a BED file, which you then convert to a Picard-style interval list with BedToIntervalList (Picard). It is important that you restrict your analysis to regions where you expect coverage. Please see the tutorial at https://software.broadinstitute.org/gatk/documentation/article?id=11682 to get started.
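For reference, here is a minimal sketch of the BED conversion, assuming you run the Picard tool through the gatk wrapper. The function name and all file paths are hypothetical; the sequence dictionary must be the one for the reference your reads were aligned to.

```shell
# Sketch only: wraps the BED -> interval list conversion in a function.
# bed_to_interval_list is a made-up helper name, not part of GATK.
bed_to_interval_list() {
    local bed="$1" dict="$2" out="$3"
    gatk BedToIntervalList \
        -I "$bed" \
        -O "$out" \
        -SD "$dict"
}

# Example call with placeholder paths:
# bed_to_interval_list kit_targets.bed reference.dict kit_targets.interval_list
```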
@claire1011, it is possible to run ModelSegments CNV with a single-sample PoN, e.g. a PoN consisting of a single matched normal. To create the single-sample PoN, provide the normal's coverage data to CreateReadCountPanelOfNormals like so:
gatk CreateReadCountPanelOfNormals \
-I sandbox/normal.counts.tsv \
You may want to adjust some parameters, e.g. --minimum-interval-median-percentile to a lower number like 5.0 instead of the default 10.0 to retain more data points given the samples are from the same individual.
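Putting the two points above together, a single-sample PoN command with the lowered percentile might look roughly like the sketch below. The function name and the output path are my own placeholders, and 5.0 is only the example value mentioned above, so please adjust to your data.

```shell
# Sketch only: build a PoN from one matched normal.
# make_single_sample_pon is a hypothetical helper name.
make_single_sample_pon() {
    local normal_counts="$1" pon_out="$2" percentile="${3:-5.0}"
    gatk CreateReadCountPanelOfNormals \
        -I "$normal_counts" \
        --minimum-interval-median-percentile "$percentile" \
        -O "$pon_out"
}

# Example call; the output path is a placeholder:
# make_single_sample_pon sandbox/normal.counts.tsv sandbox/normal.pon.hdf5
```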
Note that a multi-sample PoN using batch-matched normals may give cleaner results and you may want to investigate a variety of denoising options. With a multi-sample PoN, you can denoise your tumor and normal samples independently, and confirm your normals are indeed normal as in 4B and 4D. In this case, you will have to procure normal samples that use the same capture kit as your samples. We recommend a minimum of 10-40 normal samples.
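As a rough sketch of the multi-sample route, assuming you already have per-sample counts files from the same capture kit: build one PoN from all the normals, then denoise each tumor or normal sample against it. The helper names and file names below are hypothetical, and I have left every filtering parameter at its default.

```shell
# Sketch only: multi-sample PoN creation, then per-sample denoising.
# Both function names are made up for illustration.
make_multi_sample_pon() {
    local pon_out="$1"; shift
    local counts
    # Turn "n1 n2 ..." into "-I n1 -I n2 ..." without bash arrays.
    for counts in "$@"; do
        set -- "$@" -I "$counts"
        shift
    done
    gatk CreateReadCountPanelOfNormals "$@" -O "$pon_out"
}

denoise_sample() {
    local counts="$1" pon="$2" prefix="$3"
    gatk DenoiseReadCounts \
        -I "$counts" \
        --count-panel-of-normals "$pon" \
        --standardized-copy-ratios "${prefix}.standardizedCR.tsv" \
        --denoised-copy-ratios "${prefix}.denoisedCR.tsv"
}

# Example calls with placeholder paths:
# make_multi_sample_pon cnv.pon.hdf5 normal1.counts.hdf5 normal2.counts.hdf5
# denoise_sample tumor1.counts.hdf5 cnv.pon.hdf5 tumor1
```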
If all seven of your samples are from the same patient and assuming these samples are batch-matched, then you could create the matched-normal PoN with your two control samples.
As for comparing your multiple groups, you can use the provided per-sample plotting tools and compare the plots side by side and/or take your data to IGV for multi-sample visualization. I have not tried visualizing these myself, but a new feature of the ModelSegments CNV workflow, implemented in recent releases, is the output of IGV-compatible SEG files. This format should be sufficient for exome results.
@bhanuGandham asked that I follow up on your question.
If, as you said earlier, you need to confirm that your normal samples are indeed normal, then, as I've stated previously, the best approach is to create a multi-sample PoN. For experimental mouse models, because strains are fairly distinct from each other and pretty much identical within a strain, and because the PoN is meant to capture systematic noise, I think you can use either biological or technical replicates of RMS strain samples towards the PoN.
If you lack access to additional RMS normals, then the next best thing is to use each normal sample as the PoN for the other, e.g. RMS1 against an RMS2-PoN and RMS2 against an RMS1-PoN. If each normal confirms as normal, then you can pool these together for a 2-sample PoN to use with the tumor samples. Please do not include the sample of interest--the case sample--in the analysis PoN. Hopefully, each normal sample confirms as normal as expected and any differences you observe are either parental germline differences (I suppose unlikely for cloned mice) or part of the noise that we wish to capture. Definitely worth sussing out the normal samples to rule out any sample swaps between normal and tumor samples. Sample swaps happen more often than we like to think.
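The cross-check of the two normals could be scripted along these lines. This is only a sketch: the function name, the output directory layout and the file names are all my own placeholders, and the PoN creation parameters are left at their defaults.

```shell
# Sketch only: denoise each normal against a PoN made from the other.
# cross_check_normals is a hypothetical helper name.
cross_check_normals() {
    local n1="$1" n2="$2" outdir="$3"

    # PoN from normal 1, used to denoise normal 2.
    gatk CreateReadCountPanelOfNormals -I "$n1" -O "$outdir/n1.pon.hdf5" || return 1
    gatk DenoiseReadCounts -I "$n2" \
        --count-panel-of-normals "$outdir/n1.pon.hdf5" \
        --standardized-copy-ratios "$outdir/n2_vs_n1.standardizedCR.tsv" \
        --denoised-copy-ratios "$outdir/n2_vs_n1.denoisedCR.tsv" || return 1

    # PoN from normal 2, used to denoise normal 1.
    gatk CreateReadCountPanelOfNormals -I "$n2" -O "$outdir/n2.pon.hdf5" || return 1
    gatk DenoiseReadCounts -I "$n1" \
        --count-panel-of-normals "$outdir/n2.pon.hdf5" \
        --standardized-copy-ratios "$outdir/n1_vs_n2.standardizedCR.tsv" \
        --denoised-copy-ratios "$outdir/n1_vs_n2.denoisedCR.tsv"
}

# Example call with placeholder paths:
# cross_check_normals RMS1.counts.hdf5 RMS2.counts.hdf5 crosscheck
```

You would then plot and inspect both denoised outputs before pooling the two normals into a 2-sample PoN.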
If each tumor sample presents CNVs that are distinct from those of the other tumor samples, then there is a third option: use the two normals and four of the five tumor samples (six samples in total) in a PoN to analyze the remaining fifth tumor sample. PoN creation includes a number of filtering steps that remove outlier data, so the PoN would effectively omit such rare events. You will have to double-check the PoN creation filtering parameters to make sure they are appropriate for the six samples. If your sample set is amenable to such a round-robin approach, your denoising should benefit from the larger panel. You should carefully consider each tumor sample for appropriateness in this approach. Towards sussing out coverage extremes in samples, I can recommend FilterIntervals, which is a tool I just happened to study carefully yesterday towards gCNV tutorial writing.
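To make the round-robin idea concrete, here is a minimal sketch that builds a PoN from every sample except the one being analyzed. The function name and file names are hypothetical, and the sketch leaves the PoN filtering parameters at their defaults, which you would still need to review for a six-sample panel.

```shell
# Sketch only: leave-one-out PoN for the round-robin approach.
# leave_one_out_pon is a hypothetical helper name.
leave_one_out_pon() {
    local held_out="$1" pon_out="$2"; shift 2
    local counts
    # Rebuild the argument list as "-I file" pairs, skipping the held-out sample.
    for counts in "$@"; do
        if [ "$counts" != "$held_out" ]; then
            set -- "$@" -I "$counts"
        fi
        shift
    done
    # Refuse to run if no samples are left for the panel.
    [ "$#" -gt 0 ] || return 1
    gatk CreateReadCountPanelOfNormals "$@" -O "$pon_out"
}

# Example call with placeholder paths: analyze tumor5 against the other six samples.
# leave_one_out_pon tumor5.hdf5 no_tumor5.pon.hdf5 \
#     normal1.hdf5 normal2.hdf5 tumor1.hdf5 tumor2.hdf5 tumor3.hdf5 tumor4.hdf5 tumor5.hdf5
```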
I hope you get interesting results with your tumor samples!