Geraldine_VdAuwera admin

Folks, I'm sorry but I can't respond to questions on this page. Please post your questions in the public forum.

About

Username
Geraldine_VdAuwera
Location
Cambridge, MA
Joined
Visits
5,296
Last Active
Roles
Member, Administrator, Broadie
Points
7,431
Badges
25
Twitter
@gatk_dev
Full Name
Geraldine Van der Auwera
Posts
11,728

Activity

  • Yogesh

    I have whole-genome resequencing Illumina reads from two contrasting genotypes, and I have a few questions about GATK analysis.

    Objective: I want to identify homozygous SNPs and indels between these two genotypes by mapping the raw reads against the reference genome.

    Which pre-filtering steps do I need to take care of before starting the GATK pipeline?

    I have already removed adapters and low-quality bases from the reads. Do I also need to remove repetitive reads? If so, please suggest how to do it. What other read pre-filtering parameters should I look at?

    In the GATK pipeline, why do we create a sequence dictionary, and where is it used? What is the role of assigning a read group? How do I assign a read group: does it need specific fields, or can I put in any random name?

    Create sequence dictionary

    java -jar ~/bin/picard-tools-1.8.5/CreateSequenceDictionary.jar REFERENCE=reference.fasta OUTPUT=reference.dict

    Align reads and assign read group

    bwa mem -R '@RG\tID:FLOWCELL1.LANE1\tPL:ILLUMINA\tLB:test\tSM:PA01' reference.fasta R1.fastq.gz R2.fastq.gz > aln.sam

    April 20
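The read-group string in the post above packs several fields into one argument (ID, PL for platform, LB for library, SM for sample), so it may help to see it assembled from variables. This is only a minimal sketch: `build_rg` is a hypothetical helper, not part of bwa, Picard, or GATK, and the field values are the ones from the post. It prints the alignment command instead of running it, so it works without bwa installed.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption for illustration, not a bwa or GATK tool):
# build an @RG header string from its fields. The separators are kept as the
# literal two characters backslash + t, which bwa mem -R converts to tabs.
build_rg() {
    local id="$1" sample="$2" library="$3" platform="${4:-ILLUMINA}"
    printf '@RG\\tID:%s\\tPL:%s\\tLB:%s\\tSM:%s' "$id" "$platform" "$library" "$sample"
}

# Values taken from the post: flowcell/lane as ID, PA01 as sample, "test" as library.
rg="$(build_rg FLOWCELL1.LANE1 PA01 test)"

# Print the command rather than execute it. Note the plain single quotes
# around the read-group string; curly quotes would not be understood by the shell.
printf '%s\n' "bwa mem -R '$rg' reference.fasta R1.fastq.gz R2.fastq.gz > aln.sam"
```

Keeping the fields in variables makes it easier to give each sample its own SM value and each flowcell lane its own ID when the same command is run for both genotypes.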