Can GATK Mutect2 be run inside ProcessPoolExecutor or ThreadPoolExecutor? I get an error

Exception: b"Using GATK jar /gatk-4.0.0.0/gatk-package-4.0.0.0-local.jar\nRunning:\n java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble

I use the following script to process each sample; sample_id is a list of many files.
with ProcessPoolExecutor(self.workers) as future:
    log = future.map(create_each_sample_vcf, sample_id)

The function create_each_sample_vcf starts from FASTQ and finally produces a VCF.
It contains the following steps: trimmomatic, samtools sort and index, and the Mutect2 VCF call. The earlier steps run fine, but when it reaches the Mutect2 VCF call, it raises
"
Exception: b"Using GATK jar /gatk-4.0.0.0/gatk-package-4.0.0.0-local.jar\nRunning:\n java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble
"

I also tried ThreadPoolExecutor with the following script, but then either the data got mixed into other samples' directories, or the run did not finish the GATK Mutect2 VCF call and instead moved on to the next function, createsomaticpon.
with ThreadPoolExecutor(self.workers) as future:
    log = future.map(create_each_sample_vcf, sample_id)
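For reference, here is a minimal sketch of the same fan-out written with submit() instead of map(), so that each failure can be traced back to the sample that caused it. It is an illustration under assumptions, not the script above: run_samples, worker, and executor_cls are hypothetical names. Two standard-library facts it relies on: with map(), an exception from a worker is only raised when the corresponding result is retrieved from the returned iterator, and with ProcessPoolExecutor the worker has to be picklable (for example a module-level function).

```python
from concurrent.futures import ProcessPoolExecutor, as_completed


def run_samples(sample_ids, worker, max_workers=2, executor_cls=ProcessPoolExecutor):
    """Run worker(sample_id) for every sample; return (results, errors) dicts.

    All names here are hypothetical. `worker` stands in for a per-sample
    function such as create_each_sample_vcf.
    """
    results, errors = {}, {}
    with executor_cls(max_workers=max_workers) as executor:
        # One Future per sample, keyed back to the sample it belongs to.
        futures = {executor.submit(worker, sid): sid for sid in sample_ids}
        for fut in as_completed(futures):
            sid = futures[fut]
            try:
                results[sid] = fut.result()
            except Exception as exc:
                # Record the failure and keep processing the other samples.
                errors[sid] = exc
    return results, errors
```

Called as run_samples(sample_id, create_each_sample_vcf, max_workers=self.workers), this would return one dict of per-sample results and one dict of per-sample exceptions, which may make it easier to see which sample fails and with what message.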

My GATK Mutect2 command looks like this:
"
cmd_mutect_vcf = "{} Mutect2 --intervals {} --native-pair-hmm-threads 16 --input {}.sorted.bam --output {}.somatic.raw.vcf --reference {} --tumor-sample {}".format("gatk", self.bed, sample_id, sample_id, self.hg19, sample_id)
p = subprocess.Popen(cmd_mutect_vcf, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
stdout, stderr = p.communicate()
if p.returncode:
    raise Exception(stderr)
"
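A sketch of how the command above could be built as an argument list and run without shell=True, with stderr decoded so the full message is readable instead of a bytes repr. The assumptions are loud ones: build_mutect2_cmd, run_cmd, out_dir, and hmm_threads are hypothetical names, and the per-sample out_dir is only one possible way to keep each sample's files apart. It uses only the Mutect2 options already shown in the command above. Note that with N workers each requesting 16 pair-HMM threads, the total requested is N x 16; whether that is related to the error is not established here.

```python
import os
import subprocess


def build_mutect2_cmd(sample_id, bed, reference, out_dir, hmm_threads=4):
    """Build the Mutect2 call as an argument list (hypothetical helper).

    Input and output paths are placed under out_dir, an assumed
    per-sample directory, so samples do not share file locations.
    """
    return [
        "gatk", "Mutect2",
        "--intervals", bed,
        "--native-pair-hmm-threads", str(hmm_threads),
        "--input", os.path.join(out_dir, sample_id + ".sorted.bam"),
        "--output", os.path.join(out_dir, sample_id + ".somatic.raw.vcf"),
        "--reference", reference,
        "--tumor-sample", sample_id,
    ]


def run_cmd(cmd):
    """Run cmd; on a non-zero exit, raise with the decoded stderr text."""
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    if proc.returncode != 0:
        raise RuntimeError(
            "exit code {}:\n{}".format(
                proc.returncode, proc.stderr.decode(errors="replace")
            )
        )
    return proc.stdout.decode(errors="replace")
```

Passing a list avoids shell quoting problems with paths, and raising the decoded stderr shows the whole GATK log, including the lines after the "Running:" banner where the actual error is usually reported.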

Are GATK tools not meant to be run from multiple processes or threads? This is urgent. :( :(
