womtool fails on python code in task's command with certain formatted strings

myourshaw (University of Colorado, Member)

Yet another problem with a Python <<CODE heredoc in a task's command block: womtool sometimes fails to parse Python formatted strings whose arguments are dict lookups.

Unrecognized token on line 399, column 139:

            'OUTPUT': f"{sample_metadata['RUN_ID']}.{lane}.{sample_metadata['BARCODE_ID']}.{sample_metadata['LIBRARY_NAME']}.unaligned.bam",
                                                                                                                                          ^

or

Unrecognized token on line 395, column 39:

        o = "{}.{}.{}.{}.unaligned.bam".format(sample_metadata['RUN_ID'], lane, sample_metadata['BARCODE_ID'], sample_metadata['LIBRARY_NAME'])
                                      ^

However, this is OK:

        o = sample_metadata['RUN_ID'] + '.' + str(lane) + '.' + sample_metadata['BARCODE_ID'] + '.' + sample_metadata['LIBRARY_NAME'] + '.unaligned.bam'

Also, this f-string later in the same command does not cause an error:

'OUTPUT': f"{metadata['RUN_ID']}.{lane}.N.UNKNOWN.unaligned.bam",
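Until the cause is pinned down, one possible stopgap is to build the file name without any curly braces, in the same spirit as the concatenation that womtool accepts. The sketch below is only an illustration: the helper name ubam_name and the sample values are hypothetical, and it assumes (without confirmation) that the braces in the string are what trips the parser. It shows that a '.'.join and a %-style format give the same result as the f-string.

```python
def ubam_name(sample_metadata, lane):
    """Build the unaligned BAM name without using any curly braces."""
    parts = [
        sample_metadata['RUN_ID'],
        lane,
        sample_metadata['BARCODE_ID'],
        sample_metadata['LIBRARY_NAME'],
        'unaligned.bam',
    ]
    # str() guards against numeric values coming out of the JSON metadata
    return '.'.join(str(p) for p in parts)


# Hypothetical values, for illustration only
sample = dict(RUN_ID='RUN1', BARCODE_ID='BC01', LIBRARY_NAME='LIB_A')
lane = 3

joined = ubam_name(sample, lane)

# %-style formatting is another brace-free spelling
percent = '%s.%s.%s.%s.unaligned.bam' % (
    sample['RUN_ID'], lane, sample['BARCODE_ID'], sample['LIBRARY_NAME'])

# the f-string from the failing line, for comparison
fstring = f"{sample['RUN_ID']}.{lane}.{sample['BARCODE_ID']}.{sample['LIBRARY_NAME']}.unaligned.bam"

print(joined)
print(joined == percent == fstring)
```

Inside the task only the body of ubam_name (or the %-style line) would be needed; the comparison lines are there just to check the spellings agree.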

The full task:

task CreateLibraryParamsFile {
  String python3_cmd
  File run_metadata
  Int lane

  command {
    ${python3_cmd} <<CODE
    import json

    # expose the WDL input to the Python code below
    lane = ${lane}

    barcode_data = []
    ubams = []

    with open('${run_metadata}', 'r') as ifh:
        metadata = json.load(ifh)

    if metadata['NUM_INDICES'] == 1:
        header = ['BARCODE_1', 'SAMPLE_ALIAS', 'LIBRARY_NAME', 'OUTPUT', 'PM', 'PI', 'DS']
    else:
        header = ['BARCODE_1', 'BARCODE_2', 'SAMPLE_ALIAS', 'LIBRARY_NAME', 'OUTPUT', 'PM', 'PI', 'DS']

    lane_metadata = metadata['LANE_METADATA'].get(str(lane))
    for sample_metadata in lane_metadata:
        barcode_1 = sample_metadata['BARCODE_1']
        barcode_2 = sample_metadata.get('BARCODE_2', '')
        barcode_dict = {
            'SAMPLE_ALIAS': sample_metadata['SAMPLE_ALIAS'],
            'LIBRARY_NAME': sample_metadata['LIBRARY_NAME'],
            'OUTPUT': f"{sample_metadata['RUN_ID']}.{lane}.{sample_metadata['BARCODE_ID']}.{sample_metadata['LIBRARY_NAME']}.unaligned.bam",
            'PM': sample_metadata['PM'],
            'PI': sample_metadata['PI'],
            'DS': sample_metadata['DS'],
            'BARCODE_1': barcode_1,
            'BARCODE_2': barcode_2,
        }
        barcode_data.append([str(barcode_dict[_]) for _ in header])

        ubams.append(barcode_dict['OUTPUT'])

    # add a catchall row for unmatched barcodes
    # do not add this to the list of ubams
    unknown_barcode_dict = {
        'SAMPLE_ALIAS': 'UNKNOWN',
        'LIBRARY_NAME': 'UNKNOWN',
        'OUTPUT': f"{metadata['RUN_ID']}.{lane}.N.UNKNOWN.unaligned.bam",
        'PM': metadata['PM'],
        'PI': '',
        'DS': '',
        'BARCODE_1': 'N',
        'BARCODE_2': 'N',
    }
    barcode_data.append([str(unknown_barcode_dict[_]) for _ in header])

    print('\t'.join(header))

    for d in barcode_data:
        print('\t'.join(d))

    # list of ubam files
    with open('ubams', 'w') as ufh:
        for u in ubams:
            ufh.write(u + '\n')
    CODE
  }
  runtime {
    memory: "1G"
    cpu: 1
  }
  output {
    File library_params = stdout()
    Array[String] ubams = read_lines("./ubams")
  }
}
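A different way to narrow this down might be to keep the Python logic in a function that can be run outside the WDL, so that womtool only ever sees a short command. The sketch below is an assumption-laden illustration, not a confirmed fix: the function name build_rows and the sample metadata are hypothetical, and unlike the task above it returns an empty sample list when the lane is missing from LANE_METADATA instead of failing.

```python
def build_rows(metadata, lane):
    """Return (header, rows, ubams) for one lane of the run metadata."""
    if metadata['NUM_INDICES'] == 1:
        header = ['BARCODE_1', 'SAMPLE_ALIAS', 'LIBRARY_NAME', 'OUTPUT', 'PM', 'PI', 'DS']
    else:
        header = ['BARCODE_1', 'BARCODE_2', 'SAMPLE_ALIAS', 'LIBRARY_NAME', 'OUTPUT', 'PM', 'PI', 'DS']

    rows, ubams = [], []
    for s in metadata['LANE_METADATA'].get(str(lane), []):
        record = dict(s)
        record.setdefault('BARCODE_2', '')
        record['OUTPUT'] = f"{s['RUN_ID']}.{lane}.{s['BARCODE_ID']}.{s['LIBRARY_NAME']}.unaligned.bam"
        rows.append([str(record[k]) for k in header])
        ubams.append(record['OUTPUT'])

    # catch-all row for unmatched barcodes; deliberately not added to ubams
    unknown = {
        'SAMPLE_ALIAS': 'UNKNOWN',
        'LIBRARY_NAME': 'UNKNOWN',
        'OUTPUT': f"{metadata['RUN_ID']}.{lane}.N.UNKNOWN.unaligned.bam",
        'PM': metadata['PM'],
        'PI': '',
        'DS': '',
        'BARCODE_1': 'N',
        'BARCODE_2': 'N',
    }
    rows.append([str(unknown[k]) for k in header])
    return header, rows, ubams


# Hypothetical metadata, for illustration only
metadata = {
    'NUM_INDICES': 1,
    'RUN_ID': 'RUN1',
    'PM': 'pm',
    'LANE_METADATA': {
        '3': [{
            'BARCODE_1': 'ACGT', 'SAMPLE_ALIAS': 'S1', 'LIBRARY_NAME': 'LIB_A',
            'RUN_ID': 'RUN1', 'BARCODE_ID': 'BC01',
            'PM': 'pm', 'PI': 'pi', 'DS': 'ds',
        }],
    },
}

header, rows, ubams = build_rows(metadata, 3)
print('\t'.join(header))
for row in rows:
    print('\t'.join(row))
```

Saved to its own file and given a small command-line wrapper, this would let the task's command shrink to a single script call, leaving no Python braces or quotes for womtool to parse.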
