3 Basic challenge

It’s now time to put into practice what you learned today! Try to solve the following challenge. The aim is to write a working workflow that takes the following sample sheet as input

filename,value,number_of_rows,number_of_columns
file1.txt,3,10,10
file2.txt,7,5,45
file3.txt,a_value,45,1
file4.txt,trweter,43,9
file5.txt,109,14,3
file6.txt,aaa,1,12
file7.txt,g,96,76
file8.txt,eew,11,11
file9.txt,1ww,21,34
file10.txt,45,8,2
file11.txt,jh,6,1
file12.txt,96,1,5

For each file listed in the filename column of the sample sheet, the workflow should create an output file with that name and place it in the folder ../nextflow_output/challenge. Each line of the file should contain the entry from the value column repeated number_of_columns times, with the repetitions separated by commas, and that line should be repeated number_of_rows times. In other words, each output file is a CSV file without a header, with number_of_rows rows and number_of_columns columns, in which every cell holds the entry from the value column.

For example, ../nextflow_output/challenge/file1.txt should contain

3,3,3,3,3,3,3,3,3,3
3,3,3,3,3,3,3,3,3,3
3,3,3,3,3,3,3,3,3,3
3,3,3,3,3,3,3,3,3,3
3,3,3,3,3,3,3,3,3,3
3,3,3,3,3,3,3,3,3,3
3,3,3,3,3,3,3,3,3,3
3,3,3,3,3,3,3,3,3,3
3,3,3,3,3,3,3,3,3,3
3,3,3,3,3,3,3,3,3,3

and ../nextflow_output/challenge/file6.txt should contain

aaa,aaa,aaa,aaa,aaa,aaa,aaa,aaa,aaa,aaa,aaa,aaa
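
To make the specification concrete, here is a minimal Python sketch of the text each output file is expected to hold. The function name expected_content is hypothetical and not part of the course material, and ending the last line with a newline is an assumption. It only restates the rules above and is not a Nextflow solution.

```python
def expected_content(value: str, number_of_rows: int, number_of_columns: int) -> str:
    """Return the text that a challenge output file should contain."""
    # One row: the value repeated number_of_columns times, separated by commas.
    row = ",".join([value] * number_of_columns)
    # The same row repeated number_of_rows times, one per line, with no header.
    # Assumption: the last line also ends with a newline.
    return "\n".join([row] * number_of_rows) + "\n"


# file6.txt in the sample sheet: value "aaa", 1 row, 12 columns.
print(expected_content("aaa", 1, 12), end="")
```

Calling it with the values of file1.txt, that is expected_content("3", 10, 10), gives the ten identical lines shown in the first example.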

You can use as many or as few processes as you want to achieve this result. The challenge is possible using only the Nextflow features that we discussed, and a bit of bash or Python/R scripting.

The solution to the challenge (main.nf file) is given below.

nextflow.enable.dsl = 2

process create_matrix {
    conda "r-base r-tidyverse"

    publishDir "../nextflow_output/challenge"

    input:
        tuple(
            val(outname),
            val(fill_value),
            val(nrows),
            val(ncols)
        )
    output:
        path "$outname"

    script:
        """
        #!/usr/bin/env Rscript

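        library("tidyverse")

        # Fill a character matrix of nrows x ncols cells with the requested value
        content <- rep("$fill_value", ${ncols}*${nrows})
        mat <- matrix(content, ${nrows}, ${ncols})

        # Convert to a tibble; the matrix has no column names, so let tibble generate unique ones
        df <- as_tibble(mat, .name_repair = "unique")

        # Write the table without a header row
        write_csv(df, "${outname}", col_names=FALSE)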
        """
}

workflow {
    Channel.fromPath( params.input )
        .splitCsv(header: true)
        .map{
            [ it["filename"], it["value"], it["number_of_rows"], it["number_of_columns"] ]
        }
        .set{ input_ch }

    create_matrix( input_ch )
}

If the sample sheet is saved as ../nextflow_output/samplesheet.csv, the solution can be run with the following command

nextflow run main.nf --input ../nextflow_output/samplesheet.csv
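
Once the workflow has finished, the published files can be compared against the sample sheet. The following is a minimal Python sketch of such a check, not part of the course material. The function name find_mismatches is hypothetical, and the paths in the usage comment are the ones used in the text above.

```python
import csv
from pathlib import Path


def find_mismatches(sample_sheet: str, output_dir: str) -> list:
    """Return the filenames whose content is missing or differs from the specification."""
    mismatches = []
    with open(sample_sheet, newline="") as handle:
        # DictReader uses the first line of the sample sheet as column names.
        for record in csv.DictReader(handle):
            row = ",".join([record["value"]] * int(record["number_of_columns"]))
            expected = [row] * int(record["number_of_rows"])
            produced = Path(output_dir) / record["filename"]
            # splitlines() ignores whether the last line ends with a newline.
            if not produced.is_file() or produced.read_text().splitlines() != expected:
                mismatches.append(record["filename"])
    return mismatches


# Example call, assuming the layout described above:
# find_mismatches("../nextflow_output/samplesheet.csv", "../nextflow_output/challenge")
```

An empty list means that every file named in the sample sheet exists and has the expected content.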