NGS and Perl Problem Set

The FASTQ file for these problems is located at: /pfbhome/data/secret_seqs.fastq

Problem 1

  1. Write a script to count the number of sequences in a FASTQ file.
  2. Output the mean and standard deviation of the sequence lengths.
  3. Calculate the mean and standard deviation of base quality scores.
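
One possible starting point for the three parts of Problem 1 is sketched below. It is not a reference solution. It assumes four-line FASTQ records (no wrapped sequence lines), Phred+33 quality encoding, and the population (divide by N) form of the standard deviation; none of these are stated in the problem, so check them against the real file. The names fastq_stats, mean, and sd are made up for illustration, and the sketch reads a small in-memory sample rather than the course file so that it runs anywhere.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Summarise a FASTQ stream: number of records, plus the mean and standard
# deviation of read lengths and of per-base quality scores.
# Assumptions: 4-line records and Phred+33 quality encoding.
sub fastq_stats {
    my ($fh) = @_;
    my ($n_seqs,  $len_sum, $len_sumsq) = (0, 0, 0);
    my ($n_bases, $q_sum,   $q_sumsq)   = (0, 0, 0);

    while (defined(my $header = <$fh>)) {
        my $seq  = <$fh>;
        my $plus = <$fh>;
        my $qual = <$fh>;
        last unless defined $qual;    # stop on a truncated final record
        chomp($seq, $qual);

        my $len = length $seq;
        $n_seqs++;
        $len_sum   += $len;
        $len_sumsq += $len * $len;

        # Phred+33: quality score = ASCII code of the character minus 33.
        for my $q (map { ord($_) - 33 } split //, $qual) {
            $n_bases++;
            $q_sum   += $q;
            $q_sumsq += $q * $q;
        }
    }

    return {
        count     => $n_seqs,
        len_mean  => mean($len_sum, $n_seqs),
        len_sd    => sd($len_sum, $len_sumsq, $n_seqs),
        qual_mean => mean($q_sum, $n_bases),
        qual_sd   => sd($q_sum, $q_sumsq, $n_bases),
    };
}

sub mean {
    my ($sum, $n) = @_;
    return $n ? $sum / $n : 0;
}

# Population standard deviation from running sums: sqrt(E[x^2] - E[x]^2).
sub sd {
    my ($sum, $sumsq, $n) = @_;
    return 0 unless $n;
    my $var = $sumsq / $n - ($sum / $n) ** 2;
    return $var > 0 ? sqrt($var) : 0;
}

# Tiny made-up sample: 'I' encodes Q40 and '5' encodes Q20 under Phred+33.
my $sample = <<'END';
@read1
ACGT
+
IIII
@read2
ACGTAC
+
II5III
END

# Reading from a scalar reference keeps the sketch self-contained.
open my $fh, '<', \$sample or die "Cannot open in-memory sample: $!";
my $stats = fastq_stats($fh);
close $fh;

printf "Sequences: %d\n", $stats->{count};
printf "Length:    mean %.2f, SD %.2f\n", $stats->{len_mean},  $stats->{len_sd};
printf "Quality:   mean %.2f, SD %.2f\n", $stats->{qual_mean}, $stats->{qual_sd};
```

For the sample above this reports 2 sequences, a length mean of 5.00 with SD 1.00, and a quality mean of 38.00 with SD 6.00. To run it on the real data, open the FASTQ path instead of \$sample. Keeping running sums avoids holding every quality score in memory, which matters for large files.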

Problem 2

  1. Write a script to trim each sequence in a FASTQ file: starting at the first base with a quality score below Q=20, remove that base and everything after it, through the end of the sequence. (Don't forget to trim the quality scores as well.)
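
The trimming rule in Problem 2 could be sketched as follows. This is one possible approach, not the expected answer. It assumes four-line records and Phred+33 encoding, neither of which is stated in the problem. The name trim_read and the sample reads are invented for illustration, and the data comes from an in-memory string rather than the course file.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;    # for the //= (defined-or assignment) operator

# Trim a read at the first base whose quality is below $min_q. That base and
# everything after it are removed from both the sequence and quality strings.
# Assumption: Phred+33 encoding unless another offset is passed in.
sub trim_read {
    my ($seq, $qual, $min_q, $offset) = @_;
    $min_q  //= 20;
    $offset //= 33;

    my $keep = length $qual;    # keep everything if no base falls below $min_q
    for my $i (0 .. length($qual) - 1) {
        if (ord(substr($qual, $i, 1)) - $offset < $min_q) {
            $keep = $i;
            last;
        }
    }
    return (substr($seq, 0, $keep), substr($qual, 0, $keep));
}

# Made-up sample: 'I' is Q40, '5' is Q20 (kept, not below 20), '+' is Q10.
my $sample = <<'END';
@read1
ACGT
+
IIII
@read2
ACGTAC
+
II5I+I
END

open my $in, '<', \$sample or die "Cannot open in-memory sample: $!";
my $trimmed = '';
while (defined(my $header = <$in>)) {
    my $seq  = <$in>;
    my $plus = <$in>;
    my $qual = <$in>;
    last unless defined $qual;    # stop on a truncated final record
    chomp($seq, $qual);

    my ($tseq, $tqual) = trim_read($seq, $qual, 20);
    # $header and $plus still carry their trailing newlines.
    $trimmed .= $header . "$tseq\n" . $plus . "$tqual\n";
}
close $in;

print $trimmed;
```

In this sample read1 is left unchanged, while read2 is cut to ACGT with qualities II5I, because its fifth base is the first one below Q20. A read whose first base is already below Q20 comes out empty; whether to keep or drop such records is a choice the problem leaves open.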