Mastering For Loops in Bash

How to process hundreds of files without typing hundreds of commands

If you’ve ever found yourself typing the same command over and over with slight variations, you’ve felt the pain that loops were designed to solve. Loops are one of those game-changing tools that transform tedious, error-prone manual work into elegant automation. Today, we’re diving into for loops—one of the most practical skills you’ll learn for command-line bioinformatics.

Why Loops Matter

Imagine you have 100 FASTQ files that need quality control. Would you type the same command 100 times? Of course not! That’s where loops shine. Like wildcards and tab completion, loops reduce typing and—more importantly—reduce typing mistakes. One well-crafted loop can replace hours of manual work.

The best part? Once you understand the pattern, you’ll start seeing opportunities to use loops everywhere in your workflow.

Creating Our Practice Files

Before we start looping, let’s set up some example files to work with. We’ll make a new directory called loops and create the files inside it.

$ mkdir loops
$ cd loops

We’ll use the common pattern of paired-end sequencing files:

$ touch SRR534005_01_R1.fastq SRR534005_01_R2.fastq
$ touch SRR534005_02_R1.fastq SRR534005_02_R2.fastq
$ touch SRR534005_03_R1.fastq SRR534005_03_R2.fastq

Now we have six files representing three samples with forward (R1) and reverse (R2) reads. Perfect for exploring loops!

Your First Loop: Echo and Learn

Let’s start with the simplest possible loop—one that just prints filenames:

Type it one line at a time, pressing Enter after each line. The shell shows a > prompt until you finish the loop with done.

$ for i in SRR534005*
> do
>    echo $i
> done
SRR534005_01_R1.fastq
SRR534005_01_R2.fastq
SRR534005_02_R1.fastq
SRR534005_02_R2.fastq
SRR534005_03_R1.fastq
SRR534005_03_R2.fastq

Beautiful! Let’s break down what just happened:

The anatomy of a for loop:

  1. for i in SRR534005* - The shell sees the keyword for and knows it needs to repeat commands. The pattern SRR534005* matches all files starting with “SRR534005”. Each matched file gets assigned to the variable i, one at a time.

  2. do - This tells the shell: “Here come the commands I want you to repeat.”

  3. echo $i - This is the command that runs each time. The $ in front of i means “give me the value of the variable i.” First time through, $i equals SRR534005_01_R1.fastq. Last time through, it’s SRR534005_03_R2.fastq.

  4. done - This signals the end of the loop. The shell knows it’s finished.

Later in this post, I’ll show how to write the same loop on a single line.
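One detail about step 1 is worth knowing early: with bash’s default settings, a pattern that matches nothing is left as literal text, so the loop runs once with the pattern itself as the value. Here is a minimal sketch of a guard against that, run in a temporary directory so nothing real is touched. The helper name count_matches is made up for this illustration.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical helper: count the files that match the pattern.
count_matches() {
    n=0
    for file in SRR534005*
    do
        # If nothing matched, $file holds the literal text "SRR534005*",
        # so skip anything that is not an existing file.
        [ -e "$file" ] || continue
        n=$((n + 1))
    done
    echo "$n"
}

before=$(count_matches)     # the directory is still empty here
touch SRR534005_01_R1.fastq SRR534005_01_R2.fastq
after=$(count_matches)      # now two files match

echo "before: $before, after: $after"
```

This should print "before: 0, after: 2". Without the [ -e "$file" ] check, the first call would have counted the unmatched pattern as if it were a file.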

Making Your Code Readable

Using i as a variable name is fine, but let’s be honest—when you come back to your code next week, will you remember what i represents? This is where descriptive variable names become your friend:

$ for file in SRR534005*
> do
>    echo $file
> done
SRR534005_01_R1.fastq
SRR534005_01_R2.fastq
SRR534005_02_R1.fastq
SRR534005_02_R2.fastq
SRR534005_03_R1.fastq
SRR534005_03_R2.fastq

Same output, but now it’s immediately clear we’re working with files. Your future self will thank you for this clarity.

A Quick Note on Variable Syntax

Sometimes you’ll need to use ${variable} instead of just $variable. The curly braces become necessary when you’re inserting the variable into a string with other characters:

$ echo $file          # Works fine
$ echo copy-$file     # Works fine
$ echo ${file}copy    # Curly braces needed here

Think of it like the difference between typing echo hello versus echo 'hello!!'—sometimes the extra syntax is optional, sometimes it’s mandatory. When in doubt, use the braces. They never hurt.
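To see why the braces matter, here is a small sketch that stores each result in a variable and prints it. The variable names without_braces, with_braces, and prefixed are only for this demonstration, and it assumes no variable called filecopy exists in your shell.

```shell
file=SRR534005_01_R1.fastq

# Without braces the shell reads "filecopy" as the variable name.
# No such variable exists here, so the result is an empty string.
without_braces="$filecopy"

# With braces the name ends at the closing brace and "copy" is appended.
with_braces="${file}copy"

# A hyphen cannot be part of a variable name, so no braces are needed here.
prefixed="copy-$file"

echo "without braces: [$without_braces]"
echo "with braces:    [$with_braces]"
echo "prefixed:       [$prefixed]"
```

The first line of output shows empty brackets, which is exactly the silent failure the braces protect you from.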

Test Before You Execute: The Echo Trick

Here’s a pro tip that will save you from countless mistakes: always test your loop with echo before running the actual command.

Let’s say we want to make backup copies of our files with “copy-” as a prefix. Before we actually copy anything, let’s see what the new names will look like:

$ for file in *.fastq
> do
>    echo ${file} copy-${file}
> done
SRR534005_01_R1.fastq copy-SRR534005_01_R1.fastq
SRR534005_01_R2.fastq copy-SRR534005_01_R2.fastq
SRR534005_02_R1.fastq copy-SRR534005_02_R1.fastq
SRR534005_02_R2.fastq copy-SRR534005_02_R2.fastq
SRR534005_03_R1.fastq copy-SRR534005_03_R1.fastq
SRR534005_03_R2.fastq copy-SRR534005_03_R2.fastq

Perfect! We can see both the original name and what the copy will be named. This preview helps us catch mistakes before they happen. Deleted the wrong 500 files? That’s a bad day. Echoed the wrong filenames first? No harm done.

Actually Making Changes: From Echo to Action

Once we’ve verified our loop logic with echo, we can switch to the actual command. Let’s make those backup copies:

$ for file in SRR534005*; do cp $file copy-$file; done

Notice something different? This time I wrote the loop on a single line. When you use semicolons to separate the parts, you can write loops more compactly. Both formats do exactly the same thing—use whichever feels more readable to you.

Pro tip: Use the up arrow to recall your previous loop, then just change echo to cp. No need to retype everything!
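The filenames in this tutorial contain no spaces, so an unquoted $file works fine. As a sketch of a more defensive habit, here is the same copy loop with the variable wrapped in double quotes, which also copes with a name that contains a space. The file "my sample.fastq" is made up for this demonstration, and everything runs in a temporary directory.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch SRR534005_01_R1.fastq "my sample.fastq"

# The pattern is expanded once, before the loop starts,
# so the new copy-* files are not picked up and copied again.
for file in *.fastq
do
    # Double quotes keep a name with spaces together as one argument.
    cp "$file" "copy-$file"
done

ls
```

Without the quotes, cp would receive "my" and "sample.fastq" as two separate arguments and fail on that file.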

Advanced Renaming: Parameter Expansion

Now let’s tackle something more sophisticated. Those SRR534005 identifiers are cryptic database accession numbers. What if we want to replace them with something more meaningful, like the actual sample name?

Let’s test the rename pattern first:

$ for file in copy-SRR534005*; do echo $file ${file//SRR534005/chicken}; done
copy-SRR534005_01_R1.fastq copy-chicken_01_R1.fastq
copy-SRR534005_01_R2.fastq copy-chicken_01_R2.fastq
copy-SRR534005_02_R1.fastq copy-chicken_02_R1.fastq
copy-SRR534005_02_R2.fastq copy-chicken_02_R2.fastq
copy-SRR534005_03_R1.fastq copy-chicken_03_R1.fastq
copy-SRR534005_03_R2.fastq copy-chicken_03_R2.fastq

What’s happening here? The syntax ${file//SRR534005/chicken} is called parameter expansion. It means: “Take the value of file, find all occurrences of SRR534005, and replace them with chicken.”

The double slash // is important—it replaces all occurrences. If you used a single slash /, it would only replace the first occurrence.
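Our practice files contain the ID only once, so both forms would give the same result there. To make the difference visible, here is a sketch with a made-up name that contains the ID twice. Note that this substitution syntax is a bash feature, so it may not work in other shells.

```shell
# Made-up name that contains the ID twice.
name=SRR534005_SRR534005_R1.fastq

first_only=${name/SRR534005/chicken}    # single slash: first match only
all=${name//SRR534005/chicken}          # double slash: every match

echo "$first_only"   # chicken_SRR534005_R1.fastq
echo "$all"          # chicken_chicken_R1.fastq
```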

Making the Rename Real

Once we’ve verified the names look correct, we replace echo with mv to actually rename the files:

$ for file in copy-SRR534005*; do mv $file ${file//SRR534005/chicken}; done

Let’s check what we have now:

$ ls
copy-chicken_01_R1.fastq
copy-chicken_01_R2.fastq
copy-chicken_02_R1.fastq
copy-chicken_02_R2.fastq
copy-chicken_03_R1.fastq
copy-chicken_03_R2.fastq
SRR534005_01_R1.fastq
SRR534005_01_R2.fastq
SRR534005_02_R1.fastq
SRR534005_02_R2.fastq
SRR534005_03_R1.fastq
SRR534005_03_R2.fastq

Success! We now have our original files (with the SRR identifiers) and our renamed copies (with the meaningful “chicken” name). All done with one loop.

Real-World Bioinformatics Applications

Now that you’ve mastered the basics, let’s explore how loops transform actual bioinformatics workflows.

Quality Control on Multiple Samples

Instead of running FastQC on each file individually, loop through them all:

First, we need to create a directory for the output files.

$ mkdir qc_results/
$ for file in *.fastq; do fastqc $file -o qc_results/; done

If you see an error like the following, you need to install FastQC first:

Command 'fastqc' not found, but can be installed with:
apt install fastqc

Once FastQC is installed and the loop runs, you should see output like the following in the terminal:

null
Started analysis of SRR534005_01_R1.fastq
Analysis complete for SRR534005_01_R1.fastq
null
Started analysis of SRR534005_01_R2.fastq
Analysis complete for SRR534005_01_R2.fastq
null
Started analysis of SRR534005_02_R1.fastq
Analysis complete for SRR534005_02_R1.fastq
null
Started analysis of SRR534005_02_R2.fastq
Analysis complete for SRR534005_02_R2.fastq
null
Started analysis of SRR534005_03_R1.fastq
Analysis complete for SRR534005_03_R1.fastq
null
Started analysis of SRR534005_03_R2.fastq
Analysis complete for SRR534005_03_R2.fastq
null
Started analysis of copy-chicken_01_R1.fastq
Analysis complete for copy-chicken_01_R1.fastq
null
Started analysis of copy-chicken_01_R2.fastq
Analysis complete for copy-chicken_01_R2.fastq
null
Started analysis of copy-chicken_02_R1.fastq
Analysis complete for copy-chicken_02_R1.fastq
null
Started analysis of copy-chicken_02_R2.fastq
Analysis complete for copy-chicken_02_R2.fastq
null
Started analysis of copy-chicken_03_R1.fastq
Analysis complete for copy-chicken_03_R1.fastq
null
Started analysis of copy-chicken_03_R2.fastq
Analysis complete for copy-chicken_03_R2.fastq

Listing the output folder shows the reports generated for each FASTQ file:

$ ls qc_results/
SRR534005_01_R1_fastqc.html  SRR534005_02_R1_fastqc.html  SRR534005_03_R1_fastqc.html  copy-chicken_01_R1_fastqc.html  copy-chicken_02_R1_fastqc.html  copy-chicken_03_R1_fastqc.html
SRR534005_01_R1_fastqc.zip   SRR534005_02_R1_fastqc.zip   SRR534005_03_R1_fastqc.zip   copy-chicken_01_R1_fastqc.zip   copy-chicken_02_R1_fastqc.zip   copy-chicken_03_R1_fastqc.zip
SRR534005_01_R2_fastqc.html  SRR534005_02_R2_fastqc.html  SRR534005_03_R2_fastqc.html  copy-chicken_01_R2_fastqc.html  copy-chicken_02_R2_fastqc.html  copy-chicken_03_R2_fastqc.html
SRR534005_01_R2_fastqc.zip   SRR534005_02_R2_fastqc.zip   SRR534005_03_R2_fastqc.zip   copy-chicken_01_R2_fastqc.zip   copy-chicken_02_R2_fastqc.zip   copy-chicken_03_R2_fastqc.zip

This runs quality control on every FASTQ file and saves the results to a dedicated folder. One command, dozens of files processed.
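If you plan to rerun this step, a slightly more careful version can help. The sketch below checks that fastqc is available before looping and uses mkdir -p, which does not complain when the directory already exists. The variable fastqc_status is only there to make the outcome visible, and the loop assumes it is run from the directory that holds your FASTQ files.

```shell
# -p creates the directory if needed and is silent if it already exists.
mkdir -p qc_results

# command -v succeeds only if the shell can find the program.
if command -v fastqc >/dev/null 2>&1
then
    fastqc_status="found"
    for file in *.fastq
    do
        [ -e "$file" ] || continue   # skip when no FASTQ files are present
        fastqc "$file" -o qc_results/
    done
else
    fastqc_status="missing"
    echo "fastqc is not on the PATH; install it before running the loop"
fi

echo "fastqc: $fastqc_status"
```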

Trimming Paired-End Reads (Simple Example)

When working with paired-end data, you often need to process R1 and R2 files together. Here’s a simple loop that just prints the filenames for each pair—replace the echo command with any tool you want to use later:

$ for sample in SRR534005_01 SRR534005_02 SRR534005_03
> do
>    echo "Processing ${sample}_R1.fastq and ${sample}_R2.fastq"
> done

Or, if you want to “combine” each pair into a single file (for demonstration):

$ for sample in SRR534005_01 SRR534005_02 SRR534005_03
> do
>    cat ${sample}_R1.fastq ${sample}_R2.fastq > ${sample}_combined.fastq
> done

These loops show how to handle the forward and reverse reads of each sample together, even if you don’t have Trimmomatic installed. Each one walks through all three sample pairs and names its output after the sample.

If you have real files available, try the commands on them and see how they work.
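Typing every sample name by hand stops scaling quickly. As a sketch of one alternative, the loop below finds the R1 files with a pattern and derives the sample name and the matching R2 filename from each one. It assumes your files follow the _R1.fastq / _R2.fastq naming used in this post, and it runs in a temporary directory with freshly created dummy files. The counter pairs is only there to summarize the result.

```shell
# Work in a throwaway directory with two dummy sample pairs.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch SRR534005_01_R1.fastq SRR534005_01_R2.fastq
touch SRR534005_02_R1.fastq SRR534005_02_R2.fastq

pairs=0
for r1 in *_R1.fastq
do
    sample=${r1%_R1.fastq}     # strip the suffix to get the sample name
    r2=${sample}_R2.fastq      # build the partner filename
    if [ -e "$r2" ]
    then
        echo "Pair: $r1 + $r2"
        pairs=$((pairs + 1))
    else
        echo "No R2 file found for $sample"
    fi
done

echo "Found $pairs complete pairs"
```

The ${r1%_R1.fastq} form is another kind of parameter expansion: it removes the given text from the end of the value.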

Counting Sequences in Multiple Files

Want to know how many sequences are in each FASTQ file?

$ for file in *.fastq
> do
>    count=$(grep -c "^@" $file)
>    echo "$file contains $count sequences"
> done

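This loop counts the lines that begin with @, which is how every sequence header starts, and prints a summary for each file. Treat the number as a quick sanity check rather than an exact count: a quality line can also begin with @, so the result can come out slightly too high.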
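To see the difference between counting @ lines and counting records, here is a sketch on a made-up two-record file whose second quality line begins with @. The line-count approach assumes every record occupies exactly four lines, which holds for typical FASTQ files but not for ones with wrapped sequence lines.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two made-up records; the quality line of the second starts with "@".
printf '%s\n' \
    '@read1' 'ACGT' '+' 'IIII' \
    '@read2' 'TTGA' '+' '@III' > demo.fastq

by_grep=$(grep -c "^@" demo.fastq)   # counts every line that starts with @
lines=$(wc -l < demo.fastq)          # total number of lines in the file
by_lines=$((lines / 4))              # four lines per record

echo "grep count: $by_grep"
echo "line count / 4: $by_lines"
```

Here grep reports 3 while the file holds only 2 records, so the line-count version is the safer choice when the exact number matters.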

Since these are empty dummy files, each one reports 0 sequences. Let’s go up one directory and check whether there are any compressed FASTQ files (fastq.gz) to try instead.

$ cd ..
$ ls *fastq.gz
sample_reads.fastq.gz  tiny_n_L001_R1_xxx.fastq.gz  tiny_n_L001_R2_xxx.fastq.gz

Since these files are compressed, we need to modify the command slightly: use zgrep instead of grep, and change the pattern so it matches fastq.gz files.

$ for file in *.fastq.gz
> do
>    count=$(zgrep -c "^@" $file)
>    echo "$file contains $count sequences"
> done

This loop counts the number of sequences in each compressed FASTQ file (*.fastq.gz) in the current directory. You should see output similar to this in your terminal:

sample_reads.fastq.gz contains 12 sequences
tiny_n_L001_R1_xxx.fastq.gz contains 850 sequences
tiny_n_L001_R2_xxx.fastq.gz contains 587 sequences

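That was a lot at once, so let’s take the command in small steps. zgrep -c "^@" $file searches the compressed file for lines that begin with @ and, because of -c, prints how many it found instead of the lines themselves. Wrapping that command in $( ) runs it and captures its output, and count= stores the captured number in a variable. Finally, echo prints the filename and the count in one sentence.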
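The pieces of the counting loop can also be tried one at a time. This sketch builds a tiny made-up FASTQ file, compresses it, and then runs each step separately. It assumes gzip and zgrep are installed, as they are on most Linux systems. The variable message is only there so the final sentence can be inspected.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A made-up two-record file, compressed so that zgrep has something to read.
printf '%s\n' '@read1' 'ACGT' '+' 'IIII' '@read2' 'TTGA' '+' 'IIII' > demo.fastq
gzip demo.fastq                       # produces demo.fastq.gz

# Step 1: run the counting command on its own; it prints a number.
zgrep -c "^@" demo.fastq.gz

# Step 2: wrap it in $( ) to capture that number in a variable instead.
count=$(zgrep -c "^@" demo.fastq.gz)

# Step 3: use the variable inside a message.
file=demo.fastq.gz
message="$file contains $count sequences"
echo "$message"
```

Step 1 prints 2, and step 3 prints "demo.fastq.gz contains 2 sequences". The loop in the post simply repeats steps 2 and 3 for every file that matches *.fastq.gz.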

Best Practices and Habits

As you incorporate loops into your workflow, keep these principles in mind:

1. Always echo first, execute second: This single habit will save you from disasters. Test your logic before making irreversible changes.

2. Use descriptive variable names: for file in *.fastq is infinitely more readable than for x in *.fastq. Be kind to future you.

3. Keep your loops simple: If your loop is getting too complex, consider breaking it into multiple steps or writing a script instead.

4. Document your patterns: When you figure out a useful loop pattern, save it in a text file or notebook. You’ll reuse these patterns constantly.

5. Check your work: After running a loop, use ls or other commands to verify the results are what you expected.
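The first habit can even be built into the loop itself. The sketch below is one possible way to do it, not a standard convention: a variable called run (a name chosen for this example) holds echo while you preview and is emptied when you are ready to execute. It runs in a temporary directory with two dummy files.

```shell
# Work in a throwaway directory with two dummy files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch SRR534005_01_R1.fastq SRR534005_01_R2.fastq

# Hypothetical helper: copy every matching file, or only print the commands.
copy_all() {
    for file in SRR534005*
    do
        # If run is "echo", the cp command is printed instead of executed.
        # If run is empty, it disappears and cp runs for real.
        $run cp "$file" "copy-$file"
    done
}

run=echo        # preview: nothing is copied
copy_all
if [ -e copy-SRR534005_01_R1.fastq ]; then after_preview=copied; else after_preview=untouched; fi

run=            # execute: the copies are made
copy_all
if [ -e copy-SRR534005_01_R1.fastq ]; then after_run=copied; else after_run=untouched; fi

echo "after preview: $after_preview, after run: $after_run"
```

The last line should read "after preview: untouched, after run: copied", which confirms that the preview pass changed nothing.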

Challenge: Practice Makes Perfect

Try these exercises to solidify your loop skills:

1. Prefix all your FASTQ files with today’s date: Create a loop that adds a date prefix (like 2025-10-29_) to all FASTQ files.

2. Count the number of lines in each file: Write a loop that prints each filename and its line count using wc -l.

3. Create a quality control workflow: Combine multiple steps (run FastQC, count sequences, and check file sizes) in one loop.

The Power of Automation

After working through these examples, you’ve unlocked a superpower: the ability to scale your work effortlessly. Whether you’re processing 6 files or 600, the loop stays the same. That’s the beauty of automation.

The real skill isn’t just knowing the syntax—it’s recognizing when a task is repetitive and having the instinct to reach for a loop. As you gain experience, you’ll find yourself naturally thinking in terms of patterns and automation.

Start small. Practice with echo. Build confidence. Soon you’ll be writing loops that save you hours of work without even thinking about it.


Loops are the bridge between manual command-line work and true automation. Master them, and you’ll never go back to doing things one file at a time.
