5-2: The reverse-complement sequence web application

In this section we will write a web application that can reverse, complement, or reverse-complement a DNA sequence. Given the sequence of one DNA strand, the reverse-complement operation yields the sequence of the complementary strand, as discussed in section 4-7 of this book, where we also provided simple code to perform it. Building on that code, in section 4-12 we wrote a PHP function that performs the task.
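As a quick reminder of the operation itself, here is a minimal sketch. It assumes an uppercase sequence containing only A, C, G and T, and the function name revcomp_basic() is hypothetical: it is not the function from section 4-12, only an illustration based on PHP's built-in strtr() and strrev().

```php
<?php
// Hypothetical helper, for illustration only (uppercase A, C, G, T input is assumed)
function revcomp_basic($sequence){
    // strtr() swaps each base for its complement in a single pass: A<->T, C<->G
    $complement = strtr($sequence, "ACGT", "TGCA");
    // Reversing the complement gives the complementary strand read in the 5' to 3' direction
    return strrev($complement);
}

echo revcomp_basic("ATGCCG")."\n"; // prints CGGCAT
```

The complement of ATGCCG is TACGGC, and reading it backwards gives CGGCAT.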

Single sequence version

We will now use a slightly improved version of this function, which supports IUPAC characters, to build a web application that computes the reverse, complement, or reverse-complement of sequences submitted by users.

We will also use the sequence breaker function seqbreak() to introduce a break every 80 nucleotides, to format the output sequence.
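The effect of such a line breaker can be sketched with two built-in PHP functions, str_split() and implode(). This is an illustrative alternative, not the seqbreak() function used in this section, and the function name break_sequence() is hypothetical. A line length of 4 is used here only to keep the example short.

```php
<?php
// Hypothetical alternative to seqbreak(), for illustration only
function break_sequence($sequence, $line_length=80, $break="<br>\n"){
    // str_split() cuts the sequence into chunks of $line_length characters
    // (the last chunk may be shorter); implode() joins them with the break element
    return implode($break, str_split($sequence, $line_length));
}

echo break_sequence("ATGCATGCAT", 4, "\n")."\n";
// prints:
// ATGC
// ATGC
// AT
```

Unlike seqbreak(), this variant does not add a break element after the last chunk when the sequence length is an exact multiple of the line length.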

Because these functions are stored in a functions.php file, which the script imports with an “include” statement, the script.php code can be very compact.

The application will support the FASTA format and the IUPAC codes for degenerate sequences, and will offer options to select the transformation (reverse, complement or reverse-complement) to apply to the input sequence.

The reverse-complement application web form

The output of the reverse-complement web application

In this first version of the application we will accept as input a single DNA sequence. At the end of this section we propose a version able to handle several sequences.

The code

As usual, the code for the web application will be distributed across several files. The general structure is the same as that of the application developed in the previous section. In the listing below, reverse-complement, html, css and include are directories; the other entries are files.

reverse-complement
    index.php
    script.php
    html
        header.html
        footer.html
    css
        style.css
    include
        functions.php

header.html


<!DOCTYPE html>
<html lang="en">
    <head>
        <meta charset="UTF-8">
        <title>A script to reverse-complement a DNA sequence</title>
        <link rel="stylesheet" href="css/style.css" type="text/css">
    </head>
    <body>
        <div id="main-contents">
            <h1>Reverse, complement or reverse-complement a DNA sequence</h1>

footer.html


        </div>
        <footer>
            Contact us at webmaster@mywebsite.com
        </footer>
    </body>
</html>

index.php


        <?php echo file_get_contents("html/header.html"); ?>
        <form action="script.php" method="POST" enctype="multipart/form-data" id="revcomp_form">
            <fieldset id="data">
                <legend>Data</legend>
                <p>
                    <label for="fasta_sequence">Your DNA sequence</label><br>
                </p>
                <p>
                    <textarea name="fasta_sequence" id="fasta_sequence"></textarea>
                </p>
            </fieldset>
            <fieldset id="options">
                <legend>Options</legend>
                <p>
                    <span class="field-title" style="margin-bottom:10px;">Select transformation</span>
                </p>
                <p>
                    <input type="radio" name="transformation" value="rev" id="rev"> <label for="rev" class="radio">Reverse</label><br>
                    <input type="radio" name="transformation" value="comp" id="comp"> <label for="comp" class="radio">Complement</label><br>
                    <input type="radio" name="transformation" value="revcomp" id="revcomp" checked> <label for="revcomp" class="radio">Reverse-complement</label>
                </p>
            </fieldset>
            <input type="submit" value="Go!">
        </form>
        <?php echo file_get_contents("html/footer.html"); ?>

functions.php

The reverse-complement function, revcomp(), used in this section is a modified version of the one proposed in section 4-12: it supports all IUPAC characters for nucleotides (A, C, G, T, U, R, Y, S, W, K, M, B, D, H, V, N, ., -). More specifically, the $complement_dict associative array was extended.
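The complement of a degenerate IUPAC code can be derived from the set of bases the code stands for: we complement every base in the set and look up the code of the resulting set. The following minimal sketch does this for the DNA codes (U, "." and "-" are left out for brevity). The names $iupac_sets and iupac_complement_code() are hypothetical and are not part of functions.php.

```php
<?php
// Each IUPAC code stands for a set of bases (written here in alphabetical order)
$iupac_sets = array(
    "A" => "A", "C" => "C", "G" => "G", "T" => "T",
    "R" => "AG", "Y" => "CT", "S" => "CG", "W" => "AT",
    "K" => "GT", "M" => "AC",
    "B" => "CGT", "D" => "AGT", "H" => "ACT", "V" => "ACG",
    "N" => "ACGT"
);

// Hypothetical helper: returns the IUPAC code of the complement of $code
function iupac_complement_code($code, $iupac_sets){
    // Complement every base in the set the code stands for
    $bases = str_split(strtr($iupac_sets[$code], "ACGT", "TGCA"), 1);
    sort($bases); // Back to alphabetical order, so that the set can be looked up
    // array_search() returns the key (IUPAC code) whose value matches the set
    return array_search(implode("", $bases), $iupac_sets, true);
}

foreach(array_keys($iupac_sets) as $code){
    echo $code." => ".iupac_complement_code($code, $iupac_sets)."\n";
}
```

With this derivation, S and W turn out to be their own complements, while R pairs with Y, K with M, B with V, and D with H.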


<?php
function process_fasta($fasta_sequence, $mode="all"){
    $fasta_lines = explode("\n", $fasta_sequence);
    $header = "> Generic"; // We will store the header line here during the next foreach cycle
    $sequence = ""; // We will store the sequence here during the next foreach cycle
    foreach($fasta_lines as $line){
        // We strip possible whitespace (or other characters) from the beginning and end of the line
        $line = trim($line);
        if(preg_match("/^>/", $line)){  // If the line starts with a > it's the header line
            $header = $line;
        }
        elseif($line != ""){
            $sequence = $sequence.$line; // We concatenate each new sequence line in the $sequence variable
        }
    }
    
    // At this point we should have the FASTA header in the $header variable 
    // and the whole sequence in the $sequence variable

    // And now the return part, that depends on value of $mode
    if($mode == "all"){
        return array($header, $sequence);
    }
    elseif($mode == "seq"){
        return $sequence;
    }
    elseif($mode == "header"){
        return $header;
    }
    else{
        return "WARNING: process_fasta mode not supported";
    }
}

function revcomp($sequence, $mode="revcomp"){
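    $complement_dict = array(
    "A" => "T",
    "T" => "A",
    "G" => "C",
    "C" => "G",
    "U" => "A",
    "R" => "Y",
    "Y" => "R",
    "S" => "S", // S (C or G) is its own complement
    "W" => "W", // W (A or T) is its own complement
    "K" => "M",
    "M" => "K",
    "B" => "V", // B (C, G or T) pairs with V (A, C or G)
    "V" => "B",
    "D" => "H", // D (A, G or T) pairs with H (A, C or T)
    "H" => "D",
    "N" => "N",
    "." => ".",
    "-" => "-"
    );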

    $nucleotides = str_split($sequence,1);
    // Let's compute the complement sequence first
    $complement_sequence = "";
    foreach($nucleotides as $nucleotide){
        $complement_sequence = $complement_sequence.$complement_dict[$nucleotide];
    }
    // The complement sequence is now stored in the $complement_sequence variable
 
    $revcomp_sequence = strrev($complement_sequence); // This is the reverse complement sequence
    $reverse_sequence = strrev($sequence); // This is the reverse sequence

    // We return different things depending on the $mode (second optional argument of this function)
    // if we call the function with just one argument, the value of $mode will be the default, "revcomp"
    // additional supported values for the $mode argument are "comp" and "rev", see below
    // Note that when a function returns, it also exits, no more code inside the function is executed

    if($mode == "revcomp"){
        return $revcomp_sequence;
    }
    elseif($mode == "comp"){
        return $complement_sequence;
    }
    elseif($mode == "rev"){
        return $reverse_sequence;
    }
    else{  // This part may help us in debugging code in which the function is used
        return "WARNING: revcomp mode not supported";
    }
}

function seqbreak($sequence, $brlen=80, $brel="<br>\n"){ // $brel => breaking element
    $chars = str_split($sequence, 1);
    $i = 1;
    $out = "";
    foreach($chars as $char){
        if(is_int($i/$brlen)){
            $out = $out.$char.$brel;
        }
        else{
            $out = $out.$char;
        }
        $i++;
    }
    return $out;
}
?>

style.css


body{
    width: 800px;
    margin-right:auto;
    margin-left:auto;
}

#main-contents{
    border:4px solid tomato;
    margin-right:auto;
    margin-left:auto;
    margin-bottom: 20px;
    padding:20px;
    padding-top:0;
}

span.sequence{
    font-family:courier;
    font-size:14px;
}

span.header{
    font-size:14px;
}

label{
    cursor:pointer;
    font-weight:bold;
    color:teal;
}

label.radio{
    font-weight:normal;
}

span.field-title{
    font-weight:bold;
    color:teal;    
}

fieldset{
    border:1px solid tomato;
    margin-bottom:20px;    
}

legend{
    font-weight:bold;
    color:tomato;
}

input[type=submit] {
    background-color: tomato;
    border: none;
    color: white;
    padding: 5px 10px;
    text-decoration: none;
    margin: 4px 2px;
    cursor: pointer;
    text-transform: uppercase;
    font-weight:bold;
}

h1{
    color:tomato;
    font-size:1.7em;
}

textarea{
    width:500px;
    height:200px;
    font-family:courier;
    background:whitesmoke;  
}

input[type="text"]{
    background:whitesmoke; 
}

footer{
    text-align:center;
}

script.php


<?php
include("include/functions.php"); // We include the functions.php file
$fasta_sequence = $_POST["fasta_sequence"]; // and grab the sequence submitted by the user through the web form
// We call it fasta_sequence rather than sequence as it may well be a FASTA sequence. The name of the
// field in the web form (index.php) was also changed accordingly

$transformation = $_POST["transformation"]; // This will be revcomp (web form default), rev or comp
 
$sequence = strtoupper(process_fasta($fasta_sequence, "seq")); // We use process_fasta() to extract the "raw" sequence 
// from the fasta_sequence and also make sure it is in uppercase characters
$header = process_fasta($fasta_sequence, "header"); // We use process_fasta() to extract the header

// We use the revcomp() function to get the transformed sequence
// Note how things are easy when we have a function to do the job

$t_sequence = revcomp($sequence, $transformation);

// We create the sequence ready for output by using the seqbreak() function
// that introduces a break tag every 80 nucleotides

$breaked_sequence = seqbreak($t_sequence);
$out_sequence = "<span class=\"sequence\">".$breaked_sequence."</span>";

// We generate a $t_txt variable (transformation text) that is the complete version of the selected transformation
// So we will be able to use "reverse-complement" instead of "revcomp", et-cetera.
if($transformation == "revcomp"){
    $t_txt = "reverse-complement"; 
}
elseif($transformation == "rev"){
    $t_txt = "reverse";    
}
elseif($transformation == "comp"){
    $t_txt = "complement";    
}

// We create a new header for the transformed sequence by appending "reverse-complement", "reverse" or "complement"
// to the original header, as appropriate
$header_new = $header." - $t_txt";
 
// We now have all we need to provide an output to the user
// We embed the output data within the same header and footer HTML code used in the web form
// to ensure a consistent navigation experience and provide the feel that
// everything takes place "in the same website"
 
echo file_get_contents("html/header.html"); // Writing the header HTML to the output page
 
echo "<h2>Transformed sequence</h2>\n";
echo "<p><strong>Selected transformation:</strong> $t_txt</p>";
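// The header comes from the user, so we escape it before writing it to the HTML page
echo "<p><span class=\"header\">".htmlspecialchars($header_new)."</span><br>\n";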
echo "$out_sequence</p>\n";
 
echo file_get_contents("html/footer.html"); // Writing the footer HTML to the output page
?>
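As a possible refinement of the script above, the following sketch checks the selected transformation against the supported values and escapes a user-provided header with htmlspecialchars() before printing it. It is not part of the original script: the function name transformation_label() and the sample values are assumptions made for this example.

```php
<?php
// Hypothetical helper, for illustration only:
// returns the display text for a supported transformation, or null otherwise
function transformation_label($transformation){
    $labels = array(
        "revcomp" => "reverse-complement",
        "rev"     => "reverse",
        "comp"    => "complement"
    );
    return isset($labels[$transformation]) ? $labels[$transformation] : null;
}

// Stand-ins for $_POST["transformation"] and for a header typed by the user
$transformation = "comp";
$header = "> seq1 <b>test</b>";

$t_txt = transformation_label($transformation);
if($t_txt === null){
    // Unknown value: fall back to the web form default
    $transformation = "revcomp";
    $t_txt = "reverse-complement";
}

// htmlspecialchars() turns <, > and & into HTML entities, so that the header
// is displayed as text rather than interpreted as markup by the browser
echo htmlspecialchars($header." - ".$t_txt)."\n";
// prints: &gt; seq1 &lt;b&gt;test&lt;/b&gt; - complement
```

A lookup array also keeps the list of supported transformations in one place, instead of repeating it in a chain of if/elseif tests.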

You may test the script live here.

Batch version

Let’s now write a version of this application able to process several FASTA sequences at the same time, in batch.

To accept multiple sequences as input, we will switch the FASTA processing function from process_fasta() to fasta_sequences_to_array(). We wrote the code for both functions in section 4-12.

The header, footer and css files remain unchanged with respect to the single sequence version. In the web form (index.php), the only change will be the name of the text-area, namely “fasta_sequence” will be changed to “fasta_sequences”. The id of the text-area and the “for” attribute of the text-area label will also be adjusted to this new value.

index.php (batch version)


        <?php echo file_get_contents("html/header.html"); ?>
        <form action="script.php" method="POST" enctype="multipart/form-data" id="revcomp_form">
            <fieldset id="data">
                <legend>Data</legend>
                <p>
                    <label for="fasta_sequences">Your DNA sequence(s). Supports FASTA sequences in batch</label><br>
                </p>
                <p>
                    <textarea name="fasta_sequences" id="fasta_sequences"></textarea>
                </p>
            </fieldset>
            <fieldset id="options">
                <legend>Options</legend>
                <p>
                    <span class="field-title" style="margin-bottom:10px;">Select transformation</span>
                </p>
                <p>
                    <input type="radio" name="transformation" value="rev" id="rev"> <label for="rev" class="radio">Reverse</label><br>
                    <input type="radio" name="transformation" value="comp" id="comp"> <label for="comp" class="radio">Complement</label><br>
                    <input type="radio" name="transformation" value="revcomp" id="revcomp" checked> <label for="revcomp" class="radio">Reverse-complement</label>
                </p>
            </fieldset>
            <input type="submit" value="Go!">
        </form>
        <?php echo file_get_contents("html/footer.html"); ?>

In the functions file we replace process_fasta() with fasta_sequences_to_array().

functions.php (batch version)


<?php
function fasta_sequences_to_array($fasta_sequences){ // Takes a variable with FASTA sequences as input
    $lines = preg_split("/\n/", $fasta_sequences); // Individual lines to an array
    $seqs_array = array(); 
    $sequence = '';
    $header_line = '';
    $i = 0;
    
    foreach($lines as $line){
        if(preg_match("/^>/",$line)){
            if($i != 0){
                $seqs_array[] = array($header_line,$sequence);
                $sequence = '';
            }
            $header_line = trim($line);
            $i++;
        }
        elseif($line != ''){
            $sequence .= strtoupper(trim($line));
        }    
    }
    $seqs_array[] = array($header_line,$sequence);   
    return $seqs_array;
}

function revcomp($sequence, $mode="revcomp"){
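    $complement_dict = array(
    "A" => "T",
    "T" => "A",
    "G" => "C",
    "C" => "G",
    "U" => "A",
    "R" => "Y",
    "Y" => "R",
    "S" => "S", // S (C or G) is its own complement
    "W" => "W", // W (A or T) is its own complement
    "K" => "M",
    "M" => "K",
    "B" => "V", // B (C, G or T) pairs with V (A, C or G)
    "V" => "B",
    "D" => "H", // D (A, G or T) pairs with H (A, C or T)
    "H" => "D",
    "N" => "N",
    "." => ".",
    "-" => "-"
    );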

    $nucleotides = str_split($sequence,1);
    // Let's compute the complement sequence first
    $complement_sequence = "";
    foreach($nucleotides as $nucleotide){
        $complement_sequence = $complement_sequence.$complement_dict[$nucleotide];
    }
    // The complement sequence is now stored in the $complement_sequence variable
 
    $revcomp_sequence = strrev($complement_sequence); // This is the reverse complement sequence
    $reverse_sequence = strrev($sequence); // This is the reverse sequence

    // We return different things depending on the $mode (second optional argument of this function)
    // if we call the function with just one argument, the value of $mode will be the default, "revcomp"
    // additional supported values for the $mode argument are "comp" and "rev", see below
    // Note that when a function returns, it also exits, no more code inside the function is executed

    if($mode == "revcomp"){
        return $revcomp_sequence;
    }
    elseif($mode == "comp"){
        return $complement_sequence;
    }
    elseif($mode == "rev"){
        return $reverse_sequence;
    }
    else{  // This part may help us in debugging code in which the function is used
        return "WARNING: revcomp mode not supported";
    }
}

function seqbreak($sequence, $brlen=80, $brel="<br>\n"){ // $brel => breaking element
    $chars = str_split($sequence, 1);
    $i = 1;
    $out = "";
    foreach($chars as $char){
        if(is_int($i/$brlen)){
            $out = $out.$char.$brel;
        }
        else{
            $out = $out.$char;
        }
        $i++;
    }
    return $out;
}
?>

And now the script. One line of its code deserves some explanation.

When we get the sequences from the web form, we use the fasta_sequences_to_array() function to convert them into an array ($seqs_array) with this structure:

[(seq1 header, seq1 sequence),(seq2 header, seq2 sequence),(seq3 header, seq3 sequence), etc…]
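To make this structure concrete, here is a minimal sketch. The function split_fasta_records() is a hypothetical, simplified stand-in for fasta_sequences_to_array() (it assumes that every sequence has a header line), and the two sequences are made-up examples.

```php
<?php
// Hypothetical, simplified stand-in for fasta_sequences_to_array()
// Assumes that every sequence is preceded by a header line starting with ">"
function split_fasta_records($fasta_sequences){
    $records = array();
    foreach(preg_split("/\r?\n/", $fasta_sequences) as $line){
        $line = trim($line);
        if($line == ""){
            continue; // Skip empty lines
        }
        if($line[0] == ">"){
            $records[] = array($line, ""); // New record: header and empty sequence
        }
        elseif(count($records) > 0){
            // Sequence line: append it to the sequence of the last record
            $records[count($records) - 1][1] .= strtoupper($line);
        }
    }
    return $records;
}

$input = ">seq1\nATGC\natgc\n>seq2\nGGCC\n";
print_r(split_fasta_records($input));
```

For this input, the result is an array with two elements: (">seq1", "ATGCATGC") and (">seq2", "GGCC").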

In one line of code, we transfer this information to a second array, in which both the headers and the sequences are modified.

More specifically, we append to each header “ - ” followed by a text ($t_txt) that reads “reverse”, “complement”, or “reverse-complement”, depending on the transformation selected by the user.

The sequences themselves are also modified according to the selected transformation, and HTML tags are added to them. In particular, the seqbreak() function adds a break tag every 80 nucleotides, and the whole sequence is embedded within a span tag with a “sequence” class, for which the CSS file sets font-family:courier.

All of this is done in a single line of code within a foreach cycle:


foreach($seqs_array as $seq_array){
    $seqs_array_t[] = array($seq_array[0]." - $t_txt", "<span class=\"sequence\">".seqbreak(revcomp(strtoupper($seq_array[1]), $transformation))."</span>");   
}

To understand this line, it is best to read it from right to left, starting with the innermost function call:

The sequence ($seq_array[1]) is converted to uppercase, the only character set that the revcomp() function understands.
This uppercase sequence is passed as an argument to revcomp(), together with the selected transformation type ($transformation).
seqbreak() adds a break tag every 80 nucleotides to the sequence returned by revcomp().
The resulting sequence is then embedded within a span tag.
The text describing the transformation is appended to the header ($seq_array[0]).
The transformed header and the transformed, tagged sequence become the first and second elements of an array.
This two-element array is added to the transformed sequences array, $seqs_array_t.
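The steps above can also be written one per line, which may make the flow easier to follow. In this sketch, revcomp_standin() and seqbreak_standin() are simplified, hypothetical stand-ins for revcomp() and seqbreak() (plain A, C, G, T sequences and the reverse-complement transformation are assumed), and the input pair is a made-up example.

```php
<?php
// Simplified stand-ins for revcomp() and seqbreak(), for illustration only
function revcomp_standin($sequence){
    return strrev(strtr($sequence, "ACGT", "TGCA"));
}
function seqbreak_standin($sequence, $brlen=80, $brel="<br>\n"){
    return implode($brel, str_split($sequence, $brlen));
}

$seq_array = array(">seq1", "atgccg"); // One (header, sequence) pair, made up for this example
$t_txt = "reverse-complement";

$upper       = strtoupper($seq_array[1]);                     // 1. Uppercase: ATGCCG
$transformed = revcomp_standin($upper);                       // 2. Transformation: CGGCAT
$broken      = seqbreak_standin($transformed);                // 3. Break element every 80 nucleotides
$tagged      = "<span class=\"sequence\">".$broken."</span>"; // 4. Span tag
$header_t    = $seq_array[0]." - ".$t_txt;                    // 5. Header with the transformation text
$pair        = array($header_t, $tagged);                     // 6. Two-element array

$seqs_array_t = array();
$seqs_array_t[] = $pair;                                      // 7. Added to the transformed sequences array

print_r($seqs_array_t);
```

The single-line version in the script does exactly the same work; it simply nests the function calls instead of storing each intermediate result in a variable.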

The comments in the script below also describe this flow.

script.php (batch version)


<?php
include("include/functions.php"); // We include the functions.php file
$fasta_sequences = $_POST["fasta_sequences"]; // and grab the sequences submitted by the user through the web form

$transformation = $_POST["transformation"]; // This will be revcomp (web form default), rev or comp

$seqs_array = fasta_sequences_to_array($fasta_sequences);

$seqs_array_t = array(); // An empty array where the transformed sequences will be stored. t stands for "transformed"

// Let's have some proper text values at hand to replace comp, revcomp and rev
// in the user output
if($transformation == "revcomp"){
    $t_txt = "reverse-complement"; 
}
elseif($transformation == "rev"){
    $t_txt = "reverse";    
}
elseif($transformation == "comp"){
    $t_txt = "complement";    
}

foreach($seqs_array as $seq_array){ // We transfer the sequence information from the original array to the transformed array.
    // While we are at it, we convert each sequence to uppercase, apply the transformation, add a break tag every 80 nucleotides
    // and embed everything in a span tag with the class "sequence", so that the sequences are shown in the Courier monospace font
    // Note how all of this is done sequentially in a single line of code
    $seqs_array_t[] = array($seq_array[0]." - $t_txt", "<span class=\"sequence\">".seqbreak(revcomp(strtoupper($seq_array[1]), $transformation))."</span>");   
}
 
 
echo file_get_contents("html/header.html"); // Writing the header HTML to the output page
 
echo "<h2>Transformed sequence(s)</h2>\n";
echo "<p><strong>Selected transformation:</strong> $t_txt</p>";

foreach($seqs_array_t as $seq_array_t){
    $header_t = $seq_array_t[0]; // Header transformed (_t)
    $seq_t = $seq_array_t[1]; // Sequence transformed (and also properly HTML tagged, we did that in the previous foreach cycle)
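    // The header comes from the user, so we escape it before writing it to the HTML page
    echo "<p>".htmlspecialchars($header_t)."<br>$seq_t</p>";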
}
 
echo file_get_contents("html/footer.html"); // Writing the footer HTML to the output page
?>

The reverse-complement web application output, batch version

You may test the script live here.
