5-1: The color sequence web application

In this section we will build a web application that colors the amino-acids in a protein sequence according to their nonpolar, polar, basic or acidic nature (see figure 4-7-2). The application will also provide some simple statistics on how the sequence is composed with respect to those classes of amino-acids. These data will be represented graphically with an HTML-based graph.

The web form will be very simple, with just a text-area for the sequence field and a submit button.

The web form for the color sequence script with a sample sequence in the text-area

The PHP code that classifies the amino-acids in the sequence will be based on the amino-acids classification script we saw in section 4-7.

The script will support sequences in FASTA format. To handle FASTA sequences we will use the process_fasta() function that we wrote in section 4-12.
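The process_fasta() function itself is not reproduced in this section. As a rough idea of what handling FASTA input involves, here is a minimal sketch under a hypothetical name, fasta_to_sequence_sketch(). It assumes a single-record input and simply drops the header line and joins the remaining lines; the actual function from section 4-12 may have a different signature and behavior.

```php
<?php
// Hypothetical stand-in for process_fasta(); the real function from
// section 4-12 may differ in name, signature and return value.
function fasta_to_sequence_sketch(string $input): string
{
    $sequence = '';
    // Split on any newline style (Unix, Windows, old Mac).
    foreach (preg_split('/\R/', trim($input)) as $line) {
        $line = trim($line);
        // Skip empty lines and the FASTA header line starting with ">".
        if ($line === '' || $line[0] === '>') {
            continue;
        }
        // Drop whitespace inside the line and normalize to upper case.
        $sequence .= strtoupper(preg_replace('/\s+/', '', $line));
    }
    return $sequence;
}

$fasta = ">sample protein\nMKTAY\niakqr\n";
echo fasta_to_sequence_sketch($fasta), "\n"; // prints MKTAYIAKQR
```

Plain sequences without a header line pass through unchanged (apart from the upper-casing), so the same call can be used whether or not the user pastes FASTA.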

This is what the output of the script will look like:

The output of the color sequence script

Coloring amino-acids

In order to color amino-acids in the script output we embed each amino-acid letter in a span tag with the appropriate class “nonpolar”, “polar”, “basic” or “acidic”:
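As an illustration, wrapping a residue in a span could look like the sketch below. The function names aa_class_sketch() and color_residue() are hypothetical, and the grouping of the twenty standard amino-acids is a common textbook one that is assumed here; the classification script from section 4-7 may group some residues differently. Unknown characters get the “undefined” class.

```php
<?php
// Return the class name for one residue, or "undefined" if unknown.
// The grouping below is an assumption, not taken from section 4-7.
function aa_class_sketch(string $residue): string
{
    $class_of = [
        'G' => 'nonpolar', 'A' => 'nonpolar', 'V' => 'nonpolar',
        'L' => 'nonpolar', 'I' => 'nonpolar', 'M' => 'nonpolar',
        'F' => 'nonpolar', 'W' => 'nonpolar', 'P' => 'nonpolar',
        'S' => 'polar', 'T' => 'polar', 'C' => 'polar',
        'Y' => 'polar', 'N' => 'polar', 'Q' => 'polar',
        'K' => 'basic', 'R' => 'basic', 'H' => 'basic',
        'D' => 'acidic', 'E' => 'acidic',
    ];
    return $class_of[strtoupper($residue)] ?? 'undefined';
}

// Wrap one residue in a span carrying its class name.
function color_residue(string $residue): string
{
    return '<span class="' . aa_class_sketch($residue) . '">'
        . htmlspecialchars($residue) . '</span>';
}

echo color_residue('K'), "\n"; // <span class="basic">K</span>
```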

In the stylesheet we assign a color to the class:

Counting amino-acids

To count the amino-acids in each class we define four counters, one for each class, before we start cycling through the amino-acids array. We also use an $i counter to count the total number of residues and keep track of where we stand during the cycle. $i starts at 1 while the “classes counters” start at 0.

Each time we find an amino-acid belonging to a particular class during the foreach cycle that iterates through the amino-acids array, the respective counter is incremented by one:
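The counting logic described above can be sketched as follows. This is only an illustration: the function name count_classes_sketch() is hypothetical, the class membership lists are assumed, and the total is derived as $i - 1 because $i starts at 1 and therefore ends one past the last residue.

```php
<?php
function count_classes_sketch(string $sequence): array
{
    $nonpolar = $polar = $basic = $acidic = 0; // class counters start at 0
    $i = 1;                                     // position counter starts at 1
    // Guard the empty case so no empty "residue" is ever iterated.
    $amino_acids = $sequence === '' ? [] : str_split(strtoupper($sequence));
    foreach ($amino_acids as $aa) {
        // Assumed class membership; section 4-7 may differ.
        if (in_array($aa, ['G', 'A', 'V', 'L', 'I', 'M', 'F', 'W', 'P'], true)) {
            $nonpolar++;
        } elseif (in_array($aa, ['S', 'T', 'C', 'Y', 'N', 'Q'], true)) {
            $polar++;
        } elseif (in_array($aa, ['K', 'R', 'H'], true)) {
            $basic++;
        } elseif (in_array($aa, ['D', 'E'], true)) {
            $acidic++;
        }
        $i++;
    }
    return [
        'nonpolar' => $nonpolar,
        'polar'    => $polar,
        'basic'    => $basic,
        'acidic'   => $acidic,
        'total'    => $i - 1, // $i ends one past the last residue
    ];
}

print_r(count_classes_sketch('MKTAYDE'));
```

For the sample string MKTAYDE this gives 2 nonpolar, 2 polar, 1 basic and 2 acidic residues out of 7.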

Once we have the total number of amino-acids and the number for each class, we can calculate percentages:

We limit the number of digits in the float resulting from the division with the PHP round() function which, as the name implies, rounds a float to a given number of decimal digits. round() takes the float as its first argument and the desired number of decimal digits as its second. If no second argument is provided, the default of zero digits is used and the float is rounded to the nearest whole number.
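A minimal sketch of the percentage calculation with round() is shown below. The helper name class_percentage() and the choice of two decimal digits are assumptions made for the example, as is the guard against an empty sequence.

```php
<?php
// Percentage of $count over $total, rounded to $digits decimal digits.
function class_percentage(int $count, int $total, int $digits = 2): float
{
    if ($total === 0) {
        return 0.0; // avoid division by zero on an empty sequence
    }
    return round($count / $total * 100, $digits);
}

echo class_percentage(2, 7), "\n";    // 28.57
echo class_percentage(2, 7, 0), "\n"; // 29
echo round(28.5714), "\n";            // 29 (no second argument: zero digits)
```

Note that round() still returns a float when called with zero digits; the value is simply a whole number.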

Building a graph in HTML

To build a graph in HTML we use a description list, the dl tag.

The general syntax of a description list is as follows. In this example we have a description list with 3 items (Item 1, Item 2, Item 3) and their respective descriptions (Description 1, Description 2, Description 3):

This code as it is will yield the following (what you see below may be influenced by this website’s style sheet, try it on a page yourself):

Item 1
Description 1
Item 2
Description 2
Item 3
Description 3
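A description list like the one above can also be generated from PHP. The sketch below uses a hypothetical description_list() helper that turns an associative array into dl, dt and dd tags; it is an illustration only and not part of the application code.

```php
<?php
// Build a description list from an array of item => description pairs.
function description_list(array $items): string
{
    $html = "<dl>\n";
    foreach ($items as $term => $description) {
        // dt holds the item, dd holds its description.
        $html .= '  <dt>' . htmlspecialchars((string) $term) . "</dt>\n";
        $html .= '  <dd>' . htmlspecialchars((string) $description) . "</dd>\n";
    }
    return $html . "</dl>\n";
}

echo description_list([
    'Item 1' => 'Description 1',
    'Item 2' => 'Description 2',
    'Item 3' => 'Description 3',
]);
```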

In our graph, the items (dt) are the names of the amino-acids classes, set to float:left in the CSS, and the descriptions (dd) contain divs. The script sets the width of each div to the percentage of amino-acids in the corresponding class. The background-color of each div matches the color we selected for that amino-acids class: we assign the div a “bar-nonpolar”, “bar-polar”, “bar-basic” or “bar-acidic” class (check out the CSS stylesheet below).

This is an example of the resulting markup with a sample sequence. Again, mind that for the graph to look like the graph we get by running the script, CSS definitions for the various elements (dl, dd, dt and the bar classes assigned to the divs) are essential:
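One way to produce this kind of markup from the computed percentages is sketched below. The function name class_graph(), the inline style used for the width, and the percentage text printed inside each bar are assumptions made for the example; only the “bar-” class names come from the description above.

```php
<?php
// Build the bar graph as a description list. $percentages maps a class
// name (e.g. "polar") to its percentage of the sequence.
function class_graph(array $percentages): string
{
    $html = "<dl>\n";
    foreach ($percentages as $class => $percent) {
        $label = htmlspecialchars((string) $class);
        $width = (float) $percent;
        $html .= "  <dt>$label</dt>\n";
        // The div width is the class percentage; its color comes from
        // the "bar-..." class defined in the stylesheet.
        $html .= "  <dd><div class=\"bar-$label\" style=\"width: {$width}%\">"
            . "{$width}%</div></dd>\n";
    }
    return $html . "</dl>\n";
}

echo class_graph([
    'nonpolar' => 28.57,
    'polar'    => 28.57,
    'basic'    => 14.29,
    'acidic'   => 28.57,
]);
```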

The script code

As with all the web applications in this book, the code for this application will be distributed across several files. The file structure will be similar to the one used for the code in the previous section, with the addition of an include folder containing a functions.php file where the process_fasta() function will be stored.

color_sequence
    index.php
    script.php
    html
        header.html
        footer.html
    css
        style.css
    include
        functions.php

header.html

footer.html

index.php

In the CSS file, with respect to the example in the previous section, we add four classes (nonpolar, polar, basic, acidic) for the four types of amino-acids and assign a different color to each class. We add an “undefined” class, for unexpected characters in the sequence, and a “sequence” class with font-family:courier that we will apply to the whole output sequence. We also add some styles for the description list elements (dl, dt and dd) and the divs (“bar” classes) used to generate the graph.

For more advanced graph generation with description list tags and CSS check out this tutorial on htmlgoodies.com.

For an uber-cool dynamic graph generated with JavaScript and the HTML5 canvas element, check out this page on the williammalone.com web site.

style.css

functions.php

script.php

You may test the script live here.


