The Bio-Web: Resources for Molecular and Cell Biologists

The Bio-Web: Molecular and Cell Biology and Bioinformatics news, tools, books, resources and web applications development

DNA Protein sequence cleaner

To clean your DNA, RNA or protein sequence properly, we need to know which alphabet the sequence uses. For instance, "N" will be stripped out if you select a strict DNA alphabet, but it will remain if you select an IUPAC ambiguous alphabet, where N exists and means "any nucleotide". It will also remain if you select a protein alphabet, where N means asparagine. Any character that does not belong to any DNA, RNA or protein alphabet, such as punctuation, spaces, symbols and numbers, will always be removed.

This application supports degenerate/ambiguous IUPAC characters.
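The alphabet-based filtering described above can be sketched in a few lines of Python. This is only an illustration, not the application's actual code: the names `ALPHABETS` and `clean_sequence` are hypothetical, and the character sets are assumptions based on the standard IUPAC nucleotide codes and the 20 standard amino acids, since this page does not list the exact alphabets it uses.

```python
# Hypothetical alphabets (assumed, not taken from the application).
ALPHABETS = {
    "dna_strict": "ACGT",
    "rna_strict": "ACGU",
    "dna_iupac": "ACGTRYSWKMBDHVN",   # IUPAC ambiguity codes, N = any nucleotide
    "rna_iupac": "ACGURYSWKMBDHVN",
    "protein": "ACDEFGHIKLMNPQRSTVWY",  # 20 standard amino acids, N = asparagine
}


def clean_sequence(raw, alphabet):
    """Keep only characters that belong to the chosen alphabet.

    Matching is case-insensitive and the original case is preserved.
    Spaces, digits, punctuation and any other symbols are dropped.
    """
    allowed = set(ALPHABETS[alphabet])
    return "".join(ch for ch in raw if ch.upper() in allowed)


# "N" is removed under the strict DNA alphabet but kept under the others.
print(clean_sequence("ACGN TT-12", "dna_strict"))  # ACGTT
print(clean_sequence("ACGN TT-12", "dna_iupac"))   # ACGNTT
print(clean_sequence("ACGN TT-12", "protein"))     # ACGNTT
```

Using a set for the allowed characters keeps each lookup constant-time, which matters only for very long sequences but costs nothing here.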

ALPHABET SELECTION

Please select your sequence alphabet below

OUTPUT OPTIONS

Please select one of the options below for your output

UPPER CASE    lower case

Include numbering and line breaks every N nucleotides/residues (N = 0 means no formatting)
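As a rough sketch of what these output options might do, the Python function below applies the case choice and then breaks the sequence into numbered lines. The name `format_sequence`, its parameters, and the numbering style (each line prefixed with the 1-based position of its first character) are assumptions made for illustration; the application's real output layout may differ.

```python
def format_sequence(seq, width=0, upper=True):
    """Apply the case option, then optionally add numbering and line breaks.

    width is the number of nucleotides/residues per line; 0 (or less)
    returns the sequence as a single unformatted string.
    """
    seq = seq.upper() if upper else seq.lower()
    if width <= 0:
        return seq
    lines = []
    for start in range(0, len(seq), width):
        # Assumed layout: 1-based start position, a space, then the chunk.
        lines.append(f"{start + 1} {seq[start:start + width]}")
    return "\n".join(lines)


print(format_sequence("acgtacgtac", width=4))
# 1 ACGT
# 5 ACGT
# 9 AC
```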

INPUT SEQUENCE

Paste your DNA, RNA or protein sequence below in any format


A web application written in Python by Andrea Cabibbo