Rather than leaving them on my hard disk to gather dust, I thought I should start publishing the bioinformatics scripts that I’ve written over the past few years of my PhD.
The first to go up is a Perl script called “Genome RE Sites” - it searches a genome of your choice for a restriction endonuclease recognition site and outputs the co-ordinates of all cut sites.
[Update]: You can find an online version of this tool here.
I use a technique called Chromosome Conformation Capture, which uses restriction enzymes, so I frequently need to generate a list of cut sites to help me analyse data.
This script needs to be run on the command line, either on a Linux system or using something like ActivePerl on Windows. Usage is:
perl genome_RE_sites.pl [output file] [search string]
where [output file] is the name of the file that will hold your results and [search string] is the restriction enzyme recognition site. The search string is optional; if you leave it out, the script defaults to HindIII (AAGCTT). There are a few other configuration options that you’ll need to edit in the script before using it, such as the location of your downloaded genome sequences and the chromosome names. Please leave a comment if you have any problems with these.
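If you just want to see the basic idea before downloading anything, here is a minimal sketch of the core search step. It is not lifted from genome_RE_sites.pl: the helper name find_sites and the toy sequence are made up for illustration, and I'm assuming here that a "co-ordinate" means the 1-based start position of the recognition site rather than the exact position of the cut within it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of genome_RE_sites.pl).
# Returns the 1-based start position of every occurrence of $site in $seq.
sub find_sites {
    my ($seq, $site) = @_;
    return () unless defined $site && length $site;  # an empty site would loop forever

    # Compare in upper case so soft-masked (lower case) sequence still matches
    $seq  = uc $seq;
    $site = uc $site;

    my @positions;
    my $offset = 0;
    while ((my $hit = index($seq, $site, $offset)) != -1) {
        push @positions, $hit + 1;   # convert 0-based index to 1-based co-ordinate
        $offset = $hit + 1;          # advance by one base so overlapping sites are kept
    }
    return @positions;
}

# Fall back to HindIII when no search string is supplied, as described above
my $site = 'AAGCTT';

# Toy sequence standing in for a chromosome (an assumption for this example)
my $seq = 'ggAAGCTTccaagcttAAGCTT';

print join(',', find_sites($seq, $site)), "\n";
```

A real run would read each chromosome's sequence from file and write the positions to the output file instead of printing them. Using index rather than a regular expression keeps the sketch simple, but it only works for recognition sites without ambiguous bases.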
Download the code here, or copy and paste from below: