Blixem - a graphical blast viewer


Blixem, which stands for "BLast matches In an X-windows Embedded Multiple alignment", is an interactive browser of pairwise Blast matches that have been stacked up in a "master-slave" multiple alignment. It is thus not a 'true' multiple alignment of the kind produced by, e.g., Clustalw, but a 'one-to-many' alignment. (For an interactive browser of true alignments, see Belvu.)


Installing and running Blixem

Blixem does not read Blast output directly, but uses a filtering program called MSPcrunch to parse the Blast output. The filter and viewer functions were decoupled so that MSPcrunch can be used with viewers other than Blixem, and so that Blixem can view data from sources other than MSPcrunch. The flow of data is:

BLAST -> MSPcrunch -> Blixem

Note that MSPcrunch can also produce output in a format suitable for ACEDB. ACEDB can then later call Blixem with this stored data.

See the file '=README' in the FTP distribution for instructions on installing Blixem and MSPcrunch. Both programs are available in the same FTP directory.

On the command line, you can run MSPcrunch and Blixem in one go by typing:

MSPcrunch -q < blast_output > | blixem < query_sequence > -

An easier way is to use the bundled script 'blx' and type:

blx < query_sequence > < blast_output >
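
The same pipeline can also be driven from a script. The following is a minimal Python sketch, assuming both programs are on the PATH. It uses only the arguments shown in the command line above; the function names are illustrative and not part of Blixem or MSPcrunch.

```python
import subprocess


def mspcrunch_cmd(blast_output):
    """Command that filters a Blast output file, as in the command line above."""
    return ["MSPcrunch", "-q", blast_output]


def blixem_cmd(query_sequence):
    """Command that starts Blixem.

    The trailing '-' is copied from the command line above; it is assumed
    here to mean that the matches are read from standard input.
    """
    return ["blixem", query_sequence, "-"]


def run_pipeline(query_sequence, blast_output):
    """Run MSPcrunch, pipe its output into Blixem, return Blixem's exit code."""
    crunch = subprocess.Popen(mspcrunch_cmd(blast_output), stdout=subprocess.PIPE)
    viewer = subprocess.Popen(blixem_cmd(query_sequence), stdin=crunch.stdout)
    crunch.stdout.close()  # let MSPcrunch notice a closed pipe if Blixem exits early
    exit_code = viewer.wait()
    crunch.wait()
    return exit_code
```

Calling run_pipeline("query.seq", "query.blast") would then behave like the one-line pipe above, with the two file names being placeholders.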

Blixem uses the ACEDB GUI system. Instead of a toolbar, there is a 'main menu', which is activated by pressing the right mouse button anywhere in the window. When you first start Blixem, it is a good idea to press the "Help" button and read what the function of each mouse button is.


Description of Blixem's display panels

Blixem contains two main displays: the bottom display panel shows the actual alignment of proteins to the genomic DNA sequence and the top display shows the "Big Picture" of matches around the alignment window.


Fetching database annotation in Blixem

Annotation of a protein is fetched by double clicking on the sequence of interest in the bottom display. A program 'efetch' will then retrieve the record from an external database. If this doesn't work, either efetch is not installed or the database itself has not been set up for efetch.

Using sequence retrieval tools other than efetch

If you want to use your own in-house retrieval system, you can write a wrapper script that simulates efetch. Blixem calls efetch in two different modes:

1. To get the sequence only: "efetch -q seqname".
2. To get the annotation (after a double click): "efetch seqname".

Your efetch wrapper script must therefore check for the -q option. If it is present, the script should return the raw sequence on one line only. If it is not, the script should return the annotation as raw text on multiple lines.
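
A minimal sketch of such a wrapper in Python is shown here. The RECORDS table is a made-up stand-in for an in-house retrieval system, and the record name, its contents and the non-zero exit status for unknown names are all assumptions made for illustration. Only the handling of the -q option follows the description above.

```python
#!/usr/bin/env python3
import sys

# Hypothetical stand-in for an in-house database: name -> (annotation, sequence).
RECORDS = {
    "EXAMPLE_1": (
        "ID   EXAMPLE_1\nDE   Made-up record for illustration.",
        "MKTAYIAKQR\nQISFVKSHFS",
    ),
}


def efetch(args):
    """Return the text the wrapper should print, or None for an unknown name."""
    sequence_only = "-q" in args
    names = [a for a in args if a != "-q"]
    if len(names) != 1 or names[0] not in RECORDS:
        return None
    annotation, sequence = RECORDS[names[0]]
    if sequence_only:
        # Mode 1 ("efetch -q seqname"): raw sequence on one line only.
        return "".join(sequence.split())
    # Mode 2 ("efetch seqname"): annotation as raw multi-line text.
    return annotation


def main(argv):
    """Print the result and return an exit status (non-zero is an assumption)."""
    output = efetch(argv)
    if output is None:
        return 1
    print(output)
    return 0

# To install this as the efetch command, end the script with:
#     sys.exit(main(sys.argv[1:]))
```

The lookup is kept in one function so that the dictionary can be replaced by a call to the real retrieval system without touching the option handling.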


Blixem's main menu:

(Press right mouse button anywhere)


Quit			Exit program.

Help			Get brief help.

Print			Print the currently displayed window.

Print whole alignment	Print all matches. May produce many pages.

Change Settings		Start the Settings tool.


Dotter			Do a dotter dotplot with last picked matching sequence.

Dotter HSPs only	Display blast matches in a Dotter dotplot.

Dotter query vs.
itself			Call Dotter for a region of the query vs. itself.
			This is useful to analyse internal repeats etc.

Manual Dotter 
parameters		Use if Dotter estimates the start and end coordinates wrongly.

Automatic Dotter
parameters	 	To cancel the Manual mode.

Highlight sequences 
by name 		Define a template, e.g. *human, to highlight
			all human proteins.  This works on the name
			field in the leftmost column.  Note that the
			names start with a database prefix that is
			hidden in blixem, so it's a good idea to
			always start your template with the wildcard *.

Clear highlighted 
sequences		Reset all picked and highlighted sequences.
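
The advice under "Highlight sequences by name" to start a template with the wildcard * can be illustrated with a short Python sketch. It assumes that templates behave like shell-style wildcards and that matching ignores case; neither is confirmed here. The names and their "SW:" and "TR:" prefixes are invented examples of a hidden database prefix.

```python
from fnmatch import fnmatchcase

# Hypothetical full names, including the database prefix that Blixem hides.
NAMES = ["SW:ALBU_HUMAN", "SW:ALBU_MOUSE", "TR:Q12345_HUMAN"]


def highlighted(template, names=NAMES):
    """Return the names matched by a shell-style template, ignoring case.

    Both the wildcard syntax and the case-insensitivity are assumptions.
    """
    return [n for n in names if fnmatchcase(n.lower(), template.lower())]
```

With these names, the template *human selects both human entries, whereas ALBU* selects nothing, because the hidden prefix comes before "ALBU". The template *ALBU* selects both albumin entries.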


Blixem's settings menu / settings tool:

(press right mouse button pull-down on "Settings", or click once on it with left mouse button)

Sort by score		Sort all proteins with the highest-scoring first.

Sort by identity	Sort all proteins with the most identical first.

Sort by name		Sort all proteins alphabetically.

Sort by position	Sort all proteins with the most N-terminal first.


Big Picture		Toggle Big Picture (top display) on/off.

Big Picture Other 
Strand 			Toggle between single and double
			strand display in the Big Picture.

Highlight differences 	Shows identical residues as a dot (.) and
			draws mismatching residues in bright blue.

Squash matches		Forces all matches to one sequence onto one line.
			The start of a match is marked by a red vertical bar.

SEG low complexity
analysis 		Draws plots of low complexity at 3 different window sizes.
			These window sizes by default correspond to SEG's (Wootton)
			Stringent, Medium and Non-globular levels of low complexity.


Print colours		Good for printing on black/white printers.

No colours		Good if you want to colour the printout by hand.

Verbose mode		Toggles verbose mode and prints all HSPs to stdout.

(The Settings tool also allows other settings to be changed, such as colours and efetch method.)
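
The effect of "Highlight differences" on one row of the alignment can be sketched in a few lines of Python. This is an illustration of the display rule described above, not Blixem's own code; the function name is made up, and the blue colouring of mismatches is not modelled.

```python
def highlight_differences(reference, match):
    """Replace residues identical to the reference with '.', keep mismatches."""
    if len(reference) != len(match):
        raise ValueError("aligned rows must have the same length")
    return "".join("." if m == r else m for r, m in zip(reference, match))
```

For a reference row MKTAYIAK and a matching row MKSAYLAK, this returns ..S..L.. so that only the two mismatching residues remain visible.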


  blixem seq foo &

References

Sonnhammer, ELL and Durbin, R (1994). "A workbench for Large Scale Sequence Homology Analysis". Comput. Applic. Biosci. 10:301-307.

Sonnhammer, ELL and Durbin, R (1994). "An expert system for processing sequence homology data". ISMB 2:363-368.