Lecture for ChIP-seq: Motif Analysis

CmhaDSO ChIP-seq

Go to MEME Suite
Go to STREME
Go to SEA
Go to CentriMo
Go to FIMO
Go to Tomtom
Go to XSTREME and MEME-ChIP
Go to HOMER
Go to Motif Database
Go to References

MEME Suite

Motif Discovery
- STREME
Motif Enrichment
- SEA
- CentriMo
Motif Scanning
- FIMO
Motif Comparison
- Tomtom

MEME format is described here. Minimal example is available from here.
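
As a rough illustration of the minimal MEME format, the sketch below formats a toy position-probability matrix as MEME-format text. The function name, the toy motif, the uniform background and the `nsites`/`E` header values are all assumptions made for illustration; check the format description linked above before using such a file with real tools.

```python
def to_minimal_meme(name, rows, nsites=20):
    """Format a DNA motif as minimal MEME-format text.

    rows: one [A, C, G, T] probability list per motif position.
    nsites: illustrative site count written to the header (an assumption).
    """
    for row in rows:
        if len(row) != 4 or abs(sum(row) - 1.0) > 1e-6:
            raise ValueError("each row needs 4 probabilities summing to 1")
    lines = [
        "MEME version 4",
        "",
        "ALPHABET= ACGT",
        "",
        "strands: + -",
        "",
        "Background letter frequencies",
        "A 0.25 C 0.25 G 0.25 T 0.25",  # uniform background, assumed
        "",
        f"MOTIF {name}",
        f"letter-probability matrix: alength= 4 w= {len(rows)} nsites= {nsites} E= 0",
    ]
    lines += [" ".join(f"{p:.6f}" for p in row) for row in rows]
    return "\n".join(lines) + "\n"


# Toy two-position motif (hypothetical): mostly A, then mostly G.
example = to_minimal_meme("toy_motif", [[0.7, 0.1, 0.1, 0.1], [0.1, 0.1, 0.7, 0.1]])
print(example)
```

Writing `example` to a file such as `toy.meme` (hypothetical name) gives a small file for experimenting with the tools below.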

STREME

STREME discovers ungapped motifs (recurring, fixed-length patterns) that are enriched in your sequences or relatively enriched in them compared to your control sequences.

NOTE: If you have fewer than 50 sequences, you might want to use MEME instead of STREME.
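
To apply this rule of thumb, you first need the number of sequences in your multi-FASTA file. A minimal sketch, assuming one `>` header line per sequence; the function names and the toy input are hypothetical.

```python
def count_fasta_records(text):
    """Count sequences in multi-FASTA text (one '>' header per record)."""
    return sum(1 for line in text.splitlines() if line.startswith(">"))


def suggest_discovery_tool(n_sequences, threshold=50):
    """Rule of thumb from the note: fewer than 50 sequences suggests MEME."""
    return "MEME" if n_sequences < threshold else "STREME"


toy_fasta = ">seq1\nACGTACGT\n>seq2\nGGGCGGGG\n"
n = count_fasta_records(toy_fasta)
print(n, suggest_discovery_tool(n))
```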

COMMAND LINE: Get sample data ("Klf1.fna"; multi-fasta format) from here.

  streme --dna --text --p Klf1.fna > streme.txt

streme.txt (output) is a MEME format text file. This file can be used directly as input to other MEME Suite programs.

The E-value is an accurate estimate of the statistical significance of the motif as long as the length distributions of the positive and negative sequences are essentially the same. The E-value is the p-value multiplied by the number of motifs reported by STREME. It is an estimate of the number of motifs that would be found with enrichment as high as this motif in shuffled versions of your positive sequences.
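
The arithmetic behind the E-value can be written out directly. This is only a worked example of the definition given above (p-value times the number of reported motifs); the numbers are made up.

```python
def streme_style_evalue(p_value, n_motifs_reported):
    """E-value as defined in the text: p-value times the number of reported motifs."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    return p_value * n_motifs_reported


# Hypothetical case: a motif with p-value 1e-6 in a run that reported 8 motifs.
e_value = streme_style_evalue(1e-6, 8)
print(e_value)
```

A motif with a very small p-value can still have an unimpressive E-value when many motifs are reported, which is why the E-value is the number to look at.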

FOR MORE DETAIL: visit web site and paper.

SEA

SEA (Simple Enrichment Analysis) identifies known or user-provided motifs that are relatively enriched in your sequences compared with shuffled sequences or your control sequences.

SEA applies the same objective function used by the STREME motif discovery algorithm to measure the enrichment of motifs.
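
To get a feel for what "enriched compared with control sequences" means, here is a one-sided Fisher's exact test on the number of sequences that contain a motif match. This is a generic enrichment test used as an illustration, not SEA's implementation; the function name and the counts are hypothetical.

```python
from math import comb


def fisher_enrichment_pvalue(pos_hits, pos_total, neg_hits, neg_total):
    """One-sided Fisher's exact test.

    Probability of seeing at least pos_hits motif-containing sequences among
    the positives if hits were spread at random over positives and controls.
    """
    total = pos_total + neg_total
    hits = pos_hits + neg_hits
    denom = comb(total, pos_total)
    upper = min(hits, pos_total)
    tail = sum(
        comb(hits, k) * comb(total - hits, pos_total - k)
        for k in range(pos_hits, upper + 1)
    )
    return tail / denom


# Hypothetical counts: motif found in 8 of 10 peaks but only 2 of 10 controls.
p = fisher_enrichment_pvalue(8, 10, 2, 10)
print(p)
```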

COMMAND LINE: Get sample data ("Klf1.fna"; multi-fasta format) from here and Motif data (MEME format) from Motif database.

  sea --text --p Klf1.fna --m MOTIF1.meme --m MOTIF2.meme ... > sea.tsv

FOR MORE DETAIL: visit web site and paper.

CentriMo

CentriMo identifies known or user-provided motifs that show a significant preference for particular locations in your sequences. CentriMo can also show if the local enrichment is significant relative to control sequences.
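
The idea of positional preference can be sketched with a single central window and a binomial tail probability. This is a simplified illustration, not CentriMo's algorithm: it assumes equal-length sequences, one best site per sequence, a uniform null over positions, and a single fixed window. The input offsets are made up.

```python
from math import comb


def central_enrichment_pvalue(best_site_offsets, seq_len, window):
    """Binomial tail probability that this many best sites land in the central window.

    best_site_offsets: distance of each sequence's best motif site from the
    sequence centre (hypothetical input, one value per sequence).
    Returns (number of sites inside the window, tail probability).
    """
    n = len(best_site_offsets)
    k = sum(1 for d in best_site_offsets if abs(d) <= window / 2)
    p_null = min(1.0, window / seq_len)  # chance of landing in the window at random
    tail = sum(
        comb(n, i) * p_null**i * (1 - p_null) ** (n - i) for i in range(k, n + 1)
    )
    return k, tail


# Ten hypothetical 500-letter peaks; nine best sites fall within 50 letters of the centre.
offsets = [-5, 3, 10, -40, 2, 200, -8, 0, 15, -20]
k, tail = central_enrichment_pvalue(offsets, seq_len=500, window=100)
print(k, tail)
```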

FOR MORE DETAIL: visit web site and paper.

FIMO

FIMO (Find Individual Motif Occurrences) scans a set of sequences for individual matches to each of the motifs you provide.

The program searches a set of sequences for occurrences of known motifs, treating each motif independently. Motifs must be in MEME Motif Format.
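
Motif scanning can be pictured as sliding the motif along each sequence and scoring every window. The sketch below does this with a log-odds score against a uniform background. It is an illustration only: it scans a single strand, computes no p-values, and expects only A, C, G, T with nonzero probabilities. The toy motif and sequence are hypothetical.

```python
from math import log2


def scan_sequence(seq, pwm, background=0.25, threshold=0.0):
    """Report windows of seq whose log-odds score exceeds threshold.

    pwm: list of dicts mapping 'A', 'C', 'G', 'T' to nonzero probabilities.
    Returns a list of (start, matched text, score) tuples.
    """
    width = len(pwm)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        score = sum(log2(pwm[i][base] / background) for i, base in enumerate(window))
        if score > threshold:
            hits.append((start, window, round(score, 3)))
    return hits


# Toy motif that prefers "GGC".
toy_pwm = [
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    {"A": 0.05, "C": 0.85, "G": 0.05, "T": 0.05},
]
hits = scan_sequence("ATGGCATGGCA", toy_pwm)
print(hits)
```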

FOR MORE DETAIL: visit web site and paper.

Tomtom

Tomtom compares one or more motifs against a database of known motifs (e.g., JASPAR). Tomtom will rank the motifs in the database and produce an alignment for each significant match.
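
Comparing two motifs amounts to aligning their columns and scoring how similar the aligned columns are. The sketch below tries every ungapped offset at which the query fits inside the target and averages the column-wise Pearson correlation. It is a simplified illustration, not Tomtom's scoring or statistics; the function names and toy motifs are hypothetical.

```python
from math import sqrt


def column_pearson(a, b):
    """Pearson correlation between two probability columns."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat column carries no information
    return cov / sqrt(var_a * var_b)


def best_alignment(query, target):
    """Return (offset, mean correlation) for the best ungapped placement.

    Only offsets where the query lies fully inside the target are tried;
    returns None if the query is longer than the target.
    """
    best = None
    for offset in range(len(target) - len(query) + 1):
        score = sum(
            column_pearson(q, target[offset + i]) for i, q in enumerate(query)
        ) / len(query)
        if best is None or score > best[1]:
            best = (offset, score)
    return best


# Columns are [A, C, G, T]; the query "GC" matches positions 2-3 of the target "AGCT".
query = [[0.1, 0.1, 0.7, 0.1], [0.1, 0.7, 0.1, 0.1]]
target = [[0.7, 0.1, 0.1, 0.1], [0.1, 0.1, 0.7, 0.1],
          [0.1, 0.7, 0.1, 0.1], [0.1, 0.1, 0.1, 0.7]]
best = best_alignment(query, target)
print(best)
```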

COMMAND LINE: Prepare a query motif file (qry.meme; example here) and compare it against one or more target motif files (MEME format).

  tomtom --text qry.meme MOTIF1.meme MOTIF2.meme ...

FOR MORE DETAIL: visit web site and paper.

XSTREME and MEME-ChIP

XSTREME performs comprehensive motif analysis on sequences where the motif sites can be anywhere in the sequences. The input sequences may be of any length, and their lengths may vary.

XSTREME will:

- Discover novel motifs in the input sequences (with STREME and MEME).
- Determine which motifs are most enriched (with SEA).
- Analyze them for similarity to known motifs (with Tomtom).
- Group significant motifs by similarity.
- Create a GFF file for viewing each motif's predicted sites in a genome browser.

MEME-ChIP performs comprehensive motif analysis on sequences where the motif sites tend to be centrally located, such as ChIP-seq peaks. The input sequences should be centered on a 100-character region expected to contain motifs, and each sequence should ideally be around 500 characters long.
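
The "central region" can be made concrete with a small helper that keeps only the middle of each sequence. This is a sketch for intuition, not MEME-ChIP's own code; the function name and the toy peak are hypothetical.

```python
def central_region(seq, width=100):
    """Return the central `width` characters of seq (the whole sequence if shorter)."""
    if len(seq) <= width:
        return seq
    start = (len(seq) - width) // 2
    return seq[start:start + width]


# Toy 500-letter peak whose middle 100 letters are all C.
peak = "A" * 200 + "C" * 100 + "T" * 200
core = central_region(peak)
print(len(core), core[:10])
```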

MEME-ChIP will:

- Discover novel motifs in the central regions (100 characters by default) of the input sequences (with MEME and STREME).
- Determine which motifs are most centrally enriched (with CentriMo).
- Analyze them for similarity to known motifs (with Tomtom).
- Group significant motifs by similarity.
- Perform a motif spacing analysis (with SpaMo).
- Create a GFF file for viewing each motif's predicted sites in a genome browser.

MEME-ChIP ranks motifs by how enriched they are in the central regions of the input sequences. When motifs are believed (or known) not to be concentrated centrally in the sequences, XSTREME will provide more useful results than MEME-ChIP. Datasets where XSTREME is more appropriate than MEME-ChIP include sets of promoters, sets of accessible chromatin regions from ATAC-seq, and CUT&RUN datasets using TF antibodies.

FOR MORE DETAIL FOR XSTREME: visit web site and paper.

FOR MORE DETAIL FOR MEME-ChIP: visit web site and paper.

HOMER

Motif Database

From MEME Suite: Description | Download.
From JASPAR: Download.

References