pyseqlogo package

Submodules

pyseqlogo.cli module

Console script for pyseqlogo.

pyseqlogo.colorschemes module

pyseqlogo.colorschemes.cbb_palette = {'black': '#000000', 'blue': '#0072B2', 'green': '#009E73', 'magenta': '#CC79A7', 'orange': '#E69F00', 'red': '#D55E00', 'skyblue': '#56B4E9', 'yellow': '#F0E442'}

Cinema Multiple sequence alignment program (Parry-Smith, D.J., Payne, A.W.R, Michie, A.D. and Attwood, T.K. (1997) “CINEMA - A novel Colour INteractive Editor for Multiple Alignments.” Gene, 211(2), GC45-56)

blue: polar positive H, K, R
red: polar negative D, E
green: polar neutral S, T, N, Q
white: non-polar aliphatic A, V, L, I, M
purple: non-polar aromatic F, Y, W
brown: P, G
yellow: C

pyseqlogo.colorschemes.cinema = {'black': ['A', 'V', 'I', 'L', 'M'], 'blue': ['H', 'K', 'R'], 'green': ['S', 'T', 'N', 'Q'], 'purple': ['F', 'Y', 'W'], 'red': ['D', 'E'], 'yellow': ['C']}

Lesk: Lesk A.M. (2002) Introduction to Bioinformatics, Oxford University Press, page 187

yellow: small nonpolar G, A, S, T
green: hydrophobic C, V, I, L, P, F, Y, M, W
magenta: polar N, Q, H
red: negatively charged D, E
blue: positively charged K, R

pyseqlogo.colorschemes.lesk = {'blue': ['K', 'R'], 'green': ['C', 'V', 'U', 'L', 'P', 'F', 'Y', 'M', 'W'], 'red': ['D', 'E'], 'yellow': ['G', 'A', 'S', 'T']}

Mod. Cinema: Modified Cinema

blue: polar positive H, K, R
red: polar negative D, E
green: hydrophobic V, L, I, M
dark green: non-polar aromatic F, Y, W
purple: polar neutral S, T, N, Q
light blue: A, P, G, C

pyseqlogo.expected_frequencies module

pyseqlogo.format_utils module

pyseqlogo.format_utils.approximate_error(pfm, n_occur)[source]

Calculate approximate error for small count motif information content

Parameters:

pfm: dict

{'A': [0.1, 0.3, 0.2], 'T': [0.3, 0.1, 0.2], 'G': [0.1, 0.3, 0.3], 'C': [0.5, 0.3, 0.3]}

n_occur: int

Number of sites

Returns:

approx_error: float

Approx error
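A widely used approximation for the small-sample correction is e(n) = (s - 1) / (2 ln(2) n), where s is the alphabet size and n the number of sites. The sketch below implements that formula only. Whether pyseqlogo computes its approximate error in exactly this way is an assumption, and `approximate_error_sketch` is a hypothetical name.

```python
import math


def approximate_error_sketch(pfm, n_occur):
    """Approximate small-sample correction in bits.

    Assumes the alphabet size s is the number of keys in `pfm`
    and applies e(n) = (s - 1) / (2 * ln(2) * n).
    """
    alphabet_size = len(pfm)
    return (alphabet_size - 1) / (2 * math.log(2) * n_occur)


pfm = {
    "A": [0.1, 0.3, 0.2],
    "T": [0.3, 0.1, 0.2],
    "G": [0.1, 0.3, 0.3],
    "C": [0.5, 0.3, 0.3],
}
error_10_sites = approximate_error_sketch(pfm, 10)
error_100_sites = approximate_error_sketch(pfm, 100)
```

The correction shrinks in proportion to 1/n, so it matters mostly for motifs built from a few dozen sites or fewer.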

pyseqlogo.format_utils.calc_info_matrix(pfm, n_occur, correction_type='approx', seq_type='dna')[source]

Calculate information matrix with small sample correction
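As a rough illustration of what such a matrix contains, the sketch below computes, for each position, the total information log2(s) - (H + e) and scales it by each base's frequency to get letter heights. It assumes a DNA-style `pfm` dict as in the example above and the approximate correction e = (s - 1) / (2 ln(2) n). The function name and the decision not to clip negative values are illustrative choices, not pyseqlogo's documented behavior.

```python
import math


def info_matrix_sketch(pfm, n_occur):
    """Per-base information (bits) at each position, with an approximate correction."""
    bases = list(pfm)
    alphabet_size = len(bases)
    n_positions = len(pfm[bases[0]])
    # Approximate small-sample correction, identical for every position.
    error = (alphabet_size - 1) / (2 * math.log(2) * n_occur)

    matrix = {base: [] for base in bases}
    for i in range(n_positions):
        # Shannon entropy of the column; zero frequencies contribute nothing.
        entropy = -sum(
            pfm[b][i] * math.log2(pfm[b][i]) for b in bases if pfm[b][i] > 0
        )
        total_info = math.log2(alphabet_size) - (entropy + error)
        for b in bases:
            # Each letter's height is its frequency times the column's information.
            matrix[b].append(pfm[b][i] * total_info)
    return matrix


pfm = {
    "A": [1.0, 0.25],
    "C": [0.0, 0.25],
    "G": [0.0, 0.25],
    "T": [0.0, 0.25],
}
info = info_matrix_sketch(pfm, n_occur=1000)
```

A fully conserved column approaches 2 bits for DNA, while a uniform column comes out slightly negative because the correction is subtracted and nothing is clipped at zero in this sketch.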

pyseqlogo.format_utils.calc_relative_information(pfm, n_occur, correction_type='approx')[source]

Calculate relative information matrix

pyseqlogo.format_utils.count_to_pfm(counts)[source]
pyseqlogo.format_utils.create_motif_from_alignment(alignment)[source]

Create motif from an alignment object

Parameters:

alignment : Bio.AlignIO

Bio.AlignIO input

Returns:

motif : Bio.motifs object

pyseqlogo.format_utils.exact_error(pfm, n)[source]

Calculate exact error, using multinomial(na,nc,ng,nt)
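One way to read "exact error" is the difference between the background entropy and the expected entropy of a sample of n sites, where the expectation runs over every composition (na, nc, ng, nt) weighted by its multinomial probability. The sketch below follows that reading for a four-letter alphabet. The function name, the `background` parameter and the uniform default are assumptions, and the role of the `pfm` argument in the real function is not documented here.

```python
import math


def exact_error_sketch(n, background=(0.25, 0.25, 0.25, 0.25)):
    """Background entropy minus expected sample entropy for n sites (bits)."""
    background_entropy = -sum(p * math.log2(p) for p in background if p > 0)

    expected_entropy = 0.0
    # Enumerate every composition (na, nc, ng, nt) with na + nc + ng + nt == n.
    for na in range(n + 1):
        for nc in range(n - na + 1):
            for ng in range(n - na - nc + 1):
                nt = n - na - nc - ng
                counts = (na, nc, ng, nt)
                # Multinomial coefficient n! / (na! nc! ng! nt!).
                coeff = math.factorial(n)
                for c in counts:
                    coeff //= math.factorial(c)
                prob = coeff * math.prod(p ** c for p, c in zip(background, counts))
                # Entropy of the observed frequencies for this composition.
                entropy = -sum((c / n) * math.log2(c / n) for c in counts if c > 0)
                expected_entropy += prob * entropy
    return background_entropy - expected_entropy


error_two_sites = exact_error_sketch(2)
```

The number of compositions grows roughly as n cubed, which is why an approximation is usually preferred once n gets large.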

pyseqlogo.format_utils.format_matrix(matrix)[source]
pyseqlogo.format_utils.process_data(data, data_type='counts', seq_type='dna')[source]
pyseqlogo.format_utils.read_alignment(infile, data_type='fasta', seq_type='dna', pseudo_count=1)[source]

Read alignment file as motif

Parameters:

infile: str

Path to input alignment file

data_type: str

‘fasta’, ‘stockholm’, etc., as supported by Bio.AlignIO

seq_type: str

‘dna’, ‘rna’ or ‘aa’

pseudo_count: int

Pseudocounts to add before calculating information content

Returns:

(motif, information_content) : tuple

A motif instance followed by the total information content of the motif

pyseqlogo.utils module

pyseqlogo.utils.aggregate_motif_ic(ic)[source]

Return per base motif information content

pyseqlogo.utils.approximate_error(pfm, n_occur)[source]

Calculate approximate error for small count motif information content

Parameters:

pfm: dict

{'A': [0.1, 0.3, 0.2], 'T': [0.3, 0.1, 0.2], 'G': [0.1, 0.3, 0.3], 'C': [0.5, 0.3, 0.3]}

n_occur: int

Number of sites

Returns:

approx_error: float

Approx error

pyseqlogo.utils.calc_info_matrix(pfm, n_occur, correction_type='approx')[source]

Calculate information matrix with small sample correction

pyseqlogo.utils.calc_pfm(counts)[source]

Calculate pfm given counts
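Converting counts to a position frequency matrix generally means dividing each count by the column total. The sketch below shows that normalization, assuming counts are given as a dict of base to per-position counts in the same layout as the `pfm` example above. The input layout and the name `counts_to_pfm_sketch` are assumptions for illustration.

```python
def counts_to_pfm_sketch(counts):
    """Normalize per-position counts so each column sums to 1."""
    bases = list(counts)
    n_positions = len(counts[bases[0]])
    pfm = {base: [] for base in bases}
    for i in range(n_positions):
        column_total = sum(counts[b][i] for b in bases)
        for b in bases:
            # An all-zero column is left as zeros rather than dividing by zero.
            pfm[b].append(counts[b][i] / column_total if column_total else 0.0)
    return pfm


counts = {
    "A": [1, 3],
    "T": [3, 1],
    "G": [1, 3],
    "C": [5, 3],
}
pfm = counts_to_pfm_sketch(counts)
```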

pyseqlogo.utils.calc_relative_information(pfm, n_occur, correction_type='approx')[source]

Calculate relative information matrix

pyseqlogo.utils.despine(fig=None, ax=None, top=True, right=True, left=False, bottom=False, offset=None, trim=False)[source]

Remove the top and right spines from plot(s).

Parameters:

fig : matplotlib figure, optional

Figure to despine all axes of, default uses current figure.

ax : matplotlib axes, optional

Specific axes object to despine.

top, right, left, bottom : boolean, optional

If True, remove that spine.

offset : int or dict, optional

Absolute distance, in points, spines should be moved away from the axes (negative values move spines inward). A single value applies to all spines; a dict can be used to set offset values per side.

trim : bool, optional

If True, limit spines to the smallest and largest major tick on each non-despined axis.

Returns:

None

pyseqlogo.utils.exact_error(pfm, n)[source]

Calculate exact error, using multinomial(na,nc,ng,nt)

pyseqlogo.utils.load_motif(infile=None, counts=None)[source]

Load motifs file

pyseqlogo.utils.max_motif_ic(ic)[source]

Return the per-base maximum information content

pyseqlogo.utils.pfm_to_tuple(pfm)[source]

Convert a base-wise PWM dict to a list of tuples
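The exact output layout is not documented here. One plausible reading is a list with one entry per position, each entry holding (base, value) tuples. The sketch below implements that reading only. The name `pfm_to_tuples_sketch` and the output shape are assumptions and may differ from what pyseqlogo returns.

```python
def pfm_to_tuples_sketch(pfm):
    """Regroup a base -> values dict into per-position lists of (base, value)."""
    bases = list(pfm)
    n_positions = len(pfm[bases[0]])
    return [[(b, pfm[b][i]) for b in bases] for i in range(n_positions)]


pfm = {
    "A": [0.1, 0.3, 0.2],
    "T": [0.3, 0.1, 0.2],
    "G": [0.1, 0.3, 0.3],
    "C": [0.5, 0.3, 0.3],
}
columns = pfm_to_tuples_sketch(pfm)
```

A position-major layout like this is convenient for plotting, since each column of the logo can be drawn from a single list entry.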

pyseqlogo.wigoperations module

class pyseqlogo.wigoperations.WigReader(wig_location)[source]

Bases: object

Class for reading and querying wigfiles.

get_chromosomes

Return the chromosomes and their sizes as given in the wig file.

Returns:

chroms : dict

Dictionary with {“chr”: “Length”} format

query(intervals)[source]

Query regions for scores.

Parameters:

intervals : list(tuple)

A list of tuples with the following format:

(chr, chrStart, chrEnd, strand)

Returns:

scores : array_like

A numpy array containing scores for each tuple