Select your sequence type: Our prediction ensemble has two variations, depending on sequence type. See the descriptions below to determine which option best suits your data:
- Transcript: This option also takes a DNA sequence, but only translates the longest ORF on the forward strand.
- Protein: This option will accept a protein sequence.
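To make the Transcript option concrete, here is a minimal Python sketch of selecting and translating the longest ORF on the forward strand. It is an illustration under stated assumptions, not GPCR-PEn's actual implementation: the function name `longest_forward_orf` is hypothetical, an ORF is assumed to start at ATG and end at the first in-frame stop codon (or at the end of the sequence), and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def longest_forward_orf(dna):
    """Return the translation of the longest ATG-initiated ORF found in
    the three forward reading frames (hypothetical helper).

    Assumption: an ORF that reaches the end of the sequence without a
    stop codon is still accepted, so partial transcripts yield a result.
    """
    dna = dna.upper()
    best = ""
    for frame in range(3):
        # Translate the whole frame; unknown codons (e.g. with N) become X.
        protein = "".join(
            CODON_TABLE.get(dna[i:i + 3], "X")
            for i in range(frame, len(dna) - 2, 3)
        )
        # Stop codons (*) split the frame into candidate segments.
        for segment in protein.split("*"):
            start = segment.find("M")
            if start != -1 and len(segment) - start > len(best):
                best = segment[start:]
    return best


print(longest_forward_orf("ATGAAATGGTAG"))  # MKW
```

The reverse strand is deliberately never examined, which matches the description of the Transcript option above.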
Submit your sequence(s): Your sequence(s) must be in FASTA format. You can either upload a file or paste the sequence(s) into the submission box. If you are unfamiliar with FASTA format, please see the example below:
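As a minimal illustration, a FASTA record consists of a header line starting with `>` followed by one or more lines of sequence. The Python sketch below embeds a made-up two-record example and parses it into a dictionary. The identifiers and sequences are hypothetical and are not real GPCR data, and `parse_fasta` is an illustrative helper, not part of GPCR-PEn.

```python
# Hypothetical FASTA input: two records, the first wrapped over two lines.
EXAMPLE = """>example_seq_1 hypothetical protein
MKTLLILAVVAAALA
GSHMLEDP
>example_seq_2
MSTNPKPQRK
"""


def parse_fasta(text):
    """Map each record identifier (first word of the header) to its sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            header = line[1:].split()
            # Fall back to a generated name if the header is empty.
            name = header[0] if header else f"unnamed_{len(records) + 1}"
            records[name] = []
        elif name is not None:
            records[name].append(line)
    # Join wrapped sequence lines into one string per record.
    return {key: "".join(parts) for key, parts in records.items()}


for identifier, sequence in parse_fasta(EXAMPLE).items():
    print(identifier, len(sequence))
```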
Currently, there is no limit to sequence length or number of sequences in a file, but larger files will take longer to process.
Choose your prediction tools: By default, GPCR-PEn uses the Log-Reg tool. To use a combination of tools, select only the ones you prefer.
Tool Specific Parameter Selection: The default parameters used by GPCR-PEn are the ones used in [insert citation to Synganglion paper here]. However, you may wish to alter the following parameters:
- Min. Length for Protein Coding Regions (default=100): Sets the minimum size of translated ORFs.
- e-value cutoff: The e-value is the expected number of hits with an equal or better score that would be found by chance in a database of a given size. The lower the e-value, the more significant the hit. The default value for GPCR-PEn is 1e-05. You can make your BLAST/PFAM search more stringent by lowering this parameter (e.g. 1e-20) or less stringent by increasing it (e.g. 1).
- Full Length GPCR AA (default=234): The minimum length, in amino acids, for a GPCR sequence to be considered full length.
- Full Length GPCR Number of Predicted Helices (default=6): The minimum number of predicted helices for a GPCR to be considered full length.
- Short Length GPCR AA (default=100): The minimum length, in amino acids, for a GPCR sequence to be considered a partial (short length) sequence.
- Short Length GPCR Number of Predicted Helices (default=3): The minimum number of predicted helices for a GPCR to be considered short length.
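The thresholds above can be read as a simple decision rule. The Python sketch below shows one possible interpretation. It makes several assumptions that the text does not confirm: the length and helix criteria are combined with a logical AND, all thresholds are inclusive, a hit passes when its e-value is at or below the cutoff, and the names `Thresholds`, `classify_gpcr`, `passes_evalue` and the labels `"full"`, `"short"` and `"rejected"` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Defaults taken from the parameter list above."""
    full_length_aa: int = 234
    full_length_helices: int = 6
    short_length_aa: int = 100
    short_length_helices: int = 3
    evalue_cutoff: float = 1e-5


def passes_evalue(evalue, t=Thresholds()):
    # Assumption: a hit is kept when its e-value does not exceed the cutoff.
    return evalue <= t.evalue_cutoff


def classify_gpcr(length_aa, n_helices, t=Thresholds()):
    # Assumption: both criteria must hold (AND), and thresholds are inclusive.
    if length_aa >= t.full_length_aa and n_helices >= t.full_length_helices:
        return "full"
    if length_aa >= t.short_length_aa and n_helices >= t.short_length_helices:
        return "short"
    return "rejected"


print(classify_gpcr(350, 7))   # full
print(classify_gpcr(150, 4))   # short
print(classify_gpcr(80, 7))    # rejected
print(passes_evalue(1e-20))    # True
```

Under these assumptions, a long sequence with too few predicted helices (for example 300 residues and 4 helices) falls through to the short length category instead of being rejected.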
DNA SEQUENCE(S) WORKFLOW
PROTEIN SEQUENCE(S) WORKFLOW