PSIPRED, how to install and make it work?

2 min read 05-09-2024

Getting PSIPRED Up and Running: A Beginner's Guide with Practical Examples

PSIPRED is a powerful tool for predicting protein secondary structure, but setting it up can be a bit tricky for newcomers. This article will guide you through the installation and usage of PSIPRED, focusing on common pitfalls and providing practical solutions. We'll draw inspiration from real questions and answers on Stack Overflow to ensure a comprehensive and beginner-friendly guide.

Understanding the Setup Process

PSIPRED relies on a combination of programs and databases to function correctly. The essential steps include:

  1. Installing PSIPRED: This involves downloading and compiling the source code.
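  2. Setting up BLAST+: PSIPRED uses PSI-BLAST (the psiblast program in BLAST+) to build a sequence profile for your protein, so BLAST+ needs to be installed.
  3. Preparing the UniRef90 database: PSI-BLAST searches this sequence database to build that profile. It is not used to train PSIPRED. It has to be downloaded, filtered, and formatted as a BLAST database first.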
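
Before going further, it can save time to check which of these pieces are already in place. The sketch below is a generic helper, not part of PSIPRED: check_tools is a hypothetical function name, and the two commands it is called with are the BLAST+ programs relevant to this guide.

```shell
#!/bin/sh
# Report which of the named commands cannot be found on PATH.
# Prints one "missing: NAME" line per absent command and returns
# non-zero if at least one is missing.
check_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            rc=1
        fi
    done
    return "$rc"
}

# psiblast and makeblastdb both ship with BLAST+.
check_tools psiblast makeblastdb || echo "Install BLAST+ or fix PATH first."
```

The same function can be pointed at the programs in the PSIPRED bin directory once that directory is on your PATH.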

Common Errors and Solutions

1. "pfilt not found" and "formatdb not found"

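  • Cause: Neither command is part of BLAST+. pfilt is a small program built together with PSIPRED, and formatdb belongs to the older (legacy) NCBI BLAST package; BLAST+ replaces it with makeblastdb.
  • Solution: Install the NCBI BLAST+ toolkit using your package manager to get psiblast and makeblastdb, and take pfilt from the PSIPRED bin directory once PSIPRED is compiled. For Ubuntu, the BLAST+ install command is: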
    sudo apt install ncbi-blast+
    

2. "/usr/local/bin/psiblast: Command not found"

  • Cause: The error names an absolute path, so the run script is calling psiblast at a fixed location (/usr/local/bin) instead of searching your PATH. A package-manager install of BLAST+ usually places psiblast in /usr/bin, which is already on your PATH, so changing PATH does not fix this error.
  • Solution: Point the script at the directory that actually contains psiblast. Run "which psiblast" to find it, then open BLAST+/runpsipredplus and edit the line that sets the BLAST+ directory (ncbidir in the standard script), for example:
    set ncbidir = /usr/bin
    
    The change is stored in the script, so it only has to be made once. If psiblast lives somewhere else on your system, use that directory instead.
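
Before changing any setting, it helps to confirm where psiblast is actually installed. The following is a minimal sketch; tool_dir is a hypothetical helper name, and it relies only on standard shell commands.

```shell
# Print the directory that contains the given command.
# Returns non-zero if the command is not on PATH.
tool_dir() {
    found=$(command -v "$1") || return 1
    dirname "$found"
}

tool_dir psiblast || echo "psiblast is not on PATH"
```

If this prints a directory other than /usr/local/bin, that directory is the one the run script needs to know about.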

3. Missing pfilt command

  • Cause: PSIPRED uses pfilt to filter the UniRef90 database before it is formatted. pfilt is not part of NCBI BLAST+; it is a small program that is compiled along with PSIPRED.
  • Solution: After compiling PSIPRED, you will find pfilt in its bin directory. Adjust the path below to match where you unpacked PSIPRED:
    ./psipred/bin/pfilt uniref90.fasta > uniref90filt
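
Filtering a database the size of UniRef90 takes a while, so it is worth checking that the output is not empty and still looks like FASTA before moving on. A rough sketch, assuming the file names used in the command above; count_fasta_records is a hypothetical helper:

```shell
# Count FASTA records by counting header lines (lines starting with ">").
count_fasta_records() {
    grep -c '^>' "$1"
}

# Report on the input and output files of the filtering step.
for f in uniref90.fasta uniref90filt; do
    if [ -s "$f" ]; then
        echo "$f: $(count_fasta_records "$f") records"
    else
        echo "$f: missing or empty"
    fi
done
```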
    

4. Missing formatdb command

  • Cause: formatdb formats a database for fast searching, but it comes from the legacy NCBI BLAST package and is not included in BLAST+, so installing BLAST+ does not provide it.
  • Solution: Use makeblastdb, the BLAST+ replacement for formatdb, to format the filtered file as a protein database:
    makeblastdb -dbtype prot -in uniref90filt -out uniref90filt
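
Formatting a protein database writes index files next to the input, with extensions such as .phr, .pin, and .psq. A database as large as UniRef90 is typically split into numbered volumes tied together by a .pal alias file. The sketch below checks for either layout; blast_db_exists is a hypothetical helper, and the database name is the one used above.

```shell
# Return success if a protein BLAST database with the given base name
# appears to exist: NAME.pin (single volume) or NAME.pal (multi-volume).
blast_db_exists() {
    [ -f "$1.pin" ] || [ -f "$1.pal" ]
}

if blast_db_exists uniref90filt; then
    echo "uniref90filt looks formatted"
else
    echo "uniref90filt has not been formatted in this directory"
fi
```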
    

Running PSIPRED with BLAST+

Once you have all the prerequisites set up, you can finally run PSIPRED:

./BLAST+/runpsipredplus example/example.fasta

Important Notes

  • Make sure you have the correct paths for your BLAST+ installation and PSIPRED directory.
  • PSIPRED requires a FASTA file as input, containing the protein sequence you want to analyze.
  • The runpsipredplus script is a wrapper that uses PSIPRED and BLAST+ to predict secondary structure.
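
A malformed input file is an easy mistake to rule out before starting a long PSI-BLAST search. The sketch below assumes that each input file should hold exactly one sequence, which is an assumption of this example and not something stated above; is_single_fasta is a hypothetical helper.

```shell
# Return success if FILE is non-empty, starts with a FASTA header line,
# and contains exactly one record.
is_single_fasta() {
    [ -s "$1" ] || return 1
    first=$(head -n 1 "$1")
    case $first in
        '>'*) ;;
        *) return 1 ;;
    esac
    [ "$(grep -c '^>' "$1")" -eq 1 ]
}

# Only start the prediction if the input passes the check.
if is_single_fasta example/example.fasta; then
    ./BLAST+/runpsipredplus example/example.fasta
fi
```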

Conclusion

Installing and running PSIPRED involves careful setup and attention to detail. By understanding the dependencies and following the steps outlined in this guide, you can successfully use this powerful tool for protein secondary structure prediction.