Protease activity analysis: A toolkit for analyzing enzyme activity data
- Ava P. Soleimany
- Carmen Martin-Alonso
- Melodi Anahtar
- Cathy S. Wang
- Sangeeta N. Bhatia
ACS Omega, Vol 7(28): pp. 24292-24301
Analyzing the activity of proteases and their substrates is critical to defining the biological functions of these enzymes and to designing new diagnostics and therapeutics that target protease dysregulation in disease. While a wide range of databases and algorithms have been created to better predict protease cleavage sites, there is a dearth of computational tools to automate analysis of in vitro and in vivo protease assays. As a result, individual researchers must develop their own analytical pipelines, which leads to a lack of standardization across the field. To facilitate protease research, here we present Protease Activity Analysis (PAA), a toolkit for the preprocessing, visualization, machine learning analysis, and querying of protease activity data sets. PAA uses a Python-based object-oriented implementation that provides a modular framework for streamlined analysis across three major components. First, PAA provides a simple framework for querying data sets of synthetic peptide substrates and their cleavage susceptibilities across a diverse set of proteases. To complement the database functionality, PAA also includes tools for the automated analysis and visualization of user-input enzyme–substrate activity measurements generated through in vitro screens against synthetic peptide substrates. Finally, PAA supports a set of modular machine learning functions to analyze in vivo protease activity signatures that are generated by activity-based sensors. Overall, PAA offers the protease community a breadth of computational tools to streamline research, taking a step toward standardizing data analysis across the field and in chemical biology and biochemistry at large.
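To make the first two components more concrete, the following is a minimal sketch of the kind of analysis an in vitro substrate screen typically involves: standardizing each substrate's activity across proteases and querying the substrates most strongly cleaved by a given protease. It does not use or reproduce the PAA API. The function names (`zscore_by_substrate`, `top_substrates`), the substrate and protease labels, and the numeric values are all hypothetical placeholders chosen for illustration, not measured data.

```python
from statistics import mean, pstdev

# Hypothetical toy screen: one activity value per substrate/protease pair.
# Labels and numbers are illustrative placeholders, not measurements.
SCREEN = {
    "sub_A": {"prot_1": 4.0, "prot_2": 1.0, "prot_3": 1.0},
    "sub_B": {"prot_1": 1.0, "prot_2": 3.0, "prot_3": 2.0},
    "sub_C": {"prot_1": 2.0, "prot_2": 2.0, "prot_3": 5.0},
}


def zscore_by_substrate(screen):
    """Standardize each substrate's activities across proteases.

    Uses the population standard deviation; a substrate with no
    variation across proteases is assigned z-scores of 0.0.
    """
    scored = {}
    for substrate, activities in screen.items():
        values = list(activities.values())
        mu, sigma = mean(values), pstdev(values)
        scored[substrate] = {
            protease: (value - mu) / sigma if sigma > 0 else 0.0
            for protease, value in activities.items()
        }
    return scored


def top_substrates(screen, protease, n=2):
    """Return the n substrates with the highest activity for one protease."""
    ranked = sorted(screen, key=lambda s: screen[s][protease], reverse=True)
    return ranked[:n]


if __name__ == "__main__":
    print(top_substrates(SCREEN, "prot_1"))
    print(zscore_by_substrate(SCREEN)["sub_A"])
```

Standardizing per substrate is only one possible normalization; standardizing per protease, or against a no-enzyme control, may be more appropriate depending on the assay design.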