ACCORDION (ACCelerating and Optimizing model RecommenDatIONs) is a novel tool and methodology for rapid model assembly by automatically extending and evaluating dynamic network models with the information published in literature. This facilitates information reuse and data reproducibility and replaces hundreds or thousands of manual experiments, thereby reducing the time needed for the advancement of knowledge.
- Functionality
- I/O and Parameters
- Online Web-based Usage
- Offline Installation
- Package Structure
- Case Study: T cell differentiation
- Citation
- Funding
- Support
ACCORDION is an automated framework for clustering and selecting relevant data for guided network extension and query answering. More specifically, it answers biological questions by automatically assembling new models, or expanding existing ones, using published literature.
- Clustering: creating groups of interactions, related to an existing model
- Extension: adding different groups of extensions to the model and evaluating the model behavior using defined properties
- A .xlsx file containing the baseline model to extend, in the BioRECIPE model format; see examples/input/BaselineModel_Tcell.xlsx
- A machine reading output file, in the BioRECIPE interaction format, listing potential interactions that could be added to the baseline model; see examples/input/CandidateEvents_Tcell.csv
- Property files containing property expressions in BLTL syntax; these are the golden properties that the extended models should satisfy. See the directory examples/input/Properties_Tcell/p1
- Inflation parameter for Markov Clustering, defined in Cell 14 of the notebook
- Number of return paths, defined in Cell 19 of the notebook
- examples/output/abc_model: network edges with only baseline model interactions
- examples/output/abc_model_network: network edges with the baseline model and the machine reading output network
- examples/output/markov_cluster: a cluster dictionary that contains the individual clusters. Each line shows one cluster, formed by elements in the baseline model or the machine reading output
- examples/output/grouped_ext: a pickle file containing grouped (clustered) extensions, specified as nested lists. Each group starts with an integer, followed by interactions specified as [regulator element, regulated element, Interaction type: Activation (+) or Inhibition (-)]. This file, along with the directory of system properties, is the input to statistical model checking, which verifies the behavior of candidate models against the properties
- examples/output/grouped_ext_Merged: a pickle file containing the merged clusters (unlike grouped_ext, which is not merged). Clusters are merged based on the user-selected number of return paths
- examples/output/BaselineModel_Tcell_Extension_Candidate_1.xlsx: a new .xlsx file containing a resulting candidate extended model. This is just one candidate extension, and there could be many candidates
- examples/output/BaselineModel_Tcell_Extension_Candidate_2.xlsx: a new .xlsx file containing another resulting candidate extended model, same as above
- examples/checking: model checking results of the resulting extended model against the properties
- examples/traces: trace files generated by simulating the resulting extended model, required by statistical model checking
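As a minimal sketch of how the grouped_ext layout described above could be inspected, the snippet below round-trips a small in-memory example through pickle and splits each group into its leading integer and its interactions. The helper name `split_group` and the sample groups are illustrative assumptions, not part of ACCORDION; only the nested-list layout comes from the description above.

```python
import pickle

# Hypothetical sample in the layout described above: each group starts with
# an integer, followed by [regulator, regulated, sign] interactions.
sample_groups = [
    [1, ["PTEN", "AKT", "-"], ["PIP3", "AKT", "+"]],
    [2, ["ERK", "S5B_ext", "+"]],
]


def split_group(group):
    """Return (group_id, interactions) for one grouped-extension entry."""
    group_id, *interactions = group
    return group_id, interactions


# Round-trip through pickle in memory. With a real file, one would instead
# open examples/output/grouped_ext in binary mode and call pickle.load.
payload = pickle.dumps(sample_groups)
groups = pickle.loads(payload)

summary = {}
for group in groups:
    group_id, interactions = split_group(group)
    summary[group_id] = interactions
    print(f"group {group_id}: {len(interactions)} interaction(s)")
```

Pickle files can execute arbitrary code when loaded, so only files produced by your own ACCORDION run should be opened this way.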
Run the demonstrated example (read the comments in each cell and uncomment some to choose your case study), or alternatively upload user-customized input files (see I/O and Parameters) to the input/ directory using the File Browser tab (upper left corner) of Binder.
- Become familiar with and parse the input files including baseline model spreadsheet and machine reading extracted events.
- Cluster events into groups, generate extension candidates and possibly merge some candidates.
- Modify the baseline model spreadsheet according to extension candidates.
- Test new model files against system properties and obtain model checking results that will help modelers choose the best extended model from the set of available candidate models.
- Clone the ACCORDION repository to your computer.
    git clone https://github.com/pitt-miskov-zivanov-lab/ACCORDION.git
- Navigate into the directory, then install ACCORDION together with its Python dependencies and two non-Python dependencies.
    cd ACCORDION
    python setup.py install
  Check the ReadTheDocs page of ACCORDION for more detailed installation instructions and debugging suggestions. MacOS/Linux users can alternatively build the non-Python packages using managers like conda, brew, or apt; Windows users may need a Cygwin installation to compile.
- Run the provided notebook (this requires a Jupyter Notebook installation).
    jupyter notebook examples/use_ACCORDION.ipynb
- setup.py: Python file that helps set up the installation of Python dependencies and the building of non-Python packages
- src/: directory that includes the core Python ACCORDION files
- src/runAccordion.py: functions for extending discrete network models in the BioRECIPE tabular format using knowledge from literature, as well as for adding different groups of extensions to the model
- src/markovCluster.py: functions that create and cluster the network of the baseline model and the machine reading output
- dependencies/: dependencies directory, containing the gsl and MCL packages and the model checking module (part of the DySE framework)
- examples/: directory that includes the tutorial notebook and example inputs and outputs
- environment.yml: environment file, required by Binder
- postBuild: path settings and compilation, used by Binder
- docs/: files supporting the repo's hosting on Read the Docs
- supplementary/: supplementary files for the paper manuscript
- LICENSE.txt: MIT License
- README.md: it's me!
- Input 1: The baseline model (BM) to be extended is given in examples/input/BaselineModel_Tcell.xlsx and has 62 elements. Key information of this model is listed below:
Element Name | Positive Regulation Rule | Negative Regulation Rule | Levels | State List 0 |
---|---|---|---|---|
AKT | (PDK1,MTORC2) | AKT_OFF | 2 | 0 |
AKT_OFF | 2 | 0 | ||
AP1 | (FOS_DD,JUN) | 2 | 0 | |
CA | TCR | 2 | 0 | |
CD122 | 2 | 1 | ||
CD132 | 2 | 1 | ||
CD25 | FOXP3,(AP1,NFAT,NFKAPPAB),STAT5 | 2 | 0 | |
... | ... | ... | ... | ... |
TAK1 | PKCTHETA | 2 | 0 | |
TCR | TCR_LOW,TCR_HIGH | 2 | 0 | |
TCR_HIGH | 2 | 0 | ||
TCR_LOW | 2 | 0 | ||
TGFBETA | 2 | 0 | ||
TSC | AKT | 2 | 1 | |
FOXO1 | 2 | 1 |
- Input 2: The candidate event (CE) set, represented as a set of signed directed edges, is given in examples/input/CandidateEvents_Tcell.csv. The studied candidate set has 118 events, with key information as follows:
Regulator Name | Regulated Name | Sign | Paper IDs |
---|---|---|---|
AKT | CD4 | negative | PMC2275380 |
AKT | CTRL | negative | PMC2275380 |
TGFBETA | AKT | positive | PMC2275380 |
Foxp3 | Ctla4 | positive | PMC2275380 |
Foxp3 | Gpr83 | positive | PMC2275380 |
Pten | CD8 | positive | PMC3375464 |
PTEN | HSC | positive | PMC3375464 |
... | ... | ... | ... |
TCR | MEK1 | positive | PMC4418530 |
TCR | CK2 | positive | PMC4418530 |
MTORC2 | MTORC2 | positive | PMC4418530 |
CD28 | MTORC2 | positive | PMC4418530 |
IL2_EX | MTORC2 | positive | PMC4418530 |
IL2_R | MTORC2 | positive | PMC4418530 |
PI3K | PIP3 | positive | PMC4418530 |
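
To illustrate how rows like these map onto signed directed edges, here is a minimal sketch that parses a three-row excerpt into [regulator, regulated, sign] triples. The column names are taken from the table above and are an assumption about the file; the real BioRECIPE interaction file may use different headers and more columns. The function name `read_signed_edges` is hypothetical.

```python
import csv
import io

# Excerpt of the table above. The header names are an assumption and may
# differ from the headers of the real BioRECIPE interaction file.
SAMPLE_CSV = """Regulator Name,Regulated Name,Sign,Paper IDs
AKT,CD4,negative,PMC2275380
TGFBETA,AKT,positive,PMC2275380
PI3K,PIP3,positive,PMC4418530
"""

# Map the textual sign onto the +/- notation used for interactions.
SIGN_SYMBOL = {"positive": "+", "negative": "-"}


def read_signed_edges(handle):
    """Parse candidate events into [regulator, regulated, sign] triples."""
    edges = []
    for row in csv.DictReader(handle):
        sign = SIGN_SYMBOL[row["Sign"].strip().lower()]
        edges.append(
            [row["Regulator Name"].strip(), row["Regulated Name"].strip(), sign]
        )
    return edges


edges = read_signed_edges(io.StringIO(SAMPLE_CSV))
for edge in edges:
    print(edge)
```

With a real file, the `io.StringIO` wrapper would be replaced by an opened file handle.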
- The Markov cluster algorithm is applied to cluster the set of candidate events (with a user-defined inflation parameter of 2), and 17 clusters are detected as follows:
Clusters | Interaction List |
---|---|
Cluster 1 | ['AKT', 'FOXO1', '-'], ['PTEN', 'AKT', '-'], ['MEK1_ext', 'PTEN', '+'], ['AKT', 'MEK1_ext', '-'], ['TBK1_ext', 'AKT', '-'], ['AKT', 'MAGI1_ext', '-'], ['FOXO1', 'PTEN', '+'], ['PIP3', 'AKT', '+'], ['PTEN', 'AKT', '+'], ['AKT', 'TBK1_ext', '+'], ['CHK1_ext', 'AKT', '+'], ['FOXO1', 'Foxo3a_ext', '-'], ['TBK1_ext', 'CD4_ext', '-'], ['MEK1_ext', 'AKT', '-'], ['AKT', 'FoxO3_ext', '-'], ['TBK1_ext', 'AKT', '+'], ['TGFBETA', 'AKT', '+'], ['IFNgamma_ext', 'AKT', '+'], ['CK2_ext', 'AKT', '+'], ['AKT', 'CD4_ext', '-'], ['Itk_ext', 'CD4_ext', '+'], ['CTLA4_ext', 'AKT', '-'], ['CD4_ext', 'IL17A_ext', '-'], ['TCR', 'MEK1_ext', '+'], ['PDK1', 'AKT', '+'], ['CK2_ext', 'CD4_ext', '+'], ['AKT', 'CTRL_ext', '-'], ['AKT', 'Itk_ext', '-'], ['MTOR', 'TBK1_ext', '+'], ['PI3K', 'AKT', '-'], ['TCR', 'CD4_ext', '-'], ['TBK1_ext', 'FOXO1', '+'], ['TIL_ext', 'AKT', '-'], ['Bcl2l11_ext', 'CD4_ext', '+'], ['TBK1_ext', 'CD4_ext', '+'], ['PI3K', 'AKT', '+'], ['MTORC2', 'AKT', '+'], ['MTOR', 'AKT', '+'], ['PD1_ext', 'AKT', '-'], ['AKT', 'MTORC2', '-'], ['SHIP1_ext', 'AKT', '-'], ['TCR', 'AKT', '-'], ['Itk_ext', 'CD4_ext', '-'] |
Cluster 2 | ['TCR', 'MTORC2', '+'], ['TCR', 'CK2_ext', '+'], ['TCR', 'Itk_ext', '+'], ['TCR', 'CD25', '+'], ['TCR', 'PTEN', '-'], ['CK2_ext', 'PTEN', '-'], ['P53_ext', 'PTEN', '-'], ['PI3K', 'PTEN', '+'], ['TCR', 'NEDD4_ext', '+'], ['PTEN', 'HSC_ext', '+'], ['MEK2', 'PTEN', '+'], ['NEDD4_ext', 'PTEN', '-'], ['PI3K', 'PIP3', '+'], ['PTEN', 'PIP3', '-'], ['PTEN', 'Itk_ext', '-'], ['RAS', 'PTEN', '-'], ['PI3K', 'PTEN', '-'], ['TCR', 'PIP3', '+'], ['PTEN', 'CD8_ext', '+'] |
Cluster 3 | ['CD25', 'MTORC2', '+'], ['FOXP3', 'Ctla4_ext', '+'], ['FOXP3', 'Gpr83_ext', '+'], ['FOXP3', 'Itk_ext', '+'], ['IL2', 'Itk_ext', '+'], ['IL2', 'MTORC2', '+'] |
… | … |
Cluster 15 | ['ERK', 'S5B_ext', '+'] |
Cluster 16 | ['FASL_ext', 'FAS_ext', '+'] |
Cluster 17 | ['HIF1alpha_ext', 'IL17A_ext', '-'] |
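
For readers unfamiliar with Markov clustering, the following is a small pure-Python sketch of the general algorithm, showing where the inflation parameter enters. It is not ACCORDION's implementation (the repository ships the MCL package under dependencies/). The function names, the undirected unweighted treatment of edges, the thresholds, and the toy graph are all assumptions made for illustration.

```python
def normalize_columns(matrix):
    """Scale every column so that it sums to 1 (column-stochastic)."""
    size = len(matrix)
    sums = [sum(matrix[i][j] for i in range(size)) for j in range(size)]
    return [[matrix[i][j] / sums[j] for j in range(size)] for i in range(size)]


def expand(matrix):
    """Expansion step: square the matrix (two-step random walks)."""
    size = len(matrix)
    return [
        [sum(matrix[i][k] * matrix[k][j] for k in range(size)) for j in range(size)]
        for i in range(size)
    ]


def inflate(matrix, inflation):
    """Inflation step: raise entries to a power, then renormalize columns."""
    powered = [[value ** inflation for value in row] for row in matrix]
    return normalize_columns(powered)


def markov_cluster(nodes, edges, inflation=2.0, max_iter=100, tol=1e-9):
    """Cluster an undirected, unweighted graph; returns sorted node lists."""
    index = {name: i for i, name in enumerate(nodes)}
    size = len(nodes)
    # Start from the adjacency matrix with self-loops on the diagonal.
    matrix = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for source, target in edges:
        matrix[index[source]][index[target]] = 1.0
        matrix[index[target]][index[source]] = 1.0
    matrix = normalize_columns(matrix)
    # Alternate expansion and inflation until the matrix stops changing.
    for _ in range(max_iter):
        updated = inflate(expand(matrix), inflation)
        change = max(
            abs(updated[i][j] - matrix[i][j])
            for i in range(size)
            for j in range(size)
        )
        matrix = updated
        if change < tol:
            break
    # Rows with a non-zero diagonal are attractors; the non-zero columns
    # of an attractor row form one cluster.
    clusters = set()
    for i in range(size):
        if matrix[i][i] > 1e-6:
            members = frozenset(nodes[j] for j in range(size) if matrix[i][j] > 1e-6)
            clusters.add(members)
    return sorted(sorted(members) for members in clusters)


toy_nodes = ["A", "B", "C", "D", "E"]
toy_edges = [("A", "B"), ("B", "C"), ("D", "E")]
print(markov_cluster(toy_nodes, toy_edges, inflation=2.0))
```

A larger inflation value sharpens the columns faster and generally yields more, smaller clusters, which is one reason the number of detected clusters depends on this parameter.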
- We now extend the baseline model with the candidate event set and obtain 17 candidate models; see two examples at examples/output/BaselineModel_Tcell_Extension_Candidate_1.xlsx and examples/output/BaselineModel_Tcell_Extension_Candidate_2.xlsx. The first extended candidate model has 79 elements (compared to 62 in the baseline model), and its top seven rows are now as follows:
Element Name | Positive Regulation Rule | Negative Regulation Rule | Levels | State List 0 |
---|---|---|---|---|
AKT | (PDK1,MTORC2),PIP3,PTEN,CHK1_ext,TBK1_ext,TGFBETA,IFNgamma_ext,CK2_ext,PDK1,PI3K,MTORC2,MTOR | AKT_OFF,PTEN,TBK1_ext,MEK1_ext,CTLA4_ext,PI3K,TIL_ext,PD1_ext,SHIP1_ext,TCR | 2 | 0 |
AKT_OFF | 2 | 0 | ||
AP1 | (FOS_DD,JUN) | 2 | 0 | |
CA | TCR | 2 | 0 | |
CD122 | 2 | 1 | ||
CD132 | 2 | 1 | ||
CD25 | FOXP3,(AP1,NFAT,NFKAPPAB),STAT5 | 2 | 0 |
For example, the regulators of AKT are substantially extended compared to the baseline model.
The remaining candidate models are not shown here, but all 17 of them are listed under the directory examples/output/. Under different parameters, the total number of candidate models may vary.
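The way a cluster of interactions changes the regulation rules can be sketched as follows. This is a simplified illustration, not the logic in src/runAccordion.py: it assumes the model is held as a dictionary of rule strings, that new regulators are appended with commas, and that unknown elements are created with empty rules. The function name `extend_model` and the one-element baseline excerpt are hypothetical.

```python
import copy


def extend_model(model, cluster):
    """Return a copy of `model` with the interactions in `cluster` added.

    `model` maps element name -> {"positive": rule, "negative": rule};
    `cluster` is a list of [regulator, regulated, sign] triples.
    """
    extended = copy.deepcopy(model)
    for regulator, regulated, sign in cluster:
        # Elements that are not yet in the model start with empty rules.
        for name in (regulator, regulated):
            extended.setdefault(name, {"positive": "", "negative": ""})
        key = "positive" if sign == "+" else "negative"
        rule = extended[regulated][key]
        # Append the new regulator to the existing rule, comma-separated.
        extended[regulated][key] = f"{rule},{regulator}" if rule else regulator
    return extended


# One-element excerpt: only the AKT rules are taken from the baseline table.
baseline = {"AKT": {"positive": "(PDK1,MTORC2)", "negative": "AKT_OFF"}}
cluster = [["PIP3", "AKT", "+"], ["PTEN", "AKT", "-"], ["AKT", "MEK1_ext", "-"]]

candidate = extend_model(baseline, cluster)
print(candidate["AKT"])
print(candidate["MEK1_ext"])
```

Working on a deep copy keeps the baseline untouched, so the same baseline can be extended with each cluster in turn to produce one candidate model per cluster.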
- Statistical model checking is run on the candidate models against golden properties. The properties are extracted from golden models and describe the behavior that a model is expected to satisfy under a certain scenario. For example, we test five candidate models against four properties stored in examples/input/Properties_Tcell/p1/. A property match table summarizes all the probabilities. Users can choose the preferred candidate model by the criterion of general satisfaction or prioritized satisfaction, depending on the application scenario.
Candidate Model | Property 1d | Property 1c | Property 1b | Property 1a |
---|---|---|---|---|
Extension_Candidate_1 | 0.79096 | 0.956522 | 0.956522 | 0.956522 |
Extension_Candidate_2 | 0.956522 | 0.956522 | 0.956522 | 0.956522 |
Extension_Candidate_3 | 0.956522 | 0.956522 | 0.956522 | 0.956522 |
Extension_Candidate_4 | 0.956522 | 0.956522 | 0.956522 | 0.956522 |
Extension_Candidate_5 | 0.956522 | 0.956522 | 0.956522 | 0.956522 |
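
One possible way to turn the property match table into a choice is sketched below. The two criteria are not defined precisely in this README, so the sketch makes explicit assumptions: "general satisfaction" is taken as the mean probability over all properties, and "prioritized satisfaction" as a lexicographic comparison in a user-given property order. The function names are hypothetical; the probabilities are copied from the table.

```python
PROPERTIES = ["1d", "1c", "1b", "1a"]

# Probabilities copied from the property match table above.
ROWS = {
    "Extension_Candidate_1": [0.79096, 0.956522, 0.956522, 0.956522],
    "Extension_Candidate_2": [0.956522, 0.956522, 0.956522, 0.956522],
    "Extension_Candidate_3": [0.956522, 0.956522, 0.956522, 0.956522],
    "Extension_Candidate_4": [0.956522, 0.956522, 0.956522, 0.956522],
    "Extension_Candidate_5": [0.956522, 0.956522, 0.956522, 0.956522],
}
MATCH_TABLE = {name: dict(zip(PROPERTIES, values)) for name, values in ROWS.items()}


def rank_general(table):
    """Rank by mean probability over all properties (assumed criterion)."""

    def mean(scores):
        return sum(scores.values()) / len(scores)

    # Highest mean first; ties are broken alphabetically by model name.
    return sorted(table, key=lambda name: (-mean(table[name]), name))


def rank_prioritized(table, priority):
    """Rank lexicographically by properties in priority order (assumed)."""
    return sorted(
        table, key=lambda name: ([-table[name][prop] for prop in priority], name)
    )


print(rank_general(MATCH_TABLE))
print(rank_prioritized(MATCH_TABLE, ["1d", "1a"]))
```

With these numbers, candidates 2 through 5 tie on every property, so the ranking only separates them from candidate 1 when Property 1d is taken into account.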
- Two other case studies, TLGL and PCC, are available; users can edit and uncomment cells 7, 10, 14, 23, 26, 31, and 32 to try them.
Yasmine Ahmed, Cheryl Telmer, Gaoxiang Zhou, Natasa Miskov-Zivanov, “Context-aware knowledge selection and reliable model recommendation with ACCORDION”, bioRxiv preprint, doi: https://doi.org/10.1101/2022.01.22.477231.
This work was funded in part by DARPA Big Mechanism award, AIMCancer (W911NF-17-1-0135); and in part by the University of Pittsburgh, Swanson School of Engineering.
For installation and reproducibility concerns, feel free to reach out to Natasa Miskov-Zivanov: nmzivanov@pitt.edu