Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make protein-ligand example charge-neutral #1175

Merged
merged 2 commits into from
Mar 5, 2025
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
127 changes: 90 additions & 37 deletions examples/protein_ligand/protein_ligand.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -345,23 +345,50 @@
"cell_type": "markdown",
"id": "30",
"metadata": {},
"source": [
"Before we move on to the solvent, we should check the partial charge of our protein-ligand complex. Since we want to run simulations with net neutral charge, we'll add counter-ions to balance the total charge of the protein-ligand complex. To be safe, we can also double-check that the total charge in the `docked_intrcg` object matches the sum of the formal charges of the `Molecule` representations which more closely describe our intended chemistry."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "31",
"metadata": {},
"outputs": [],
"source": [
"total_charge = round(sum(docked_intrcg[\"Electrostatics\"].charges.values()), 3)\n",
"\n",
"assert total_charge == protein.total_charge + ligand.total_charge, (\n",
" f\"Total charge of the system is {total_charge}, not {protein.total_charge + ligand.total_charge}\"\n",
")\n",
"total_charge"
]
},
{
"cell_type": "markdown",
"id": "32",
"metadata": {},
"source": [
"### Solvent component\n",
"\n",
"We'll reuse the Sage force field from earlier here, as it includes parameters for TIP3P water, but first we need coordinates for our solvated system. This is a portion of the OpenFF ecosystem that will be streamlined in the future, but we can use a PACKMOL wrapper to get the job done. We're adding a fixed amount of water for this quick example so the density will be wrong, but imagine it's right."
"We'll reuse the Sage force field from earlier here, as it includes parameters for TIP3P water, but first we need coordinates for our solvated system. This is a portion of the OpenFF ecosystem that will be streamlined in the future, but we can use a PACKMOL wrapper to get the job done. We're adding a fixed amount of water for this quick example so the density will be wrong, but imagine it's right.\n",
"\n",
"We'll also add three chloride ions to balance the net charge of the protein-ligand complex. For a production run, you'll probably want a more realistic salt concentration - the goal here is just to get something running with a net neutral charge."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "31",
"id": "33",
"metadata": {},
"outputs": [],
"source": [
"# Construct a water molecule\n",
"water = Molecule.from_smiles(\"O\")\n",
"water.generate_conformers(n_conformers=1)\n",
"\n",
"ion = Molecule.from_smiles(\"[Cl-]\")\n",
"ion.generate_conformers(n_conformers=1)\n",
"\n",
"# Come up with a box size based on the size of the protein plus a 2 nm buffer\n",
"xyz = protein.conformers[0]\n",
"centroid = xyz.sum(axis=0) / xyz.shape[0]\n",
Expand All @@ -372,8 +399,8 @@
"n_water = 1000\n",
"\n",
"packed_topology = pack_box(\n",
" molecules=[water],\n",
" number_of_copies=[n_water],\n",
" molecules=[water, ion],\n",
" number_of_copies=[n_water, int(total_charge.m)],\n",
" solute=docked_intrcg.topology,\n",
" box_vectors=box_vectors,\n",
")"
Expand All @@ -382,7 +409,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "32",
"id": "34",
"metadata": {},
"outputs": [],
"source": [
Expand All @@ -393,7 +420,7 @@
},
{
"cell_type": "markdown",
"id": "33",
"id": "35",
"metadata": {},
"source": [
"And now we can create the interchange! The `Topology` we got from `pack_box` includes the positions we'll later apply to the solvated complex. For now, we need an `Interchange` that represents the water component. We can pass it Sage, wihch contains TIP3P parameters, and a topology of `n_water` water molecules without worrying about the positions, since we'll just set those later."
Expand All @@ -402,27 +429,27 @@
{
"cell_type": "code",
"execution_count": null,
"id": "34",
"id": "36",
"metadata": {},
"outputs": [],
"source": [
"water_intrcg = Interchange.from_smirnoff(\n",
" force_field=ForceField(\"openff_unconstrained-2.0.0.offxml\"),\n",
" topology=[water] * n_water,\n",
" topology=[water] * n_water + [ion] * int(total_charge.m),\n",
")"
]
},
{
"cell_type": "markdown",
"id": "35",
"id": "37",
"metadata": {},
"source": [
"## Putting the system together"
]
},
{
"cell_type": "markdown",
"id": "36",
"id": "38",
"metadata": {},
"source": [
"Now that we've got all the pieces, we can combine the docked protein-ligand system with the solvent, and add in the positions and box vectors we just worked out\n",
Expand All @@ -436,7 +463,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "37",
"id": "39",
"metadata": {},
"outputs": [],
"source": [
Expand All @@ -448,7 +475,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "38",
"id": "40",
"metadata": {},
"outputs": [],
"source": [
Expand Down Expand Up @@ -476,23 +503,23 @@
},
{
"cell_type": "markdown",
"id": "39",
"id": "41",
"metadata": {},
"source": [
"## Exporting to simulation engines"
]
},
{
"cell_type": "markdown",
"id": "40",
"id": "42",
"metadata": {},
"source": [
"Finally, we can export the final Interchange object to models understood by various simulation engines. Some of these exports are not yet optimized for large files."
]
},
{
"cell_type": "markdown",
"id": "41",
"id": "43",
"metadata": {},
"source": [
"### OpenMM"
Expand All @@ -501,7 +528,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "42",
"id": "44",
"metadata": {},
"outputs": [],
"source": [
Expand All @@ -512,7 +539,7 @@
},
{
"cell_type": "markdown",
"id": "43",
"id": "45",
"metadata": {},
"source": [
"### Amber"
Expand All @@ -521,7 +548,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "44",
"id": "46",
"metadata": {},
"outputs": [],
"source": [
Expand All @@ -531,7 +558,7 @@
},
{
"cell_type": "markdown",
"id": "45",
"id": "47",
"metadata": {},
"source": [
"### LAMMPS"
Expand All @@ -540,7 +567,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "46",
"id": "48",
"metadata": {},
"outputs": [],
"source": [
Expand All @@ -551,7 +578,7 @@
},
{
"cell_type": "markdown",
"id": "47",
"id": "49",
"metadata": {},
"source": [
"### GROMACS"
Expand All @@ -560,7 +587,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "48",
"id": "50",
"metadata": {},
"outputs": [],
"source": [
Expand All @@ -569,7 +596,7 @@
},
{
"cell_type": "markdown",
"id": "49",
"id": "51",
"metadata": {},
"source": [
"## Energy tests\n",
Expand All @@ -582,7 +609,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "50",
"id": "52",
"metadata": {
"tags": []
},
Expand All @@ -593,7 +620,7 @@
},
{
"cell_type": "markdown",
"id": "51",
"id": "53",
"metadata": {},
"source": [
"For convenience there is a function `get_summary_data` that runs through all available engines and summarizes the results in a Pandas DataFrame. (This cell might take a minute to execute). We're setting the argument `_engines` to a non-defualt value so that the LAMMPS driver is skipped even if it's available; normally this argument is unnecessary if you don't have LAMMPS installed on your system."
Expand All @@ -602,7 +629,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "52",
"id": "54",
"metadata": {},
"outputs": [],
"source": [
Expand All @@ -612,7 +639,7 @@
},
{
"cell_type": "markdown",
"id": "53",
"id": "55",
"metadata": {},
"source": [
"We can see from these large energy differences that something is wrong - this stems from the experimental `Interchange` combination operation producing incorrect results."
Expand All @@ -621,7 +648,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "54",
"id": "56",
"metadata": {
"tags": []
},
Expand All @@ -632,7 +659,7 @@
},
{
"cell_type": "markdown",
"id": "55",
"id": "57",
"metadata": {},
"source": [
"In the future this should work more smoothly with identical energies reported by each engine. In lieu of that, we can evaluate the energy of each _component_ that we previously added together. This requires setting box vectors for each component and also setting the water positions, which we skipped earlier since we were able to use PACKMOL results directly."
Expand All @@ -641,7 +668,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "56",
"id": "58",
"metadata": {
"tags": []
},
Expand All @@ -650,13 +677,13 @@
"for subset in [ligand_intrcg, protein_intrcg, water_intrcg]:\n",
" subset.box = system_intrcg.box\n",
"\n",
"water_intrcg.positions = system_intrcg.positions[-3000:,]"
"water_intrcg.positions = system_intrcg.positions[-3003:,]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "57",
"id": "59",
"metadata": {
"tags": []
},
Expand All @@ -668,7 +695,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "58",
"id": "60",
"metadata": {
"tags": []
},
Expand All @@ -680,7 +707,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "59",
"id": "61",
"metadata": {
"tags": []
},
Expand All @@ -691,11 +718,37 @@
},
{
"cell_type": "markdown",
"id": "60",
"id": "62",
"metadata": {},
"source": [
"We can see from these results that each engine reports nearly identical energies for each individual component."
]
},
{
"cell_type": "markdown",
"id": "63",
"metadata": {},
"source": [
"## Verifying charge neutrality\n",
"\n",
"Earlier, when adding waters to the system, we added three chloride ions to balance the net +3 charge of the protein-ligand complex. Did we actually end up with a charge-neutral system?"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "64",
"metadata": {},
"outputs": [],
"source": [
"import openmm\n",
"\n",
"nbforce = openmm_system.getForces()[0]\n",
"charge = 0 * openmm.unit.elementary_charge\n",
"for i in range(nbforce.getNumParticles()):\n",
" charge += nbforce.getParticleParameters(i)[0]\n",
"charge"
]
}
],
"metadata": {
Expand All @@ -715,7 +768,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.11"
"version": "3.12.9"
}
},
"nbformat": 4,
Expand Down