Provide query to frontend to retrieve path/graph data from neo4j #31

Closed
vincerubinetti opened this issue Apr 10, 2019 · 5 comments · Fixed by #34

Comments

@vincerubinetti

vincerubinetti commented Apr 10, 2019

The front-end will need to query the neo4j database to get the data needed to draw a graph representation of metapaths/paths. Apparently this query is fairly complex and also would be more appropriately generated by the backend.

The backend could either give the frontend the query, or perhaps do the query and return the results to the frontend.

Info on the neo4j data format:
https://github.com/eisman/neo4jd3#neo4j-data-format

@dongbohu
Contributor

Can you give me an example neo4j query? Wouldn't it be faster if the frontend queries neo4j and handles the responses directly since the frontend is hosted on github but the search-api backend is hosted on AWS? The extra hop from frontend to search-api backend seems redundant to me.

@dhimmel
Collaborator

dhimmel commented Apr 10, 2019

"The extra hop from frontend to search-api backend seems redundant to me."

The backend will need to generate the query.

The query will be like (generated mostly by this hetio function):

MATCH path = (n0:Compound)-[:BINDS_CbG]-(n1)-[:PARTICIPATES_GpPW]-(n2)-[:PARTICIPATES_GpPW]-(n3)-[:ASSOCIATES_DaG]-(n4:Disease)
USING JOIN ON n2
WHERE n0.identifier = 'DB01156' // Bupropion
  AND n4.identifier = 'DOID:0050742' // nicotine dependency
  AND n1 <> n3
WITH
  [
    size((n0)-[:BINDS_CbG]-()),
    size(()-[:BINDS_CbG]-(n1)),
    size((n1)-[:PARTICIPATES_GpPW]-()),
    size(()-[:PARTICIPATES_GpPW]-(n2)),
    size((n2)-[:PARTICIPATES_GpPW]-()),
    size(()-[:PARTICIPATES_GpPW]-(n3)),
    size((n3)-[:ASSOCIATES_DaG]-()),
    size(()-[:ASSOCIATES_DaG]-(n4))
  ] AS degrees, path
WITH path, reduce(pdp = 1.0, d in degrees| pdp * d ^ -0.5) AS PDP
WITH collect({paths: path, PDPs: PDP}) AS data_maps, sum(PDP) AS DWPC
UNWIND data_maps AS data_map
WITH data_map.paths AS path, data_map.PDPs AS PDP, DWPC
RETURN
  path,
  substring(reduce(s = '', node IN nodes(path)| s + '–' + node.name), 1) AS str_path,
  PDP,
  100 * (PDP / DWPC) AS percent_of_DWPC
ORDER BY percent_of_DWPC DESC
LIMIT 10

That is for the CbGpPWpGaD metapath between Bupropion and nicotine dependency (the source and target identifiers used in the query). Run this query at https://neo4j.het.io/browser/ to see what Neo4j will output.
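To make the arithmetic in the query easier to follow, here is a minimal Python sketch of the PDP and percent_of_DWPC steps. It assumes the degrees for each path are already known. The function names and the example degrees are made up for illustration and are not Hetionet values or part of hetio's code.

```python
from math import prod


def path_degree_product(degrees, damping=0.5):
    """Multiply each degree raised to -damping, mirroring the reduce() step."""
    return prod(d ** -damping for d in degrees)


def percent_of_dwpc(paths_degrees, damping=0.5):
    """Return (PDP, percent of DWPC) per path; DWPC is the sum of all PDPs."""
    pdps = [path_degree_product(degrees, damping) for degrees in paths_degrees]
    dwpc = sum(pdps)
    return [(pdp, 100 * pdp / dwpc) for pdp in pdps]


# Hypothetical degrees for two 2-edge paths (two degrees per edge)
example = [[1, 4, 4, 1], [4, 4, 4, 4]]
results = percent_of_dwpc(example)
for pdp, percent in results:
    print(pdp, percent)
```

Note that in the query the DWPC is summed over all matched paths before ORDER BY and LIMIT are applied, so the percentages of the ten returned paths need not add up to 100.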

@vincerubinetti
Author

@dhimmel and I are working on Cypher to return data for the path table. Here are two lines to provide a list of neo4j node and relationship ids:

extract(node IN nodes(path) | id(node)) AS node_ids,
extract(rel IN relationships(path) | id(rel)) AS rel_ids,

@dhimmel
Collaborator

dhimmel commented Apr 19, 2019

Here is a template query for returning the data for a list of neo4j node ids:

MATCH (node)
WHERE id(node) IN [0, 3, 6]
RETURN
  id(node) AS neo4j_id,
  node.identifier AS identifier,
  head(labels(node)) AS node_label,
  properties(node) AS data

And here is a template for the neo4j rel ids:

MATCH ()-[rel]-()
WHERE id(rel) in [2029636, 1638425]
RETURN
  id(rel) AS neo4j_id,
  type(rel) AS rel_type,
  id(startNode(rel)) AS source_neo4j_id,
  id(endNode(rel)) AS target_neo4j_id,
  properties(rel) AS data
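If the backend fills in these templates, one option is to pass the id lists as Cypher parameters rather than formatting them into the query text. The sketch below only builds (query, parameters) pairs. The function name and the parameter names `node_ids` and `rel_ids` are hypothetical. It also uses a directed pattern `()-[rel]->()`, on the assumption that each relationship should be returned once, since an undirected pattern matches every relationship in both directions.

```python
NODE_QUERY = """\
MATCH (node)
WHERE id(node) IN $node_ids
RETURN
  id(node) AS neo4j_id,
  node.identifier AS identifier,
  head(labels(node)) AS node_label,
  properties(node) AS data
"""

# Directed pattern is an assumption: it avoids one row per direction
REL_QUERY = """\
MATCH ()-[rel]->()
WHERE id(rel) IN $rel_ids
RETURN
  id(rel) AS neo4j_id,
  type(rel) AS rel_type,
  id(startNode(rel)) AS source_neo4j_id,
  id(endNode(rel)) AS target_neo4j_id,
  properties(rel) AS data
"""


def build_lookup_queries(node_ids, rel_ids):
    """Return (query, parameters) pairs for the node and relationship lookups."""
    return [
        (NODE_QUERY, {"node_ids": sorted(set(node_ids))}),
        (REL_QUERY, {"rel_ids": sorted(set(rel_ids))}),
    ]


queries = build_lookup_queries([6, 0, 3, 0], [2029636, 1638425])
```

With the official neo4j Python driver, each pair could then be passed to something like `session.run(query, parameters)`.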

@vincerubinetti I think it's possible this node / rel lookup could happen entirely on the frontend. However, we could also do it on the backend and require a bit less post-processing.
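As a rough idea of the post-processing involved, the sketch below reshapes rows from the two lookup templates into the graph structure that neo4jd3 appears to expect. The output field names (`results`, `data`, `graph`, `nodes`, `relationships`, `startNode`, `endNode`) follow my reading of the neo4jd3 README linked earlier and should be checked against it. The function name and the sample rows are made up.

```python
def to_neo4jd3(node_rows, rel_rows):
    """Reshape lookup rows into a neo4jd3-style graph dict (format assumed)."""
    nodes = [
        {
            "id": str(row["neo4j_id"]),
            "labels": [row["node_label"]],
            "properties": row["data"],
        }
        for row in node_rows
    ]
    relationships = [
        {
            "id": str(row["neo4j_id"]),
            "type": row["rel_type"],
            "startNode": str(row["source_neo4j_id"]),
            "endNode": str(row["target_neo4j_id"]),
            "properties": row["data"],
        }
        for row in rel_rows
    ]
    graph = {"nodes": nodes, "relationships": relationships}
    return {"results": [{"columns": [], "data": [{"graph": graph}]}], "errors": []}


# Made-up rows in the shape returned by the two templates
node_rows = [
    {"neo4j_id": 0, "identifier": "A", "node_label": "Compound", "data": {"name": "a"}},
    {"neo4j_id": 3, "identifier": "B", "node_label": "Gene", "data": {"name": "b"}},
]
rel_rows = [
    {"neo4j_id": 7, "rel_type": "BINDS_CbG", "source_neo4j_id": 0,
     "target_neo4j_id": 3, "data": {}},
]
graph_data = to_neo4jd3(node_rows, rel_rows)
```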

dhimmel added a commit to dhimmel/hetnetpy that referenced this issue Apr 19, 2019
dhimmel added a commit to hetio/hetnetpy that referenced this issue Apr 19, 2019

Merge #36
Refs greenelab/connectivity-search-backend#31 (comment)

Also switches to neo4j cypher list comprehension
@dhimmel
Collaborator

dhimmel commented Apr 19, 2019

I'm thinking the paths backend endpoint will return something like:

{
  "query": {
    "source": "DB01156",
    "target": "DOID:0050742",
    "metapath": "CbGiGaD",
    "metapath_id": [
      [
        "Compound",
        "Gene",
        "binds",
        "both"
      ],
      [
        "Gene",
        "Gene",
        "interacts",
        "both"
      ],
      [
        "Gene",
        "Disease",
        "associates",
        "both"
      ]
    ]
  },
  "paths": [
    {
      "metapath": "CbGiGaD",
      "node_ids": [
        43315,
        32786,
        41662,
        1410
      ],
      "rel_ids": [
        1923805,
        1615085,
        309879
      ],
      "PDP": 0.004682929057908469,
      "percent_of_DWPC": 95.77866787048912
    },
    {
      "metapath": "CbGiGaD",
      "node_ids": [
        43315,
        44251,
        28797,
        1410
      ],
      "rel_ids": [
        309501,
        95988,
        159590
      ],
      "PDP": 0.00020639459006779496,
      "percent_of_DWPC": 4.221332129510869
    }
  ]
}
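Given a response shaped like this, a consumer would probably want the distinct node and relationship ids across all paths before running the lookup queries, since paths share nodes (here the source and target). A minimal sketch, assuming the `paths`, `node_ids`, and `rel_ids` keys proposed above; the function name is hypothetical.

```python
def collect_ids(response):
    """Gather the distinct node and relationship ids used by all paths."""
    node_ids, rel_ids = set(), set()
    for path in response["paths"]:
        node_ids.update(path["node_ids"])
        rel_ids.update(path["rel_ids"])
    return sorted(node_ids), sorted(rel_ids)


# The two paths from the proposed response, trimmed to the id fields
response = {
    "paths": [
        {"node_ids": [43315, 32786, 41662, 1410],
         "rel_ids": [1923805, 1615085, 309879]},
        {"node_ids": [43315, 44251, 28797, 1410],
         "rel_ids": [309501, 95988, 159590]},
    ]
}
node_ids, rel_ids = collect_ids(response)
print(node_ids)
print(rel_ids)
```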

dhimmel added a commit that referenced this issue Apr 22, 2019

Merges #33
Refs #31
dhimmel added a commit that referenced this issue Apr 22, 2019

Merges #34
Closes #31

Rename querypair endpoint to `query-metapaths`

Update continuous deployment conda env update
Refs #35