
Commit 4440841

Added Graph Attention Network example (#1174)
* add GAT test, add GAT card, update GAT README
* add GAT test, add GAT card, update GAT README
* update doc build
* update tests
* revert doc index
* update doc index
* update doc index
* update doc index
* revert doc index
* revert doc index
* revert doc index
* revert doc index
* Trigger Build
* update requirements.txt, remove versions

---------

Co-authored-by: Mark Saroufim <marksaroufim@meta.com>
1 parent 741de70 commit 4440841


6 files changed: +502 -2 lines changed


.gitignore (+1)

@@ -13,6 +13,7 @@ word_language_model/model.pt
 fast_neural_style/saved_models
 fast_neural_style/saved_models.zip
 gcn/cora/
+gat/cora/
 docs/build
 docs/venv

docs/source/index.rst (+1 -1)

@@ -176,4 +176,4 @@ experiment with PyTorch.
 
 This example implements the `Semi-Supervised Classification with Graph Convolutional Networks <https://arxiv.org/pdf/1609.02907.pdf>`__ paper on the CORA database.
 
-`GO TO EXAMPLE <https://github.com/pytorch/examples/blob/main/gcn>`__ :opticon:`link-external`
+`GO TO EXAMPLE <https://github.com/pytorch/examples/blob/main/gcn>`__ :opticon:`link-external`

gat/README.md (+114)

@@ -0,0 +1,114 @@
# Graph Attention Network

This repository contains a PyTorch implementation of **Graph Attention Networks (GAT)**, based on the paper ["Graph Attention Networks" by Velickovic et al.](https://arxiv.org/abs/1710.10903v3).

The Graph Attention Network is a powerful graph neural network model for learning representations on graph-structured data, and it has shown excellent performance in tasks such as node classification, link prediction, and graph classification.

## Overview

The Graph Attention Network (GAT) is a graph neural network architecture designed specifically for handling graph-structured data. It leverages a multi-head attention mechanism to attend over the features of neighboring nodes and learn a representation for each node. This attention mechanism allows the model to focus on relevant neighbors and adaptively weight their contributions during message passing.
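For reference, the per-head attention update described in the paper can be summarized as follows (notation follows Velickovic et al.):

$$
e_{ij} = \mathrm{LeakyReLU}\big(\mathbf{a}^{\top}[\mathbf{W}\mathbf{h}_i \,\Vert\, \mathbf{W}\mathbf{h}_j]\big),
\qquad
\alpha_{ij} = \frac{\exp(e_{ij})}{\sum_{k \in \mathcal{N}_i} \exp(e_{ik})},
\qquad
\mathbf{h}_i' = \sigma\Big(\sum_{j \in \mathcal{N}_i} \alpha_{ij}\,\mathbf{W}\mathbf{h}_j\Big)
$$

where $\mathcal{N}_i$ is the neighborhood of node $i$ (including $i$ itself), and $\mathbf{W}$ and $\mathbf{a}$ are learned parameters shared across all nodes. With $K$ attention heads, the per-head outputs are concatenated in hidden layers and averaged in the final (prediction) layer.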
Check out the following resources for more info on GATs:

- [Blog post by the first author, Petar Velickovic](https://petar-v.com/GAT/)
- [Main paper](https://doi.org/10.48550/arXiv.1710.10903)

This repository provides a clean and concise PyTorch implementation of the official GAT model. The code is well documented and easy to understand, making it a valuable resource for researchers and practitioners interested in graph deep learning.

## Key Features

- **GAT Model**: Implementation of the Graph Attention Network model with multi-head attention, based on the paper "Graph Attention Networks" by Velickovic et al.
- **Graph Attention Layers**: Implementation of graph attention layers that aggregate information from neighboring nodes using a self-attention mechanism to learn node importance weights.
- **Training and Evaluation**: Code for training GAT models on graph-structured data and evaluating their performance on node classification tasks on the *Cora* benchmark dataset.
---
# Requirements

- Python 3.7 or higher
- PyTorch 2.0 or higher
- Requests 2.31 or higher
- NumPy 1.24 or higher

# Dataset

The implementation includes support for the Cora dataset, a standard benchmark for graph-based machine learning tasks. The Cora dataset consists of scientific publications, where nodes represent papers and edges represent citation relationships. Each paper is described by a binary bag-of-words feature vector and is labeled with one of seven classes. The dataset is downloaded, preprocessed, and ready to use.
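For orientation, the raw Cora distribution is commonly shipped as two text files: `cora.content` (one line per paper: paper id, 1433 binary word features, class label) and `cora.cites` (one citation pair per line). Below is a minimal parsing sketch along those lines; the file paths and the `load_cora` helper are illustrative, and the actual download and preprocessing in `main.py` may differ in its details.

```python
import numpy as np
import torch

def load_cora(content_path="cora/cora.content", cites_path="cora/cora.cites"):
    # cora.content: <paper_id> <1433 binary word features> <class_label>
    content = np.genfromtxt(content_path, dtype=str)
    features = torch.tensor(content[:, 1:-1].astype(np.float32))
    classes = sorted(set(content[:, -1]))
    labels = torch.tensor([classes.index(c) for c in content[:, -1]])
    index = {paper_id: i for i, paper_id in enumerate(content[:, 0])}

    # cora.cites: <id of cited paper> <id of citing paper>
    edges = np.genfromtxt(cites_path, dtype=str)
    adj = torch.eye(len(index))                    # self-loops on the diagonal
    for cited, citing in edges:
        i, j = index[citing], index[cited]
        adj[i, j] = adj[j, i] = 1.0                # treat citations as undirected
    return features, labels, adj
```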
# Model Architecture

The official architecture (used in this project), proposed in the paper "Graph Attention Networks" by Velickovic et al., consists of two graph attention layers that incorporate multi-head attention during message transformation and aggregation. Each graph attention layer applies a shared self-attention mechanism to every node in the graph, allowing nodes to learn different representations based on the importance of their neighbors.

In terms of activation functions, the GAT model employs both the **Exponential Linear Unit (ELU)** and the **Leaky Rectified Linear Unit (LeakyReLU)** activations, which introduce non-linearity into the model. ELU is used as the activation function for the **hidden layers**, while LeakyReLU is applied to the **attention coefficients** to ensure non-zero gradients for negative values.

Following the official implementation, the first GAT layer consists of **K = 8 attention heads** computing **F' = 8 features** each (for a **total of 64 features**), followed by an exponential linear unit (ELU) activation on the layer outputs. The second GAT layer is used for classification: a **single attention head** that computes C features (where C is the number of classes), followed by a softmax activation for probabilistic outputs (we use log-softmax instead, for computational convenience when training with NLLLoss).

*Note that, since this is an educational example, the implementation uses the full dense form of the graph's adjacency matrix rather than the sparse form, so all operations in the model are carried out on dense tensors. This does not affect the model's accuracy, but a sparse-friendly implementation would be more efficient in terms of memory, storage, and speed.*
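To make the dense formulation concrete, here is a minimal, self-contained sketch of a multi-head graph attention layer that operates on a dense adjacency matrix, stacked into the two-layer architecture described above (8 heads of 8 features, then a single-head classification layer with log-softmax). It is an illustration of the technique rather than the code in `main.py`; the class names and constructor arguments are this sketch's own.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseGATLayer(nn.Module):
    """One multi-head graph attention layer over a dense (N, N) adjacency matrix."""

    def __init__(self, in_dim, out_dim, num_heads, concat=True, dropout_p=0.6):
        super().__init__()
        self.num_heads, self.out_dim, self.concat = num_heads, out_dim, concat
        # Shared linear projection W: one out_dim-sized slice per head.
        self.W = nn.Linear(in_dim, num_heads * out_dim, bias=False)
        # Attention vector a, split into its "source" and "neighbor" halves.
        self.a_src = nn.Parameter(torch.empty(num_heads, out_dim))
        self.a_dst = nn.Parameter(torch.empty(num_heads, out_dim))
        nn.init.xavier_uniform_(self.W.weight)
        nn.init.xavier_uniform_(self.a_src)
        nn.init.xavier_uniform_(self.a_dst)
        self.dropout = nn.Dropout(dropout_p)

    def forward(self, h, adj):
        # h: (N, in_dim) node features; adj: (N, N) dense adjacency with self-loops.
        N = h.size(0)
        g = self.W(h).view(N, self.num_heads, self.out_dim)              # (N, H, F')
        # e_ij = LeakyReLU(a^T [W h_i || W h_j]), computed by broadcasting the two halves.
        e_src = (g * self.a_src).sum(dim=-1)                             # (N, H)
        e_dst = (g * self.a_dst).sum(dim=-1)                             # (N, H)
        e = F.leaky_relu(e_src.unsqueeze(1) + e_dst.unsqueeze(0), 0.2)   # (N, N, H)
        # Mask non-edges so the softmax normalizes only over each node's neighborhood.
        e = e.masked_fill(adj.unsqueeze(-1) == 0, float("-inf"))
        alpha = self.dropout(torch.softmax(e, dim=1))                    # (N, N, H)
        out = torch.einsum("ijh,jhf->ihf", alpha, g)                     # (N, H, F')
        # Concatenate heads in hidden layers, average them in the output layer.
        return out.flatten(1) if self.concat else out.mean(dim=1)


class GAT(nn.Module):
    """Two-layer GAT: K = 8 heads of F' = 8 features, then a single-head classifier."""

    def __init__(self, in_dim=1433, head_dim=8, num_classes=7, num_heads=8):
        super().__init__()
        self.layer1 = DenseGATLayer(in_dim, head_dim, num_heads, concat=True)
        self.layer2 = DenseGATLayer(head_dim * num_heads, num_classes, 1, concat=False)

    def forward(self, x, adj):
        x = F.elu(self.layer1(x, adj))                     # ELU on the hidden layer
        return F.log_softmax(self.layer2(x, adj), dim=-1)  # log-softmax for NLLLoss
```

For Cora, `in_dim=1433` and `num_classes=7`; feeding the returned log-probabilities to `torch.nn.NLLLoss` recovers the cross-entropy objective described above.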
# Usage

Training and evaluating the GAT model on the Cora dataset can be done by running the `main.py` script as follows:

1. Clone the PyTorch examples repository:

```
git clone https://github.com/pytorch/examples.git
cd examples/gat
```

2. Install the required dependencies:

```
pip install -r requirements.txt
```

3. Train the GAT model by running the `main.py` script (example using the default parameters):

```bash
python main.py --epochs 300 --lr 0.005 --l2 5e-4 --dropout-p 0.6 --num-heads 8 --hidden-dim 64 --val-every 20
```

In more detail, the `main.py` script receives the following arguments:

```
usage: main.py [-h] [--epochs EPOCHS] [--lr LR] [--l2 L2] [--dropout-p DROPOUT_P] [--hidden-dim HIDDEN_DIM] [--num-heads NUM_HEADS] [--concat-heads] [--val-every VAL_EVERY]
               [--no-cuda] [--no-mps] [--dry-run] [--seed S]

PyTorch Graph Attention Network

options:
  -h, --help            show this help message and exit
  --epochs EPOCHS       number of epochs to train (default: 300)
  --lr LR               learning rate (default: 0.005)
  --l2 L2               weight decay (default: 6e-4)
  --dropout-p DROPOUT_P
                        dropout probability (default: 0.6)
  --hidden-dim HIDDEN_DIM
                        dimension of the hidden representation (default: 64)
  --num-heads NUM_HEADS
                        number of the attention heads (default: 4)
  --concat-heads        whether to concatenate attention heads, or average over them (default: False)
  --val-every VAL_EVERY
                        epochs to wait between printing training and validation evaluation (default: 20)
  --no-cuda             disables CUDA training
  --no-mps              disables macOS GPU training
  --dry-run             quickly check a single pass
  --seed S              random seed (default: 13)
```
# Results

After training for **300 epochs** with the default hyperparameters on random train/val/test splits, the GAT model achieves around **81.25%** classification accuracy on the test split. This is comparable to the performance reported in the original paper. However, results can vary due to the randomness of the train/val/test split.
# Reference
```
@article{
  velickovic2018graph,
  title={Graph Attention Networks},
  author={Veli{\v{c}}kovi{\'{c}}, Petar and Cucurull, Guillem and Casanova, Arantxa and Romero, Adriana and Li{\`{o}}, Pietro and Bengio, Yoshua},
  journal={International Conference on Learning Representations},
  year={2018},
  url={https://openreview.net/forum?id=rJXMpikCZ},
}
```

- Paper on arXiv: [arXiv:1710.10903v3](https://doi.org/10.48550/arXiv.1710.10903)
- Original paper repository: [https://github.com/PetarV-/GAT](https://github.com/PetarV-/GAT)
