This is an extensive and continuously updated compilation of self-supervised GFM literature, categorized by the knowledge-based taxonomy proposed in our paper 📄A Survey on Self-Supervised Graph Foundation Models: Knowledge-Based Perspective. Every pretext of each paper is listed and briefly explained. All pretexts and their corresponding papers are given below with detailed metadata, including additional pretexts and literature not covered in our paper.
A kind reminder: to search for a certain paper, type the title or the abbreviation of the proposed method (recommended) into the browser search bar (Ctrl + F). Some papers fall under multiple sections.
- [8 Feb 2025]: Updated papers in ICLR'25, WWW'25 and more.
- [5 Dec 2024]: Updated papers in WSDM'25, LoG'24 and more.
- [4 Oct 2024]: Updated papers in CIKM'24 and NeurIPS'24.
- [2 Sept 2024]: Updated papers in IJCAI'24, SIGIR'24, and KDD'24.
- [1 Aug 2024]: A huge update, thanks to Dr. Yixin Su joining us! Please check the new version of our survey here!🔥
- [1 Aug 2024]: Updated papers in ICDE'24 and MM'24.
- [24 Mar 2024]: Our survey has been uploaded to arXiv!
- Microscopic pre-training
- Mesoscopic pre-training
- Macroscopic pre-training
Note: 🕸️ graph-related; 🤖 LLM-related; 📚 survey; 📊 benchmark; 🔬 empirical study
Node features
Feature prediction
- Feature prediction: to predict the original node features by decoding low-dimensional representations
- Feature denoising: to add (generally continuous, e.g. isotropic Gaussian) noise to the original features and try to reconstruct them
- Masked feature prediction: a special, discrete case of feature denoising, which predicts the original features of masked nodes from the representations of unmasked ones. It is "autoregressive" if the predicted nodes are generated one-by-one
- Replaced node prediction: to replace some nodes with different ones and learn to find and reconstruct the replaced nodes
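
To make the masked feature prediction objective above concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the method of any paper listed below: the helper names (`mask_features`, `mean_aggregate`, `masked_mse`) are hypothetical, masking is done by zeroing features, and a single untrained neighbor-mean step stands in for a learned GNN encoder and decoder.

```python
import random


def mask_features(features, mask_rate, rng):
    """Zero out the features of a random subset of nodes.

    Returns the masked copy and the sorted ids of the masked nodes.
    """
    n = len(features)
    num_masked = max(1, int(round(mask_rate * n)))
    masked_ids = sorted(rng.sample(range(n), num_masked))
    masked = [row[:] for row in features]
    for i in masked_ids:
        masked[i] = [0.0] * len(features[i])
    return masked, masked_ids


def mean_aggregate(features, neighbors):
    """Assumed stand-in for encoder + decoder: predict each node's
    features as the mean of its neighbors' (possibly masked) features."""
    dim = len(features[0])
    out = []
    for i in range(len(features)):
        nbrs = neighbors[i]
        if not nbrs:
            out.append([0.0] * dim)
            continue
        out.append([sum(features[j][d] for j in nbrs) / len(nbrs)
                    for d in range(dim)])
    return out


def masked_mse(original, predicted, masked_ids):
    """Mean squared reconstruction error, computed on masked nodes only."""
    total, count = 0.0, 0
    for i in masked_ids:
        for x, y in zip(original[i], predicted[i]):
            total += (x - y) ** 2
            count += 1
    return total / count


# Toy path graph 0-1-2-3 with 2-dimensional node features.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]

rng = random.Random(0)
masked, masked_ids = mask_features(features, mask_rate=0.5, rng=rng)
predicted = mean_aggregate(masked, neighbors)
loss = masked_mse(features, predicted, masked_ids)
print(masked_ids, loss)
```

In an actual pre-training setup the neighbor-mean step would be replaced by a trainable model and `loss` would be minimized by gradient descent. Feature denoising follows the same pattern with additive noise in place of zeroing.
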
Paper | Venue | Pretext | Downstream | Code |
---|---|---|---|---|
MGAE: Marginalized Graph Autoencoder for Graph Clustering | CIKM'17 | Feature prediction | Graph partitioning | link |
Symmetric Graph Convolutional Autoencoder for Unsupervised Graph Representation Learning (GALA) | ICCV'19 | Feature prediction | Node clustering; link prediction; image clustering | link |
Strategies for Pre-training Graph Neural Networks (AttrMask) | ICLR'20 | Masked feature prediction | Graph classification; biological function prediction | link |
Graph Representation Learning via Graphical Mutual Information Maximization (GMI) | WWW'20 | Feature prediction (JS) | Node classification; link prediction | link |
When Does Self-Supervision Help Graph Convolutional Networks? (GraphComp) | ICML'20 | Masked feature prediction | Node classification | link |
GPT-GNN: Generative Pre-Training of Graph Neural Networks | KDD'20 | Masked feature prediction (autoregressive) | Node classification; (heterogeneous) link prediction; edge regression | link |
Graph Attention Auto-Encoders (GATE) | ICTAI'20 | Feature prediction | Node classification | link |
Graph-Bert: Only Attention is Needed for Learning Graph Representations | arXiv:2001 | Feature prediction | Node classification; node clustering | link |
Self-supervised Learning on Graphs: Deep Insights and New Direction (AttributeMask) | arXiv:2006 | Masked feature prediction | Node classification | link |
SLAPS: Self-Supervision Improves Structure Learning for Graph Neural Networks | NeurIPS'21 | Masked feature prediction | Node classification; image classification | link |
Motif-based Graph Self-Supervised Learning for Molecular Property Prediction (MGSSL) | NeurIPS'21 | Masked feature prediction | Graph classification | link |
Multi-Scale Variational Graph AutoEncoder for Link Prediction (MSVGAE) | WSDM'22 | Feature prediction | Link prediction | -- |
Self-Supervised Representation Learning via Latent Graph Prediction (LaGraph) | ICML'22 | Masked feature prediction | Node classification; graph classification | link |
GraphMAE: Self-Supervised Masked Graph Autoencoders | KDD'22 | Masked feature prediction | Node classification; graph classification | link |
Interpretable Node Representation with Attribute Decoding (NORAD) | TMLR'22 | Feature prediction | Node classification; node clustering; link prediction | -- |
Graph Masked Autoencoders with Transformers (GMAE) | arXiv:2202 | Masked feature prediction | Node classification; graph classification | link |
Wiener Graph Deconvolutional Network Improves Graph Self-Supervised Learning (WGDN) | AAAI'23 | Feature prediction | Node classification; graph classification | link |
Heterogeneous Graph Masked Autoencoders (HGMAE) | AAAI'23 | Feature prediction; masked feature prediction | (Heterogeneous) node classification; node clustering | link |
Mole-BERT: Rethinking Pre-training Graph Neural Networks for Molecules | ICLR'23 | Masked feature prediction | Graph classification; graph regression | link |
Learning Fair Graph Representations via Automated Data Augmentations (Graphair) | ICLR'23 | Masked feature prediction | Node classification | link |
GraphMAE2: A Decoding-Enhanced Masked Self-Supervised Graph Learner | WWW'23 | Masked feature prediction | Node classification | link |
SeeGera: Self-supervised Semi-implicit Graph Variational Auto-encoders with Masking | WWW'23 | Masked feature prediction | Node classification; link prediction; attribute prediction | link |
Patton: Language Model Pretraining on Text-Rich Networks | ACL'23 | Masked feature prediction | Node classification; link prediction; etc | link |
Directional Diffusion Models for Graph Representation Learning (DDM) | NeurIPS'23 | Feature denoising | Node classification; graph classification | link |
DiP-GNN: Discriminative Pre-Training of Graph Neural Networks | NeurIPS Workshop (GLFrontiers)'23 | Masked feature prediction | Node classification; link prediction | -- |
Towards Effective and Robust Graph Contrastive Learning With Graph Autoencoding (AEGCL) | TKDE'23 | Feature prediction | Node classification; node clustering; link prediction | link |
RARE: Robust Masked Graph Autoencoder | TKDE'23 | Masked feature prediction | Node classification; graph classification; image classification | link |
Homophily-Enhanced Self-Supervision for Graph Structure Learning: Insights and Directions (HES-GSL) | TNNLS'23 | Feature denoising | Node classification | link |
SGL-PT: A Strong Graph Learner with Graph Prompt Tuning | arXiv:2302 | Masked feature prediction | Node classification; graph classification | -- |
Incomplete Graph Learning via Attribute-Structure Decoupled Variational Auto-Encoder (ASD-VAE) | WSDM'24 | Feature prediction | Node classification; node attribute completion | link |
Deep Contrastive Graph Learning with Clustering-Oriented Guidance (DCGL) | AAAI'24 | Feature prediction | Node clustering | link |
Rethinking Graph Masked Autoencoders through Alignment and Uniformity (AUG-MAE) | AAAI'24 | Masked feature prediction | Node classification; graph classification | link |
Empowering Dual-Level Graph Self-Supervised Pretraining with Motif Discovery (DGPM) | AAAI'24 | Masked feature prediction | Graph classification | link |
Generative and Contrastive Paradigms Are Complementary for Graph Self-Supervised Learning (GCMAE1) | ICDE'24 | Masked feature prediction | Node classification; node clustering; graph classification; link prediction | link |
Masked Graph Modeling with Multi-View Contrast (GCMAE2) | ICDE'24 | Masked feature prediction | Node classification; graph classification; link prediction | link |
DiscoGNN: A Sample-Efficient Framework for Self-Supervised Graph Representation Learning | ICDE'24 | Replaced node prediction | Graph classification; similarity search | link |
IdmGAE: Importance-Inspired Dynamic Masking for Graph Autoencoders | SIGIR'24 (short) | Masked feature prediction | Node classification | -- |
Where to Mask: Structure-Guided Masking for Graph Masked Autoencoders (StructMAE) | IJCAI'24 | Masked feature prediction | Graph classification | link |
A Pure Transformer Pretraining Framework on Text-attributed Graphs (GSPT) | LoG'24 | Masked feature prediction1 | Node classification; link prediction | link (unavailable) |
HC-GAE: The Hierarchical Cluster-based Graph Auto-Encoder for Graph Representation Learning | NeurIPS'24 | Feature prediction | Node classification; graph classification | -- |
Redundancy Is Not What You Need: An Embedding Fusion Graph Auto-Encoder for Self-Supervised Graph Representation Learning (EFGAE) | TNNLS'24 | Feature prediction | Node classification | -- |
UniGraph: Learning a Unified Cross-Domain Foundation Model for Text-Attributed Graphs | arXiv:2402 | Masked feature prediction | Node classification; graph classification; edge classification | link |
Exploring Task Unification in Graph Representation Learning via Generative Approach (GA2E) | arXiv:2403 | Masked feature prediction | Node classification; graph classification; link prediction | -- |
Training MLPs on Graphs without Supervision (SimMLP) | WSDM'25 | Feature prediction | Node classification; graph classification; link prediction | link |
UniGraph2: Learning a Unified Embedding Space to Bind Multimodal Graphs | WWW'25 | Masked feature prediction | Node classification; link prediction; edge classification | link (unavailable) |
Hierarchical Vector Quantized Graph Autoencoder with Annealing-Based Code Selection (HQA-GAE) | WWW'25 | Masked feature prediction | Node classification; link prediction | link |
Discrimination (contrastive)
- Node instance discrimination: to minimize/maximize the distance between pairs of positive/negative node representations. Jensen-Shannon (JS), InfoNCE (incl. NT-Xent), Triplet margin, and Bootstrapping are all estimators of mutual information (MI) between nodes. Other contrastive losses:
  - MSE stands for the mean squared error ($\ell_2$ loss)
  - SP stands for the population spectral contrastive loss
  - BPR stands for Bayesian Personalized Ranking loss, mostly used in recommendation
  - Other stands for any other loss not covered above
- Dimension discrimination: to minimize/maximize the mutual information (MI) between pairs of positive/negative representation dimensions. Could be either intra-sample or inter-sample
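
As an illustration of node instance discrimination with an InfoNCE-style loss, the sketch below scores two views of the same node set. It is a simplified example rather than the exact loss of any listed paper: the function names are hypothetical, embeddings are assumed to be nonzero, only inter-view negatives are used (some methods also add intra-view negatives), and the temperature of 0.5 is an arbitrary choice.

```python
import math


def cosine(u, v):
    """Cosine similarity between two nonzero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(view_a, view_b, temperature=0.5):
    """Average InfoNCE loss over nodes.

    The positive pair for node i is (view_a[i], view_b[i]);
    all other nodes in view_b serve as negatives.
    """
    n = len(view_a)
    total = 0.0
    for i in range(n):
        logits = [cosine(view_a[i], view_b[j]) / temperature
                  for j in range(n)]
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - logits[i]
    return total / n


# Two nodes with 2-dimensional embeddings in each view.
view_a = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]   # each node matches itself
swapped = [[0.0, 1.0], [1.0, 0.0]]   # each node matches the other node

print(info_nce(view_a, aligned))  # lower: positives are the most similar pairs
print(info_nce(view_a, swapped))  # higher: positives are the least similar pairs
```

The loss decreases as each node's two views become more similar to each other than to the views of other nodes, which is the behavior the definition above describes.
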
Paper | Venue | Pretext | Downstream | Code |
---|---|---|---|---|
Deep Graph Contrastive Representation Learning (GRACE) | ICML Workshop (GRL+)'20 | Node instance discrimination (InfoNCE) | Node classification | link |
GraphTER: Unsupervised Learning of Graph Transformation Equivariant Representations via Auto-Encoding Node-wise Transformations | CVPR'20 | Node instance discrimination (MSE) | Node classification (point cloud segmentation); graph (point cloud) classification | link |
Contrastive and Generative Graph Convolutional Networks for Graph-based Semi-Supervised Learning (CG3) | AAAI'21 | Node instance discrimination (InfoNCE) | Node classification | link |
Graph Contrastive Learning with Adaptive Augmentation (GCA) | WWW'21 | Node instance discrimination (InfoNCE) | Node classification | link |
SelfGNN: Self-supervised Graph Neural Networks without Explicit Negative Sampling | WWW Workshop (SSL)'21 | Node instance discrimination (Bootstrapping) | Node classification | link |
Self-supervised Graph Learning for Recommendation (SGL) | SIGIR'21 | Node instance discrimination (InfoNCE, BPR) | Recommendation | link |
Multi-Scale Contrastive Siamese Networks for Self-Supervised Graph Representation Learning (MERIT) | IJCAI'21 | Node instance discrimination (InfoNCE) | Node classification | link |
Pre-training on Large-Scale Heterogeneous Graph (PT-HGNN) | KDD'21 | Node instance discrimination (InfoNCE) | (Heterogeneous) node classification; link prediction | link |
Self-supervised Heterogeneous Graph Neural Network with Co-contrastive Learning (HeCo); Hierarchical Contrastive Learning Enhanced Heterogeneous Graph Neural Network (HeCo++) | KDD'21; TKDE'23 | Node instance discrimination (InfoNCE) | (Heterogeneous) node classification; node clustering | link |
InfoGCL: Information-Aware Graph Contrastive Learning | NeurIPS'21 | Node instance discrimination (Bootstrapping) | Node classification; graph classification | -- |
From Canonical Correlation Analysis to Self-supervised Graph Neural Networks (CCA-SSG) | NeurIPS'21 | Node instance discrimination (MSE); dimension discrimination | Node classification | link |
Self-Supervised GNN that Jointly Learns to Augment (GraphSurgeon) | NeurIPS Workshop (SSL)'21 | Node instance discrimination (MSE); dimension discrimination | Node classification | link |
Simple Unsupervised Graph Representation Learning (SUGRL) | AAAI'22 | Node instance discrimination (Triplet margin) | Node classification | link |
Large-Scale Representation Learning on Graphs via Bootstrapping (BGRL) | ICLR'22 | Node instance discrimination (Bootstrapping) | Node classification | link |
VICReg: Variance-Invariance-Covariance Regularization for Self-Supervised Learning | ICLR'22 | Node instance discrimination (MSE); dimension discrimination | Node classification | link |
Adversarial Graph Contrastive Learning with Information Regularization (ARIEL) | WWW'22 | Node instance discrimination (InfoNCE) | Node classification; graph classification | link |
Are Graph Augmentations Necessary? Simple Graph Contrastive Learning for Recommendation (SimGCL); XSimGCL: Towards Extremely Simple Graph Contrastive Learning for Recommendation | SIGIR'22; TKDE'23 | Node instance discrimination (InfoNCE, BPR) | Recommendation | link |
Self-Supervised Representation Learning via Latent Graph Prediction (LaGraph) | ICML'22 | Node instance discrimination (MSE) | Node classification; graph classification | link |
ProGCL: Rethinking Hard Negative Mining in Graph Contrastive Learning | ICML'22 | Node instance discrimination (InfoNCE) | Node classification | link |
COSTA: Covariance-Preserving Feature Augmentation for Graph Contrastive Learning | KDD'22 | Node instance discrimination (InfoNCE) | Node classification | link |
Relational Self-Supervised Learning on Graphs (RGRL) | CIKM'22 | Node instance discrimination (Bootstrapping) | Node classification; link prediction | link |
Revisiting Graph Contrastive Learning from the Perspective of Graph Spectrum (SpCo) | NeurIPS'22 | Node instance discrimination (InfoNCE) | Node classification | link |
Contrastive Graph Structure Learning via Information Bottleneck for Recommendation (CGI) | NeurIPS'22 | Node instance discrimination (InfoNCE) | Recommendation | link |
Uncovering the Structural Fairness in Graph Contrastive Learning (GRADE) | NeurIPS'22 | Node instance discrimination (InfoNCE) | Node classification | link |
Co-Modality Graph Contrastive Learning for Imbalanced Node Classification (CM-GCL) | NeurIPS'22 | Node instance discrimination (InfoNCE) | Node classification (imbalanced) | link |
Graph Barlow Twins: A Self-supervised Representation Learning Framework for Graphs (G-BT) | KBS'22 | Dimension discrimination | Node classification | link |
Towards Graph Self-Supervised Learning with Contrastive Adjusted Zooming (G-Zoom) | TNNLS'22 | Node instance discrimination (InfoNCE) | Node classification | -- |
GRLC: Graph Representation Learning With Constraints | TNNLS'22 | Node instance discrimination (Triplet margin) | Node classification; node clustering; link prediction | link |
Neural Eigenfunctions Are Structured Representation Learners (NeuralEF) | arXiv:2210 | Dimension discrimination | Node classification; computer vision (object detection, instance segmentation, etc) | link |
MA-GCL: Model Augmentation Tricks for Graph Contrastive Learning | AAAI'23 | Node instance discrimination (InfoNCE) | Node classification | link |
ImGCL: Revisiting Graph Contrastive Learning on Imbalanced Node Classification | AAAI'23 | Node instance discrimination (InfoNCE) | Node classification (imbalanced) | -- |
Spectral Feature Augmentation for Graph Contrastive Learning and Beyond (SFA) | AAAI'23 | Node instance discrimination (Other) | Node classification; node clustering; graph classification; image classification | link |
Beyond Smoothing: Unsupervised Graph Representation Learning with Edge Heterophily Discriminating (GREET) | AAAI'23 | Node instance discrimination (Triplet margin) | Node classification | link |
Link Prediction with Non-Contrastive Learning (T-BGRL) | ICLR'23 | Node instance discrimination (Bootstrapping) | Link prediction | link |
LightGCL: Simple Yet Effective Graph Contrastive Learning for Recommendation | ICLR'23 | Node instance discrimination (InfoNCE) | Recommendation | link |
Learning Fair Graph Representations via Automated Data Augmentations (Graphair) | ICLR'23 | Node instance discrimination (InfoNCE) | Node classification | link |
GraphMAE2: A Decoding-Enhanced Masked Self-Supervised Graph Learner | WWW'23 | Node instance discrimination (MSE) | Node classification | link |
Graph Self-supervised Learning with Augmentation-aware Contrastive Learning (ABGML) | WWW'23 | Node instance discrimination (Bootstrapping) | Node classification; node clustering; similarity search | link |
Randomized Schur Complement Views for Graph Contrastive Learning (rLap) | ICML'23 | Node instance discrimination (InfoNCE, Bootstrapping) | Node classification | link |
Graph Contrastive Learning with Generative Adversarial Network (GACN) | KDD'23 | Node instance discrimination (InfoNCE, BPR) | Node classification; link prediction | -- |
Heterformer: Transformer-based Deep Node Representation Learning on Heterogeneous Text-Rich Networks | KDD'23 | Node instance discrimination (InfoNCE) | (Heterogeneous) node classification; node clustering; link prediction | link |
Exploring Universal Principles for Graph Contrastive Learning: A Statistical Perspective | MM'23 | Dimension discrimination | Node classification | -- |
GiGaMAE: Generalizable Graph Masked Autoencoder via Collaborative Latent Space Reconstruction | CIKM'23 | Node instance discrimination (InfoNCE) | Node classification; node clustering; link prediction | link |
GRENADE: Graph-Centric Language Model for Self-Supervised Representation Learning on Text-Attributed Graphs | EMNLP Findings'23 | Node instance discrimination (InfoNCE, KL) | Node classification; node clustering; link prediction | link |
Provable Training for Graph Contrastive Learning (POT) | NeurIPS'23 | Node instance discrimination (InfoNCE) | Node classification | link |
Graph Contrastive Learning with Stable and Scalable Spectral Encoding (Sp2GCL) | NeurIPS'23 | Node instance discrimination (InfoNCE) | Node classification; graph classification; graph regression | link |
Certifiably Robust Graph Contrastive Learning (RES) | NeurIPS'23 | Node instance discrimination (InfoNCE) | Node classification | link |
RARE: Robust Masked Graph Autoencoder | TKDE'23 | Node instance discrimination (MSE) | Node classification; graph classification; computer vision (image classification) | link |
Multi-Scale Self-Supervised Graph Contrastive Learning With Injective Node Augmentation (MS-CIA) | TKDE'23 | Node instance discrimination (InfoNCE) | Node classification | -- |
Boosting Graph Contrastive Learning via Adaptive Sampling (AdaS) | TNNLS'23 | Node instance discrimination (InfoNCE) | Node classification | -- |
Affinity Uncertainty-Based Hard Negative Mining in Graph Contrastive Learning (AUGCL) | TNNLS'23 | Node instance discrimination (InfoNCE) | Node classification | link |
Unsupervised Structure-Adaptive Graph Contrastive Learning | TNNLS'23 | Node instance discrimination (InfoNCE) | Node classification; node clustering; graph classification | -- |
Hierarchically Contrastive Hard Sample Mining for Graph Self-Supervised Pretraining (HCHSM) | TNNLS'23 | Node instance discrimination (JS) | Node classification; node clustering | link |
Dual Contrastive Learning Network for Graph Clustering (DCLN) | TNNLS'23 | Dimension discrimination | Node classification; node clustering | link |
Graph Contrastive Learning With Adaptive Proximity-Based Graph Augmentation (PA-GCL) | TNNLS'23 | Dimension discrimination | Node classification; link prediction | link |
Augmentation-Free Graph Contrastive Learning of Invariant-Discriminative Representations (iGCL) | TNNLS'23 | Node instance discrimination (MSE); dimension discrimination | Node classification | link |
Single-Pass Contrastive Learning Can Work for Both Homophilic and Heterophilic Graph (SP-GCL) | TMLR'23 | Node instance discrimination (SP) | Node classification | link |
Calibrating and Improving Graph Contrastive Learning (Contrast-Reg) | TMLR'23 | Node instance discrimination (InfoNCE) | Node classification; node clustering; link prediction | link |
Oversmoothing: A Nightmare for Graph Contrastive Learning? (BlockGCL) | arXiv:2306 | Dimension discrimination | Node classification | link |
Rethinking and Simplifying Bootstrapped Graph Latents (SGCL2) | WSDM'24 | Node instance discrimination (Bootstrapping) | Node classification | link |
Towards Alignment-Uniformity Aware Representation in Graph Contrastive Learning (AUAR) | WSDM'24 | Node instance discrimination (InfoNCE) | Node classification; node clustering | -- |
ReGCL: Rethinking Message Passing in Graph Contrastive Learning | AAAI'24 | Node instance discrimination (InfoNCE) | Node classification | link |
A New Mechanism for Eliminating Implicit Conflict in Graph Contrastive Learning (PiGCL) | AAAI'24 | Node instance discrimination (InfoNCE) | Node classification; node clustering | link |
ASWT-SGNN: Adaptive Spectral Wavelet Transform-Based Self-Supervised Graph Neural Network | AAAI'24 | Node instance discrimination (InfoNCE) | Node classification; graph classification | -- |
Graph Contrastive Invariant Learning from the Causal Perspective (GCIL) | AAAI'24 | Dimension discrimination | Node classification | link |
A Graph is Worth 1-bit Spikes: When Graph Contrastive Learning Meets Spiking Neural Networks (SpikeGCL) | ICLR'24 | Node instance discrimination (Triplet margin) | Node classification | link |
Self-supervised Heterogeneous Graph Learning: a Homophily and Heterogeneity View (HERO) | ICLR'24 | Node instance discrimination (MSE) | (Heterogeneous) node classification; similarity search | link |
Generative and Contrastive Paradigms Are Complementary for Graph Self-Supervised Learning (GCMAE1) | ICDE'24 | Node instance discrimination (InfoNCE) | Node classification; node clustering; graph classification; link prediction | link |
GradGCL: Gradient Graph Contrastive Learning | ICDE'24 | Node instance discrimination (InfoNCE) | Node classification; graph classification | link |
Incorporating Dynamic Temperature Estimation into Contrastive Learning on Graphs (GLATE) | ICDE'24 | Node instance discrimination (InfoNCE) | Node classification; node clustering; graph classification; link prediction | link |
Graph Augmentation for Recommendation (GraphAug) | ICDE'24 | Node instance discrimination (InfoNCE, BPR) | Recommendation | link |
Graph Contrastive Learning with Cohesive Subgraph Awareness (CTAug) | WWW'24 | Node instance discrimination (InfoNCE) | Node classification | link |
Towards Expansive and Adaptive Hard Negative Mining: Graph Contrastive Learning via Subspace Preserving (GRAPE) | WWW'24 | Node instance discrimination (InfoNCE) | Node classification; node clustering | link |
MARIO: Model Agnostic Recipe for Improving OOD Generalization of Graph Contrastive Learning | WWW'24 | Node instance discrimination (InfoNCE) | Node classification; graph classification | link |
Graph Contrastive Learning via Interventional View Generation (GCL-IVG) | WWW'24 | Node instance discrimination (InfoNCE) | Node classification; node clustering | -- |
Graph Contrastive Learning with Kernel Dependence Maximization for Social Recommendation (CL-KDM) | WWW'24 | Node instance discrimination (InfoNCE, BPR) | Recommendation | -- |
High-Frequency-aware Hierarchical Contrastive Selective Coding for Representation Learning on Text-attributed Graphs (HASH-CODE) | WWW'24 | Node instance discrimination (SP) | Node classification; link prediction | -- |
S3GCL: Spectral, Swift, Spatial Graph Contrastive Learning | ICML'24 | Node instance discrimination (InfoNCE) | Node classification | link |
Geometric View of Soft Decorrelation in Self-Supervised Learning (LogDet) | KDD'24 | Dimension discrimination | Node classification | -- |
Reserving-Masking-Reconstruction Model for Self-Supervised Heterogeneous Graph Representation (RMR) | KDD'24 | Node instance discrimination (Bootstrapping) | (Heterogeneous) node classification | link |
Towards Robust Recommendation via Decision Boundary-aware Graph Contrastive Learning (RGCL2) | KDD'24 | Node instance discrimination (InfoNCE, BPR) | Recommendation | link |
Gaussian Mutual Information Maximization for Efficient Graph Self-Supervised Learning: Bridging Contrastive-based to Decorrelation-based (GMIM) | MM'24 | Dimension discrimination | Node classification | -- |
Exploitation of a Latent Mechanism in Graph Contrastive Learning: Representation Scattering (SGRL) | NeurIPS'24 | Node instance discrimination (Bootstrapping) | Node classification; node clustering | link |
Leveraging Contrastive Learning for Enhanced Node Representations in Tokenized Graph Transformers (GCFormer) | NeurIPS'24 | Node instance discrimination (InfoNCE) | Node classification | link (unavailable) |
Unified Graph Augmentations for Generalized Contrastive Learning on Graphs (GOUDA) | NeurIPS'24 | Node instance discrimination (InfoNCE); dimension discrimination | Node classification; node clustering; graph classification | link |
Heterogeneous Graph Contrastive Learning with Meta-path Contexts and Adaptively Weighted Negative Samples (MEOW) | TKDE'24 | Node instance discrimination (InfoNCE) | (Heterogeneous) node classification; node clustering | link |
Redundancy Is Not What You Need: An Embedding Fusion Graph Auto-Encoder for Self-Supervised Graph Representation Learning (EFGAE) | TNNLS'24 | Dimension discrimination | Node classification | -- |
Multilevel Contrastive Graph Masked Autoencoders for Unsupervised Graph-Structure Learning (MCGMAE) | TNNLS'24 | Node instance discrimination (InfoNCE) | Node classification | -- |
UniGraph: Learning a Unified Cross-Domain Foundation Model for Text-Attributed Graphs | arXiv:2402 | Node instance discrimination (Bootstrapping) | Node classification; graph classification; edge classification | link |
Towards Graph Foundation Models: Learning Generalities Across Graphs via Task-Trees (GIT) | arXiv:2412 | Node instance discrimination (Bootstrapping) | Node classification; graph classification; link prediction; edge classification | link (unavailable) |
Training MLPs on Graphs without Supervision (SimMLP) | WSDM'25 | Node instance discrimination (MSE) | Node classification; graph classification; link prediction | link |
UniGLM: Training One Unified Language Model for Text-Attributed Graph Embedding | WSDM'25 | Node instance discrimination (InfoNCE) | Node classification; link prediction | link |
Graph Structure Refinement with Energy-based Contrastive Learning (ECL-GSR) | AAAI'25 | Node instance discrimination (InfoNCE) | Node classification; graph classification | -- |
Why Does Dropping Edges Usually Outperform Adding Edges in Graph Contrastive Learning? (EPAGCL) | AAAI'25 | Node instance discrimination (InfoNCE) | Node classification | link |
Centrality-guided Pre-training for Graph (CenPre) | ICLR'25 | Node instance discrimination (InfoNCE, MSE) | Node classification; graph classification; link prediction | -- |
Str-GCL: Structural Commonsense Driven Graph Contrastive Learning | WWW'25 | Node instance discrimination (InfoNCE); dimension discrimination | Node classification; node clustering | -- |
Balancing Graph Embedding Smoothness in Self-supervised Learning via Information-Theoretic Decomposition (BSG) | WWW'25 | Node instance discrimination (MSE) | Node classification; link prediction | link (private) |
Node properties
- Property prediction: a regression task to predict the property of a node (e.g. degree)
- Centrality ranking: to estimate whether the centrality score of a node is greater/lower than that of another node
- Node order matching: to match the output node order with the input order
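
The node property pretexts above need self-generated labels. The following sketch shows one way such labels could be built for degree prediction and centrality ranking, together with a pairwise logistic ranking loss. The helper names are hypothetical, degree is used as the centrality score purely for simplicity, and the loss is a common choice rather than the one prescribed by any listed paper.

```python
import math


def degrees(neighbors):
    """Regression targets for property prediction: the degree of each node."""
    return {node: len(nbrs) for node, nbrs in neighbors.items()}


def ranking_pairs(scores):
    """Build (u, v, label) triples for centrality ranking.

    label is 1 if scores[u] > scores[v] and 0 if it is lower; ties are skipped.
    """
    nodes = sorted(scores)
    pairs = []
    for idx, u in enumerate(nodes):
        for v in nodes[idx + 1:]:
            if scores[u] == scores[v]:
                continue
            pairs.append((u, v, 1 if scores[u] > scores[v] else 0))
    return pairs


def pairwise_rank_loss(predicted, pairs):
    """Logistic loss on predicted score differences (assumed loss form)."""
    total = 0.0
    for u, v, label in pairs:
        diff = predicted[u] - predicted[v]
        z = diff if label == 1 else -diff
        # softplus(-z) = -log(sigmoid(z)), written in a stable form.
        total += max(-z, 0.0) + math.log1p(math.exp(-abs(z)))
    return total / len(pairs)


# Toy star graph: node 0 is the hub, nodes 1-3 are leaves.
neighbors = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
deg = degrees(neighbors)
pairs = ranking_pairs(deg)

good_scores = {0: 3.0, 1: 1.0, 2: 1.0, 3: 1.0}   # hub ranked highest
bad_scores = {0: 1.0, 1: 3.0, 2: 3.0, 3: 3.0}    # hub ranked lowest
print(deg, pairs)
print(pairwise_rank_loss(good_scores, pairs), pairwise_rank_loss(bad_scores, pairs))
```

In practice `predicted` would come from a scoring head on top of the node representations, so minimizing the loss pushes the encoder to capture the relative centrality of nodes.
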
Paper | Venue | Pretext | Downstream | Code |
---|---|---|---|---|
Unsupervised Pre-training of Graph Convolutional Networks (ScoreRank) | ICLR Workshop (RLGM)'19 | Centrality ranking | Node classification | -- |
Self-supervised Learning on Graphs: Deep Insights and New Direction (NodeProperty) | arXiv:2006 | Property prediction (degree, clustering coefficient, etc.) | Node classification | link |
Permutation-Invariant Variational Autoencoder for Graph-Level Representation Learning (PIGAE) | NeurIPS'21 | Node order matching | Graph classification | link |
Graph Auto-Encoder Via Neighborhood Wasserstein Reconstruction (NWR-GAE) | ICLR'22 | Property prediction (degree) | Node classification; structural role identification | link |
What's Behind the Mask: Understanding Masked Graph Modeling for Graph Autoencoders (MaskGAE) | KDD'23 | Property prediction (degree) | Node classification; link prediction | link |
Centrality-guided Pre-training for Graph (CenPre) | ICLR'25 | Property prediction (degree) | Node classification; graph classification; link prediction | -- |
Links
- Link prediction: a generally binary classification task that predicts if two nodes are connected by a link. For heterogeneous graphs, link prediction is based on meta-paths. For hypergraphs, link prediction searches for the missing node given other nodes in a hyperedge
- Link denoising: to add (generally continuous) noise to the original edge set and try to reconstruct it
- Masked link prediction: to predict the masked links by node representations propagated on the unmasked graph. It is "autoregressive" if the predicted links are generated one-by-one
- (Masked) edge feature prediction: to predict the original (masked) edge features by node representations
- Replaced edge feature prediction: to replace some edge properties with different ones and learn to find and reconstruct the replaced edges, similar to replaced node prediction
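
A minimal sketch of the masked link prediction setup described above: hide a fraction of edges, sample non-edges as negatives, and score node pairs with a dot product under a binary cross-entropy loss. All function names are hypothetical, and the fixed `embeddings` are placeholders. In a real pipeline they would be produced by a GNN propagating over the visible (unmasked) edges only.

```python
import math
import random


def split_masked_edges(edges, mask_rate, rng):
    """Split undirected edges into visible edges and masked (held-out) edges."""
    edges = list(edges)
    num_masked = max(1, int(round(mask_rate * len(edges))))
    masked_idx = set(rng.sample(range(len(edges)), num_masked))
    visible = [e for i, e in enumerate(edges) if i not in masked_idx]
    masked = [e for i, e in enumerate(edges) if i in masked_idx]
    return visible, masked


def sample_non_edges(num_nodes, edges, k, rng):
    """Sample up to k node pairs that are not connected in the original graph."""
    existing = {frozenset(e) for e in edges}
    candidates = [(u, v)
                  for u in range(num_nodes)
                  for v in range(u + 1, num_nodes)
                  if frozenset((u, v)) not in existing]
    return rng.sample(candidates, min(k, len(candidates)))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def link_bce(embeddings, positives, negatives):
    """Binary cross-entropy with a sigmoid over dot-product scores."""
    total = 0.0
    for u, v in positives:   # -log(sigmoid(score))
        total += softplus(-dot(embeddings[u], embeddings[v]))
    for u, v in negatives:   # -log(1 - sigmoid(score))
        total += softplus(dot(embeddings[u], embeddings[v]))
    return total / (len(positives) + len(negatives))


# Toy graph with 4 nodes and 4 undirected edges.
edges = [(0, 1), (1, 2), (2, 3), (0, 2)]
rng = random.Random(0)
visible, masked = split_masked_edges(edges, mask_rate=0.25, rng=rng)
negatives = sample_non_edges(4, edges, k=len(masked), rng=rng)

# Placeholder embeddings (assumption); normally the output of an encoder
# that only sees the `visible` edges.
embeddings = [[1.0, 0.0], [1.0, 0.2], [0.5, 0.5], [0.0, 1.0]]
loss = link_bce(embeddings, masked, negatives)
print(visible, masked, negatives, loss)
```

Plain link prediction corresponds to the same loss with all edges used as positives and no masking step.
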
Paper | Venue | Pretext | Downstream | Code |
---|---|---|---|---|
Variational Graph Auto-Encoders (GAE, VGAE) | NIPS Workshop (BDL)'16 | Link prediction | Link prediction | link |
Adversarially Regularized Graph Autoencoder for Graph Embedding (ARGA, ARVGA) | IJCAI'18 | Link prediction | Link prediction; node clustering | link |
Unsupervised Pre-training of Graph Convolutional Networks (DenoisingRecon) | ICLR Workshop (RLGM)'19 | Masked link prediction | Node classification | -- |
Graphite: Iterative Generative Modeling of Graphs | ICML'19 | Link prediction | Node classification; link prediction | link |
Semi-Implicit Graph Variational Auto-Encoders (SIG-VAE) | NeurIPS'19 | Link prediction | Node classification; link prediction; node clustering; graph generation | link |
Strategies for Pre-training Graph Neural Networks (AttrMask) | ICLR'20 | Masked edge feature prediction | Graph classification; biological function prediction | link |
GPT-GNN: Generative Pre-Training of Graph Neural Networks | KDD'20 | Masked link prediction (autoregressive) | Node classification; link prediction; edge classification | link |
Self-supervised Auxiliary Learning with Meta-paths for Heterogeneous Graphs (SELAR) | NeurIPS'20 | Link prediction | (Heterogeneous) node classification; link prediction | link |
Self-supervised Learning on Graphs: Deep Insights and New Direction (EdgeMask) | arXiv:2006 | Masked link prediction | Node classification | link |
Contrastive and Generative Graph Convolutional Networks for Graph-based Semi-Supervised Learning (CG3) | AAAI'21 | Link prediction | Node classification | link |
How to Find Your Friendly Neighborhood: Graph Attention Design with Self-Supervision (SuperGAT) | ICLR'21 | Link prediction | Node classification; link prediction | link |
Permutation-Invariant Variational Autoencoder for Graph-Level Representation Learning (PIGAE) | NeurIPS'21 | Link prediction; edge feature prediction | Graph classification | link |
Motif-based Graph Self-Supervised Learning for Molecular Property Prediction (MGSSL) | NeurIPS'21 | Masked edge feature prediction | Graph classification | link |
GraphFormers: GNN-nested Transformers for Representation Learning on Textual Graph | NeurIPS'21 | Link prediction | Link prediction | link |
Self-Supervised Graph Representation Learning via Topology Transformations (TopoTER) | TKDE'21 | Masked link prediction | Node classification; graph classification; link prediction | link |
Directed Graph Auto-Encoders (DiGAE) | AAAI'22 | Link prediction | (Directed) link prediction | link |
GPPT: Graph Pre-training and Prompt Tuning to Generalize Graph Neural Networks | KDD'22 | Masked link prediction | Node classification | link |
Link Prediction with Contextualized Self-Supervision (CSSL2) | TKDE'22 | Link prediction | Link prediction | link |
Interpretable Node Representation with Attribute Decoding (NORAD) | TMLR'22 | Link prediction | Node classification; node clustering; link prediction | -- |
S2GAE: Self-Supervised Graph Autoencoders are Generalizable Learners with Graph Masking | WSDM'23 | Masked link prediction | Node classification; graph classification; link prediction | link |
Dual Low-Rank Graph Autoencoder for Semantic and Topological Networks (DLR-GAE) | AAAI'23 | Link prediction | Node classification | link |
Heterogeneous Graph Masked Autoencoders (HGMAE) | AAAI'23 | Masked link prediction | (Heterogeneous) node classification; node clustering | link |
Deep Manifold Graph Auto-Encoder for Attributed Graph Embedding (DMGAE, DMVGAE) | ICASSP'23 | Link prediction | Node clustering; link prediction | -- |
Learning Fair Graph Representations via Automated Data Augmentations (Graphair) | ICLR'23 | Masked link prediction | Node classification | link |
Multi-head Variational Graph Autoencoder Constrained by Sum-product Networks (SPN-MVGAE) | WWW'23 | Link prediction | Node classification; link prediction | link (unavailable) |
SeeGera: Self-supervised Semi-implicit Graph Variational Auto-encoders with Masking | WWW'23 | Masked link prediction | Node classification; link prediction; attribute prediction | link |
Graph-Aware Language Model Pre-Training on a Large Graph Corpus Can Help Multiple Graph Applications (GALM) | KDD'23 | Link prediction | (Heterogeneous) node classification; link prediction; edge classification | -- |
DiP-GNN: Discriminative Pre-Training of Graph Neural Networks | NeurIPS Workshop (GLFrontiers)'23 | Masked link prediction | Node classification; link prediction | -- |
Maximizing Mutual Information Across Feature and Topology Views for Representing Graphs (MVMI-FT) | TKDE'23 | Link prediction | Node classification; node clustering | link |
Towards Effective and Robust Graph Contrastive Learning With Graph Autoencoding (AEGCL) | TKDE'23 | Link prediction | Node classification; node clustering; link prediction | link |
ULTRA-DP: Unifying Graph Pre-training with Multi-task Graph Dual Prompt | arXiv:2310 | Link prediction | Node classification; link prediction | link |
Incomplete Graph Learning via Attribute-Structure Decoupled Variational Auto-Encoder (ASD-VAE) | WSDM'24 | Edge feature prediction | Node classification; node attribute completion | link |
Generative and Contrastive Paradigms Are Complementary for Graph Self-Supervised Learning (GCMAE1) | ICDE'24 | Link prediction | Node classification; node clustering; graph classification; link prediction | link |
DiscoGNN: A Sample-Efficient Framework for Self-Supervised Graph Representation Learning | ICDE'24 | Replaced edge feature prediction | Graph classification; similarity search | link |
Decoupled Variational Graph Autoencoder for Link Prediction (D-VGAE) | WWW'24 | Link prediction | Node classification; node clustering; link prediction | link |
Masked Graph Autoencoder with Non-discrete Bandwidths (Bandana) | WWW'24 | Link denoising | Node classification; link prediction | link |
OpenGraph: Towards Open Graph Foundation Models | EMNLP Findings'24 | Masked link prediction | Node classification; link prediction | link |
HC-GAE: The Hierarchical Cluster-based Graph Auto-Encoder for Graph Representation Learning | NeurIPS'24 | Link prediction | Node classification; graph classification | -- |
Redundancy Is Not What You Need: An Embedding Fusion Graph Auto-Encoder for Self-Supervised Graph Representation Learning (EFGAE) | TNNLS'24 | Link prediction | Node classification | -- |
AnyGraph: Graph Foundation Model in the Wild | arXiv:2408 | Link prediction | Node classification; graph classification; link prediction | link |
Hierarchical Vector Quantized Graph Autoencoder with Annealing-Based Code Selection (HQA-GAE) | WWW'25 | Link prediction | Node classification; link prediction | link |
Context
- Context discrimination: to distinguish between contextual nodes and non-contextual nodes. LE stands for the Laplacian Eigenmaps objective
- Contextual subgraph discrimination: to distinguish between representations aggregated from different contextual subgraphs (possibly from different receptive fields). CE stands for cross-entropy
- Context feature prediction: similar to node feature prediction, but to reconstruct the features of k-hop neighbors instead of the node's own features
- Contextual property prediction: to predict the properties of contextual subgraphs (e.g. node / edge types contained, total node / edge counts, structural coefficient)
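
A minimal sketch of context discrimination with an InfoNCE-style loss is given below, assuming that the context of a node is its 1-hop neighborhood and that every other node acts as a negative. The function names, the temperature `tau`, and the toy embeddings are illustrative assumptions; individual papers in the table define contexts and negatives differently.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def context_infonce(emb, adj, tau=0.5):
    """Average InfoNCE loss: 1-hop neighbors are positives, all other nodes negatives."""
    n = len(emb)
    total, count = 0.0, 0
    for i in range(n):
        # Exponentiated similarity between node i and every other node.
        sims = {j: math.exp(cosine(emb[i], emb[j]) / tau) for j in range(n) if j != i}
        denom = sum(sims.values())
        for j in adj[i]:
            total += -math.log(sims[j] / denom)
            count += 1
    return total / count

# Two disconnected edges: 0-1 and 2-3.
adj = {0: [1], 1: [0], 2: [3], 3: [2]}

# Embeddings that agree with the graph: neighbors point in similar directions.
aligned = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]]
# Same vectors assigned to the wrong nodes: neighbors are now dissimilar.
shuffled = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.1], [0.1, 1.0]]

loss_aligned = context_infonce(aligned, adj)
loss_shuffled = context_infonce(shuffled, adj)
```

The loss is lower when neighboring nodes have similar embeddings, which is the signal this family of pretexts optimizes for.
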
Paper | Venue | Pretext | Downstream | Code |
---|---|---|---|---|
Inductive Representation Learning on Large Graphs (GraphSAGE) | NIPS'17 | Context discrimination (JS) | Node classification | link |
Strategies for Pre-training Graph Neural Networks (ContextPred) | ICLR'20 | Contextual subgraph discrimination (CE) | Graph classification; biological function prediction | link |
GraphZoom: A Multi-level Spectral Approach for Accurate and Scalable Graph Embedding | ICLR'20 | Contextual subgraph discrimination (CE) | Node classification; link prediction | link |
GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training | KDD'20 | Contextual subgraph discrimination (InfoNCE) | Node classification; graph classification; similarity search | link |
Graph Attention Auto-Encoders (GATE) | ICTAI'20 | Context discrimination (JS) | Node classification | link |
Sub-Graph Contrast for Scalable Self-Supervised Graph Representation Learning (Subg-Con) | ICDM'20 | Context discrimination (Triplet margin) | Node classification | link |
Self-Supervised Graph Transformer on Large-Scale Molecular Data (GROVER) | NeurIPS'20 | Contextual property prediction | Graph classification; graph regression | link |
Pre-training on Large-Scale Heterogeneous Graph (PT-HGNN) | KDD'21 | Context discrimination (InfoNCE) | (Heterogeneous) node classification; link prediction | link |
Transfer Learning of Graph Neural Networks with Ego-graph Information Maximization (EGI) | NeurIPS'21 | Context discrimination (JS) | Role identification; relation prediction | link |
Contrastive Laplacian Eigenmaps (COLES) | NeurIPS'21 | Context discrimination (LE) | Node classification; node clustering | link |
Graph-MLP: Node Classification without Message Passing in Graph | arXiv:2106 | Context discrimination (InfoNCE) | Node classification | link |
Augmentation-Free Self-Supervised Learning on Graphs (AFGRL) | AAAI'22 | Context discrimination (Bootstrapping) | Node classification; node clustering; similarity search | link |
Simple Unsupervised Graph Representation Learning (SUGRL) | AAAI'22 | Context discrimination (Triplet margin) | Node classification | link |
SAIL: Self-Augmented Graph Contrastive Learning | AAAI'22 | Neighbor feature prediction (BPR) | Node classification; node clustering; link prediction | -- |
Robust Self-Supervised Structural Graph Neural Network for Social Network Prediction | WWW'22 | Contextual subgraph discrimination (InfoNCE) | Node classification; graph classification; similarity search | -- |
Node Representation Learning in Graph via Node-to-Neighbourhood Mutual Information Maximization (N2N) | CVPR'22 | Context discrimination (InfoNCE) | Node classification | link |
RoSA: A Robust Self-Aligned Framework for Node-Node Graph Contrastive Learning | IJCAI'22 | Contextual subgraph discrimination (InfoNCE) | Node classification | link |
Graph Auto-Encoder Via Neighborhood Wasserstein Reconstruction (NWR-GAE) | ICLR'22 | Context feature prediction | Node classification; structural role identification | link |
Towards Self-supervised Learning on Graphs with Heterophily (HGRL) | CIKM'22 | Context discrimination (InfoNCE) | Node classification; node clustering | link |
Unifying Graph Contrastive Learning with Flexible Contextual Scopes (UGCL) | ICDM'22 | Context discrimination (InfoNCE) | Node classification | link |
Generalized Laplacian Eigenmaps (GLEN) | NeurIPS'22 | Context discrimination (LE) | Node classification; node clustering | link |
Decoupled Self-supervised Learning for Graphs (DSSL) | NeurIPS'22 | Context discrimination (Other) | Node classification | link |
Towards Graph Self-Supervised Learning with Contrastive Adjusted Zooming (G-Zoom) | TNNLS'22 | Context discrimination (JS) | Node classification | -- |
Link Prediction with Contextualized Self-Supervision (CSSL2) | TKDE'22 | Context discrimination (CE) | Link prediction | link |
Graph Soft-Contrastive Learning via Neighborhood Ranking (GSCL) | arXiv:2209 | Context discrimination (InfoNCE) | Node classification; node clustering | -- |
Localized Graph Contrastive Learning (Local-GCL) | arXiv:2212 | Context discrimination (InfoNCE) | Node classification | link |
Deep Graph Structural Infomax (DGSI) | AAAI'23 | Context discrimination (JS) | Node classification | link |
Neighbor Contrastive Learning on Learnable Graph Augmentation (NCLA) | AAAI'23 | Context discrimination (InfoNCE) | Node classification | link |
Eliciting Structural and Semantic Global Knowledge in Unsupervised Graph Contrastive Learning (S3-CL) | AAAI'23 | Contextual subgraph discrimination (InfoNCE) | Node classification; node clustering | link |
GraphPrompt: Unifying Pre-Training and Downstream Tasks for Graph Neural Networks; Generalized Graph Prompt: Toward a Unification of Pre-Training and Downstream Tasks on Graphs (GraphPrompt+) | WWW'23; TKDE'24 | Context discrimination (InfoNCE), etc | Node classification; graph classification | link |
Contrastive Learning Meets Homophily: Two Birds with One Stone (NeCo) | ICML'23 | Context discrimination (InfoNCE) | Node classification | -- |
Contrastive Cross-scale Graph Knowledge Synergy (CGKS) | KDD'23 | Context discrimination (LE); contextual subgraph discrimination (InfoNCE) | Node classification; graph classification | -- |
Pretraining Language Models with Text-Attributed Heterogeneous Graphs (THLM) | EMNLP Findings'23 | Context discrimination (InfoNCE) | (Heterogeneous) node classification; link prediction | link |
GRENADE: Graph-Centric Language Model for Self-Supervised Representation Learning on Text-Attributed Graphs | EMNLP Findings'23 | Context discrimination (InfoNCE, KL) | Node classification; node clustering; link prediction | link |
Simple and Asymmetric Graph Contrastive Learning without Augmentations (GraphACL) | NeurIPS'23 | Context discrimination (InfoNCE) | Node classification | link |
Better with Less: A Data-Active Perspective on Pre-Training Graph Neural Networks (APT) | NeurIPS'23 | Context discrimination (InfoNCE) | Node classification; graph classification | link |
Dual Contrastive Learning Network for Graph Clustering (DCLN) | TNNLS'23 | Context discrimination (InfoNCE) | Node classification; node clustering | link |
Hierarchical Topology Isomorphism Expertise Embedded Graph Contrastive Learning (HTML) | AAAI'24 | Contextual property prediction (structural coefficient) | Graph classification | link |
HGPROMPT: Bridging Homogeneous and Heterogeneous Graphs for Few-shot Prompt Learning | AAAI'24 | Contextual subgraph discrimination (InfoNCE) | (Heterogeneous) node classification; graph classification | link |
Graph Contrastive Learning Reimagined: Exploring Universality (ROSEN) | WWW'24 | Context discrimination (InfoNCE) | Node classification; node clustering | -- |
High-Frequency-aware Hierarchical Contrastive Selective Coding for Representation Learning on Text-attributed Graphs (HASH-CODE) | WWW'24 | Context discrimination (SP); contextual subgraph discrimination (SP) | Node classification; link prediction | -- |
HeterGCL: Graph Contrastive Learning Framework on Heterophilic Graph | IJCAI'24 | Context discrimination (InfoNCE) | Node classification; node clustering | link |
S3GCL: Spectral, Swift, Spatial Graph Contrastive Learning | ICML'24 | Context discrimination (InfoNCE) | Node classification | link |
Efficient Contrastive Learning for Fast and Accurate Inference on Graphs (GraphECL) | ICML'24 | Context discrimination (InfoNCE) | Node classification | link |
Self-Pro: A Self-Prompt and Tuning Framework for Graph Neural Networks | ECML-PKDD'24 | Context discrimination (InfoNCE) | Node classification; link prediction | link |
Smoothed Graph Contrastive Learning via Seamless Proximity Integration (SGCL4) | LoG'24 | Context discrimination (cosine similarity) | Node classification; graph classification | link |
FUG: Feature-Universal Graph Contrastive Pre-training for Graphs with Diverse Node Features | NeurIPS'24 | Context discrimination (MSE) | Node classification | link |
TAGA: Text-Attributed Graph Self-Supervised Learning by Synergizing Graph and Text Mutual Transformations | arXiv:2405 | Contextual subgraph discrimination (cosine similarity) | Node classification | -- |
Single-View Graph Contrastive Learning with Soft Neighborhood Awareness (SIGNA) | AAAI'25 | Context discrimination (JS) | Node classification; node clustering | link |
Balancing Graph Embedding Smoothness in Self-supervised Learning via Information-Theoretic Decomposition (BSG) | WWW'25 | Context discrimination (MSE, cosine similarity) | Node classification; link prediction | link (private) |
Long-range similarities
- Similarity prediction: to predict a similarity matrix between nodes. The pairwise similarity can be defined by shortest path distance, PageRank similarity, Katz index, Jaccard coefficient, $\ell_2$ distance & cosine similarity between output representations / input-output, etc
- Similarity-based discrimination: instance discrimination that is node similarity-aware
- Similarity graph alignment: to construct an additional similarity graph based on pairwise similarities of node features or graph topology, and minimize the distance of representation distributions between them (the original and similarity graph, or two different similarity graphs)
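
To make the similarity targets concrete, the sketch below computes two of the pairwise similarities named above, shortest path distance (via breadth-first search) and the Jaccard coefficient of neighbor sets, on a small path graph. The helper names and the mean-squared-error objective are assumptions for illustration; a model would be trained so that its predicted similarity matrix matches such targets.

```python
from collections import deque

def shortest_path_lengths(adj, source):
    """Breadth-first search: hop distance from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def jaccard(adj, u, v):
    """Jaccard coefficient of the neighbor sets of u and v."""
    nu, nv = set(adj[u]), set(adj[v])
    union = nu | nv
    return len(nu & nv) / len(union) if union else 0.0

def similarity_mse(pred, target):
    """Mean squared error between a predicted and a target similarity matrix."""
    n = len(target)
    return sum((pred[i][j] - target[i][j]) ** 2
               for i in range(n) for j in range(n)) / (n * n)

# Path graph 0 - 1 - 2 - 3 (connected, so every pairwise distance is defined).
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
nodes = sorted(adj)

spd = [[shortest_path_lengths(adj, u)[v] for v in nodes] for u in nodes]
jac = [[jaccard(adj, u, v) for v in nodes] for u in nodes]
```

Here `spd[0]` is `[0, 1, 2, 3]`, and `jaccard(adj, 0, 2)` is `0.5` because nodes 0 and 2 share neighbor 1 out of the union `{1, 3}`.
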
Paper | Venue | Pretext | Downstream | Code |
---|---|---|---|---|
Adaptive Graph Encoder for Attributed Graph Embedding (AGE) | KDD'20 | Similarity prediction (cosine similarity) | Node clustering; link prediction | link |
AM-GCN: Adaptive Multi-channel Graph Convolutional Networks | KDD'20 | Similarity graph alignment | Node classification | link |
Graph-Bert: Only Attention is Needed for Learning Graph Representations | arXiv:2001 | Similarity prediction (PageRank, etc.) | Node classification; node clustering | link |
Self-supervised Learning on Graphs: Deep Insights and New Direction (PairwiseDistance, PairwiseAttrSim) | arXiv:2006 | Similarity prediction (shortest path distance; cosine similarity) | Node classification | link |
SAIL: Self-Augmented Graph Contrastive Learning | AAAI'22 | Similarity prediction (cosine similarity) | Node classification; node clustering; link prediction | -- |
Co-Modality Graph Contrastive Learning for Imbalanced Node Classification (CM-GCL) | NeurIPS'22 | Similarity-based discrimination (cosine similarity) | Node classification (imbalanced) | link |
Self-Supervised Graph Representation Learning via Global Context Prediction; A New Self-supervised Task on Graphs: Geodesic Distance Prediction (S2GRL) | Information Sciences'22 | Similarity prediction (shortest path distance) | Node classification; node clustering; link prediction | -- |
Dual Low-Rank Graph Autoencoder for Semantic and Topological Networks (DLR-GAE) | AAAI'23 | Similarity graph alignment | Node classification | link |
Attribute and Structure Preserving Graph Contrastive Learning (ASP) | AAAI'23 | Similarity graph alignment | Node classification | link |
Beyond Smoothing: Unsupervised Graph Representation Learning with Edge Heterophily Discriminating (GREET) | AAAI'23 | Similarity-based discrimination (cosine similarity) | Node classification | link |
Deep Manifold Graph Auto-Encoder for Attributed Graph Embedding (DMGAE, DMVGAE) | ICASSP'23 | Similarity prediction | Node clustering; link prediction | -- |
Self-Supervised Teaching and Learning of Representations on Graphs (GraphTL) | WWW'23 | Similarity-based discrimination (cosine similarity) | Node classification | -- |
Graph Self-supervised Learning via Proximity Divergence Minimization (PDM) | UAI'23 | Similarity prediction (heat kernel, personalized PageRank, SimRank) | Node classification | link |
Maximizing Mutual Information Across Feature and Topology Views for Representing Graphs (MVMI-FT) | TKDE'23 | Similarity graph alignment | Node classification; node clustering | link |
Towards Effective and Robust Graph Contrastive Learning With Graph Autoencoding (AEGCL) | TKDE'23 | Similarity graph alignment | Node classification; node clustering; link prediction | link |
ULTRA-DP: Unifying Graph Pre-training with Multi-task Graph Dual Prompt | arXiv:2310 | Similarity prediction (cosine similarity) | Node classification; link prediction | link |
Deep Contrastive Graph Learning with Clustering-Oriented Guidance (DCGL) | AAAI'24 | Similarity graph alignment | Node clustering | link |
E2GCL: Efficient and Expressive Contrastive Learning on Graph Neural Networks | ICDE'24 | Similarity-based discrimination | Node classification; graph classification; link prediction | -- |
Improving Graph Contrastive Learning via Adaptive Positive Sampling (HEATS) | CVPR'24 | Similarity-based discrimination (block diagonal affinity) | Node classification; image classification | -- |
ConGraT: Self-Supervised Contrastive Pretraining for Joint Graph and Text Embeddings | ACL Workshop (TextGraphs)'24 | Similarity-based discrimination (common neighbors, SimRank) | Node classification; link prediction | link |
Enhancing Graph Contrastive Learning with Node Similarity (SimEnhancedGCL) | KDD'24 | Similarity-based discrimination (cosine similarity, personalized PageRank) | Node classification | link |
Select Your Own Counterparts: Self-Supervised Graph Contrastive Learning With Positive Sampling (GPS) | TNNLS'24 | Similarity-based discrimination (cosine similarity, personalized PageRank, etc) | Node classification | -- |
UniGraph2: Learning a Unified Embedding Space to Bind Multimodal Graphs | WWW'25 | Similarity prediction (shortest path distance) | Node classification; link prediction; edge classification | link (unavailable) |
Motifs
- Motif prediction: to assign each node (or supernode in the fragment graph) a motif pseudo-label given by unsupervised motif discovery algorithms (e.g. RDKit) and learn to predict them. It is "autoregressive" if the predicted supernodes are generated one-by-one
- Motif-based masked feature prediction: similar to masked feature prediction, but the features are masked in motifs
- Motif-based discrimination: to perform contrast between the original graph view and the fragment graph view
Paper | Venue | Pretext | Downstream | Code |
---|---|---|---|---|
Self-Supervised Graph Transformer on Large-Scale Molecular Data (GROVER) | NeurIPS'20 | Motif prediction | Graph classification; graph regression | link |
Motif-based Graph Self-Supervised Learning for Molecular Property Prediction (MGSSL) | NeurIPS'21 | Motif prediction (autoregressive) | Graph classification | link |
Fragment-based Pretraining and Finetuning on Molecular Graphs (GraphFP) | NeurIPS'23 | Motif prediction; motif-based discrimination (InfoNCE) | Graph classification; graph regression | link |
Motif-aware Riemannian Graph Neural Network with Generative-Contrastive Learning (MotifRGC) | AAAI'24 | Motif-based discrimination (InfoNCE) | Node classification; link prediction | link |
Empowering Dual-Level Graph Self-Supervised Pretraining with Motif Discovery (DGPM) | AAAI'24 | Motif prediction | Graph classification | link |
Graph Contrastive Learning with Cohesive Subgraph Awareness (CTAug) | WWW'24 | Motif-based discrimination (InfoNCE) | Graph classification | link |
Motif-aware Attribute Masking for Molecular Graph Pre-training (MoAMa) | LoG'24 | Motif-based masked feature prediction | Graph classification | link |
Motif-Driven Contrastive Learning of Graph Representations (MICRO-Graph) | TKDE'24 | Motif-based discrimination (InfoNCE) | Graph classification | link |
Fine-grained Semantics Enhanced Contrastive Learning for Graphs (FSGCL) | TKDE'24 | Motif-based discrimination (Bootstrapping) | Node classification | -- |
Clusters
- Synthetic graph discrimination: binary classification between two synthetic graphs with different synthesizers (Erdős-Rényi generator / SBM generator)
- Node clustering: to assign each node a cluster centroid (prototype) and either i) minimize the distance between nodes and their corresponding centroids in the latent space; or ii) minimize the distance between the learned centroids and the ground-truth centroids given by unsupervised feature clustering algorithms (e.g. K-means, DeepCluster)
- Graph partitioning: to assign each node a cluster centroid (prototype) and either i) optimize the quality of the learned partitions as evaluated by some metric, e.g. maximizing modularity or minimizing the normalized edge weights of a graph cut (spectral clustering); or ii) predict the cluster membership of each node given by unsupervised graph partitioning algorithms (structure-based, e.g. METIS, Louvain)
- Cluster/partition-based instance discrimination: instance discrimination that is aware of graph clustering/partitioning memberships
- Cluster/partition-conditioned link prediction: to maximize the log-likelihood of existing links, conditioned on the graph cluster/partition distributions
- Partition-conditioned masked link prediction: similar to masked link prediction, but the links are masked in clusters
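
As an example of a partition quality metric mentioned under graph partitioning, the sketch below computes Newman modularity for a hard partition of an undirected, unweighted graph. The function name and the toy graph are assumptions for illustration; methods in the table typically use a differentiable relaxation over soft assignments rather than this discrete form.

```python
def modularity(edges, membership):
    """Modularity Q = sum_c [ e_c / m - (d_c / (2m))^2 ] for an undirected graph.

    e_c is the number of edges inside community c, d_c is the total degree of
    its nodes, and m is the total number of edges.
    """
    m = len(edges)
    intra = {}
    degree_sum = {}
    for u, v in edges:
        cu, cv = membership[u], membership[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            intra[cu] = intra.get(cu, 0) + 1
    return sum(intra.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())

# Two triangles (0,1,2) and (3,4,5) joined by the bridge 2-3.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

good = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}    # one community per triangle
bad = {0: 0, 1: 1, 2: 0, 3: 1, 4: 0, 5: 1}     # split by node parity
single = {node: 0 for node in range(6)}         # everything in one community

q_good = modularity(edges, good)
q_bad = modularity(edges, bad)
q_single = modularity(edges, single)
```

The triangle-aligned partition scores 5/14, the trivial single-community partition scores 0, and the parity split is negative, so maximizing modularity favors the partition that follows the graph's structure.
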
Paper | Venue | Pretext | Downstream | Code |
---|---|---|---|---|
SGR: Self-Supervised Spectral Graph Representation Learning | KDD Workshop (DLD)'18 | Synthetic graph discrimination | Graph classification | -- |
Unsupervised Pre-training of Graph Convolutional Networks (ClusterDetect) | ICLR Workshop (RLGM)'19 | Graph partitioning | Node classification | -- |
Multi-Stage Self-Supervised Learning for Graph Convolutional Networks on Graphs with Few Labeled Nodes (M3S) | AAAI'20 | Node clustering | Node classification | link |
Collaborative Graph Convolutional Networks: Unsupervised Learning Meets Semi-Supervised Learning (CGCN) | AAAI'20 | Partition-conditioned link prediction | Node classification; node clustering | link (unavailable) |
When Does Self-Supervision Help Graph Convolutional Networks? (NodeCluster, GraphPar) | ICML'20 | Node clustering; graph partitioning | Node classification | link |
CommDGI: Community Detection Oriented Deep Graph Infomax | CIKM'20 | Cluster-based discrimination (JS); graph partitioning | Node clustering | link |
Dirichlet Graph Variational Autoencoder (DGVAE) | NeurIPS'20 | Partition-conditioned link prediction | Graph generation; node clustering | link |
Self-supervised Learning on Graphs: Deep Insights and New Direction (Distance2Clusters) | arXiv:2006 | Graph partitioning | Node classification | link |
Mask-GVAE: Blind Denoising Graphs via Partition | WWW'21 | Graph partitioning; partition-conditioned masked link prediction | Node clustering; graph denoising | link |
Self-supervised Graph-level Representation Learning with Local and Global Structure (GraphLoG) | ICML'21 | Node clustering | Graph classification; biological function prediction | link |
Graph Communal Contrastive Learning (gCooL) | WWW'22 | Partition-based discrimination (InfoNCE) | Node classification; node clustering | link |
Self-supervised Heterogeneous Graph Pre-training Based on Structural Clustering (SHGP) | NeurIPS'22 | Graph partitioning | (Heterogeneous) node classification; node clustering | link |
Eliciting Structural and Semantic Global Knowledge in Unsupervised Graph Contrastive Learning (S3-CL) | AAAI'23 | Cluster-based discrimination (InfoNCE) | Node classification; node clustering | link |
CSGCL: Community-Strength-Enhanced Graph Contrastive Learning | IJCAI'23 | Partition-based discrimination (InfoNCE) | Node classification; node clustering; link prediction | link |
HomoGCL: Rethinking Homophily in Graph Contrastive Learning | KDD'23 | Node clustering; cluster-based discrimination (InfoNCE) | Node classification; node clustering | link |
CARL-G: Clustering-Accelerated Representation Learning on Graphs | KDD'23 | Node clustering | Node classification; node clustering; similarity search | link |
Towards Alignment-Uniformity Aware Representation in Graph Contrastive Learning (AUAR) | WSDM'24 | Node clustering | Node classification; node clustering | -- |
Deep Contrastive Graph Learning with Clustering-Oriented Guidance (DCGL) | AAAI'24 | Cluster-based discrimination (InfoNCE) | Node clustering | link |
StructComp: Substituting propagation with Structural Compression in Training Graph Contrastive Learning | ICLR'24 | Partition-based discrimination (JS, InfoNCE, etc.) | Node classification | link |
MARIO: Model Agnostic Recipe for Improving OOD Generalization of Graph Contrastive Learning | WWW'24 | Cluster-based discrimination | Node classification; graph classification | link |
Graph Contrastive Learning with Kernel Dependence Maximization for Social Recommendation (CL-KDM) | WWW'24 | Partition-based discrimination (BPR) | Recommendation | -- |
HeterGCL: Graph Contrastive Learning Framework on Heterophilic Graph | IJCAI'24 | Cluster-based discrimination (MSE) | Node classification; node clustering | link |
Community-Invariant Graph Contrastive Learning (CI-GCL) | ICML'24 | Partition-based discrimination (InfoNCE) | Graph classification; graph regression | link |
From Coarse to Fine: Enable Comprehensive Graph Self-supervised Learning with Multi-granular Semantic Ensemble (MGSE) | ICML'24 | Node clustering | Graph classification | link |
Revisiting Self-Supervised Heterogeneous Graph Learning from Spectral Clustering Perspective (SCHOOL) | NeurIPS'24 | Partition-based discrimination (MSE) | (Heterogeneous) node classification; node clustering | link |
Motif-Driven Contrastive Learning of Graph Representations (MICRO-Graph) | TKDE'24 | Graph partitioning | Graph classification | link |
Global structure
- Graph instance discrimination: to discriminate between global representations of different graph views (generally for small-scale graphs)
- Graph dimension discrimination: dimension discrimination of different graph representations
- Node-graph discrimination: instance discrimination between the representation of each node and a global representation vector, usually aggregated from the whole graph by a readout function
- Group discrimination: a simplified node-graph discrimination that performs binary classification of whether a node belongs to the original or the perturbed graph
- Graph similarity prediction: to predict various kinds of similarity functions between pairs of graphs, e.g. graph kernels (graphlet kernel, random walk kernel, graph edit distance kernel, etc)
- Half-graph matching: to divide each graph into two halves and predict if two halves are from the same original graph
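
The following sketch illustrates node-graph discrimination with a binary cross-entropy (JS-style) objective: node embeddings from the original graph are scored against a mean-pooled graph summary as positives, and embeddings from a corrupted graph as negatives. The mean readout, the plain dot-product discriminator, and the hand-written embeddings are simplifying assumptions; published methods generally use learned encoders and discriminators.

```python
import math

def mean_readout(emb):
    """Graph summary vector: the mean of all node embeddings."""
    n = len(emb)
    return [sum(col) / n for col in zip(*emb)]

def log_sigmoid(x):
    """log(sigmoid(x)) for moderate x."""
    return -math.log(1.0 + math.exp(-x))

def node_graph_loss(pos_emb, neg_emb):
    """Binary cross-entropy between node-summary scores of real and corrupted nodes."""
    summary = mean_readout(pos_emb)

    def score(h):
        # Assumption: dot product stands in for a learned discriminator.
        return sum(a * b for a, b in zip(h, summary))

    pos_terms = [-log_sigmoid(score(h)) for h in pos_emb]    # label 1
    neg_terms = [-log_sigmoid(-score(h)) for h in neg_emb]   # label 0
    return (sum(pos_terms) + sum(neg_terms)) / (len(pos_emb) + len(neg_emb))

# Stand-ins for encoder outputs on the original graph.
pos_emb = [[1.0, 0.5], [0.8, 0.7], [1.2, 0.4]]
# Corrupted-graph embeddings that are easy to tell apart from the summary.
easy_neg = [[-1.0, -0.5], [-0.8, -0.7], [-1.2, -0.4]]
# "Negatives" identical to the positives: the discriminator cannot separate them.
hard_neg = [list(h) for h in pos_emb]

loss_easy = node_graph_loss(pos_emb, easy_neg)
loss_hard = node_graph_loss(pos_emb, hard_neg)
```

The loss is small when corrupted-graph embeddings disagree with the summary and large when they are indistinguishable from real ones. Group discrimination drops the summary vector and classifies each node embedding directly as original or perturbed.
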
Paper | Venue | Pretext | Downstream | Code |
---|---|---|---|---|
Pre-training Graph Neural Networks with Kernels (KernelPred) | arXiv:1811 | Graph similarity prediction | Graph classification | -- |
Deep Graph InfoMax (DGI) | ICLR'19 | Node-graph discrimination (JS) | Node classification | link |
InfoGraph: Unsupervised and Semi-supervised Graph-Level Representation Learning via Mutual Information Maximization | ICLR'20 | Node-graph discrimination (JS) | Graph classification | link |
Graph Contrastive Learning with Augmentations (GraphCL) | NeurIPS'20 | Graph instance discrimination (InfoNCE) | Graph classification | link |
Contrastive Multi-View Representation Learning on Graphs (MVGRL) | ICML'20 | Node-graph discrimination (JS) | Node classification; graph classification | link |
Contrastive Self-supervised Learning for Graph Classification (CSSL1) | AAAI'21 | Graph instance discrimination (InfoNCE) | Graph classification | -- |
SUGAR: Subgraph Neural Network with Reinforcement Pooling and Self-Supervised Mutual Information Mechanism | WWW'21 | Node-graph discrimination (JS) | Graph classification | link |
Pairwise Half-graph Discrimination: A Simple Graph-level Self-supervised Strategy for Pre-training Graph Neural Networks (PHD); An Effective Self-Supervised Framework for Learning Expressive Molecular Global Representations to Drug Discovery (MPG) | IJCAI'21; Briefings in Bioinformatics'21 | Half-graph matching | Graph classification | link |
Graph Contrastive Learning Automated (JOAO) | ICML'21 | Graph instance discrimination (InfoNCE) | Graph classification | link |
Adversarial Graph Augmentation to Improve Graph Contrastive Learning (AD-GCL) | NeurIPS'21 | Graph instance discrimination (InfoNCE) | Graph classification | link |
InfoGCL: Information-Aware Graph Contrastive Learning | NeurIPS'21 | Graph instance discrimination (Bootstrapping); node-graph discrimination (Bootstrapping) | Node classification; graph classification | -- |
Graph Adversarial Self-Supervised Learning (GASSL) | NeurIPS'21 | Graph instance discrimination (Bootstrapping) | Graph classification | link (unavailable) |
Disentangled Contrastive Learning on Graphs (DGCL) | NeurIPS'21 | Graph instance discrimination (Other) | Graph classification | link |
Bringing Your Own View: Graph Contrastive Learning without Prefabricated Data Augmentations (GraphCL-LP) | WSDM'22 | Graph instance discrimination (InfoNCE) | Graph classification | link |
Self-Supervised Graph Neural Networks via Diverse and Interactive Message Passing (DIMP) | AAAI'22 | Node-graph discrimination (JS) | Node classification; node clustering; graph classification | link |
Unsupervised Adversarially Robust Representation Learning on Graphs (GRV) | AAAI'22 | Node-graph discrimination (JS) | Node classification; node clustering; link prediction | link |
AutoGCL: Automated Graph Contrastive Learning via Learnable View Generators | AAAI'22 | Graph instance discrimination (InfoNCE) | Graph classification | link |
Group Contrastive Self-Supervised Learning on Graphs (GroupCL; GroupIG) | TPAMI'22 | Graph instance discrimination (JS; contrastive log-ratio upper bound (CLUB)) | Graph classification | -- |
Towards Graph Self-Supervised Learning with Contrastive Adjusted Zooming (G-Zoom) | TNNLS'22 | Node-graph discrimination (JS) | Node classification | -- |
SimGRACE: A Simple Framework for Graph Contrastive Learning without Data Augmentation | WWW'22 | Graph instance discrimination (InfoNCE, Bootstrapping) | Graph classification | link |
Let Invariant Rationale Discovery Inspire Graph Contrastive Learning (RGCL1) | ICML'22 | Graph instance discrimination (InfoNCE) | Graph classification | link |
M-Mix: Generating Hard Negatives via Multi-sample Mixing for Contrastive Learning | KDD'22 | Graph instance discrimination (InfoNCE) | Node classification; node clustering; graph classification; graph edit distance prediction | link |
AdaGCL: Adaptive Subgraph Contrastive Learning to Generalize Large-scale Graph Training | CIKM'22 | Node-graph discrimination (JS) | Node classification | link |
Rethinking and Scaling Up Graph Contrastive Learning: An Extremely Efficient Approach with Group Discrimination (GGD) | NeurIPS'22 | Group discrimination | Node classification | link |
Graph Self-supervised Learning with Accurate Discrepancy Learning (D-SLA) | NeurIPS'22 | Group discrimination; graph similarity prediction | Graph classification; link prediction | link |
Deep Graph Structural Infomax (DGSI) | AAAI'23 | Node-graph discrimination (JS) | Node classification | link |
Spectral Augmentation for Self-Supervised Learning on Graphs (SPAN) | ICLR'23 | Node-graph discrimination (InfoNCE) | Node classification; graph classification; graph regression | link |
Mole-BERT: Rethinking Pre-training Graph Neural Networks for Molecules | ICLR'23 | Graph instance discrimination (InfoNCE; Triplet margin) | Graph classification; graph regression | link |
Spectral Augmentations for Graph Contrastive Learning (SGCL1) | AISTATS'23 | Graph instance discrimination (InfoNCE) | Node classification; graph classification; similarity search | -- |
Generating Counterfactual Hard Negative Samples for Graph Contrastive Learning (CGC) | WWW'23 | Graph instance discrimination (InfoNCE) | Graph classification | link |
Multi-Scale Subgraph Contrastive Learning (MSSGCL) | IJCAI'23 | Node-graph discrimination (InfoNCE); graph instance discrimination (InfoNCE) | Graph classification | link |
Boosting Graph Contrastive Learning via Graph Contrastive Saliency (GCS) | ICML'23 | Graph instance discrimination (InfoNCE) | Graph classification | link |
SEGA: Structural Entropy Guided Anchor View for Graph Contrastive Learning | ICML'23 | Graph instance discrimination (InfoNCE) | Graph classification | link |
Randomized Schur Complement Views for Graph Contrastive Learning (rLap) | ICML'23 | Node-graph discrimination (InfoNCE); graph instance discrimination (InfoNCE) | Graph classification | link |
Graph Self-Contrast Representation Learning (GraphSC) | ICDM'23 | Graph instance discrimination (Triplet margin); graph dimension discrimination | Graph classification | -- |
Graph Contrastive Learning with Stable and Scalable Spectral Encoding (Sp2GCL) | NeurIPS'23 | Graph instance discrimination (InfoNCE) | Node classification; graph classification; graph regression | link |
Certifiably Robust Graph Contrastive Learning (RES) | NeurIPS'23 | Graph instance discrimination (InfoNCE) | Graph classification | link |
Maximizing Mutual Information Across Feature and Topology Views for Representing Graphs (MVMI-FT) | TKDE'23 | Node-graph discrimination (JS) | Node classification; node clustering | link |
Multi-Scale Self-Supervised Graph Contrastive Learning With Injective Node Augmentation (MS-CIA) | TKDE'23 | Node-graph discrimination (JS) | Node classification | -- |
Hierarchically Contrastive Hard Sample Mining for Graph Self-Supervised Pretraining (HCHSM) | TNNLS'23 | Node-graph discrimination (JS) | Node classification; node clustering | link |
Dual Contrastive Learning Network for Graph Clustering (DCLN) | TNNLS'23 | Node-graph discrimination (JS) | Node classification; node clustering | link |
HeGCL: Advance Self-Supervised Learning in Heterogeneous Graph-Level Representation | TNNLS'23 | Node-graph discrimination (JS) | (Heterogeneous) node classification; graph classification | link |
Affinity Uncertainty-Based Hard Negative Mining in Graph Contrastive Learning (AUGCL) | TNNLS'23 | Graph instance discrimination (InfoNCE) | Graph classification | link |
Hierarchical Topology Isomorphism Expertise Embedded Graph Contrastive Learning (HTML) | AAAI'24 | Graph instance discrimination (InfoNCE); graph similarity prediction (Jaccard coef-based isomorphic similarity) | Graph classification | link |
TopoGCL: Topological Graph Contrastive Learning | AAAI'24 | Graph instance discrimination (InfoNCE) | Graph classification | link |
DiscoGNN: A Sample-Efficient Framework for Self-Supervised Graph Representation Learning | ICDE'24 | Graph instance discrimination (InfoNCE) | Graph classification; similarity search | link |
Masked Graph Modeling with Multi-View Contrast (GCMAE2) | ICDE'24 | Graph instance discrimination (InfoNCE) | Node classification; graph classification; link prediction | link |
SGCL: Semantic-aware Graph Contrastive Learning with Lipschitz Graph Augmentation (SGCL3) | ICDE'24 | Graph instance discrimination (InfoNCE) | Graph classification | -- |
Graph Contrastive Learning with Reinforcement Augmentation (GA2C) | IJCAI'24 | Graph instance discrimination (InfoNCE) | Graph classification | -- |
Disentangled Graph Self-supervised Learning for Out-of-Distribution Generalization (OOD-GCL) | ICML'24 | Graph instance discrimination (InfoNCE) | Graph classification | -- |
Uncovering Capabilities of Model Pruning in Graph Contrastive Learning (LAMP1) | MM'24 | Graph instance discrimination (InfoNCE) | Graph classification | -- |
A Sample-driven Selection Framework: Towards Graph Contrastive Networks with Reinforcement Learning (GraphSaSe) | MM'24 | Graph instance discrimination (InfoNCE) | Graph classification | link (unavailable) |
Graph Contrastive Learning with Personalized Augmentation (GPA) | TKDE'24 | Graph instance discrimination (InfoNCE) | Graph classification | link |
Graph Contrastive Learning with Min-Max Mutual Information (GCLMI) | Information Sciences'24 | Graph instance discrimination (InfoNCE) | Graph classification | link |
Towards Graph Foundation Models: Learning Generalities Across Graphs via Task-Trees (GIT) | arXiv:2412 | Graph instance discrimination (Bootstrapping) | Node classification; graph classification; link prediction; edge classification | link (unavailable) |
SAMGPT: Text-free Graph Foundation Model for Multi-domain Pre-training and Cross-domain Adaptation | WWW'25 | Graph instance discrimination (InfoNCE) | Node classification; graph classification | link |
Graph Self-Supervised Learning with Learnable Structural and Positional Encodings (StructPosGSSL) | WWW'25 | Graph instance discrimination (InfoNCE); graph dimension discrimination | Graph classification | link (private) |
Manifolds
- Cross-manifold discrimination: to perform instance discrimination between representations from different manifolds (e.g. Euclidean vs. hyperbolic)
- Hyperbolic masked prediction: to perform masked feature/link prediction in hyperbolic space
- Hyperbolic angle prediction: to pool representations into 2-dimensional angle vectors on a unit hyperbola, which serve as pseudo-labels for regression
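
As a rough illustration of cross-manifold discrimination, the sketch below contrasts a Euclidean view with a hyperbolic view of the same instances using InfoNCE. It is a minimal NumPy sketch, not the method of any paper listed here. It assumes a Poincaré ball of curvature -1, the origin as base point, and random vectors standing in for encoder outputs; the function names `exp_map_origin`, `log_map_origin` and `info_nce` are hypothetical.

```python
import numpy as np

def exp_map_origin(v, eps=1e-9):
    """Map tangent vectors at the origin onto the Poincare ball (curvature -1)."""
    norm = np.linalg.norm(v, axis=-1, keepdims=True)
    return np.tanh(norm) * v / (norm + eps)

def log_map_origin(x, eps=1e-9):
    """Inverse of exp_map_origin: pull ball points back to the tangent space."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    norm = np.clip(norm, eps, 1 - 1e-7)  # keep arctanh finite
    return np.arctanh(norm) * x / norm

def info_nce(a, b, temperature=0.5):
    """InfoNCE where row i of `a` and row i of `b` form the positive pair."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    logits = a @ b.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

# Assumption: random vectors stand in for the outputs of two encoders.
rng = np.random.default_rng(0)
z_euclidean = rng.normal(size=(8, 4))
tangent = z_euclidean + 0.1 * rng.normal(size=(8, 4))
z_hyperbolic = exp_map_origin(0.5 * tangent)  # the "hyperbolic view"

# Compare the two views in a common (tangent) space.
z_back = log_map_origin(z_hyperbolic)
loss_aligned = info_nce(z_euclidean, z_back)
loss_shuffled = info_nce(z_euclidean, z_back[::-1])  # mismatched pairs
```

Comparing in the tangent space at the origin is only one possible design choice; a similarity based on hyperbolic distance would be another.
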
Paper | Venue | Pretext | Downstream | Code |
---|---|---|---|---|
Enhancing Hyperbolic Graph Embeddings via Contrastive Learning (HGCL) | NeurIPS Workshop (SSL)'21 | Cross-manifold discrimination (InfoNCE) | Node classification | -- |
A Self-supervised Mixed-curvature Graph Neural Network (SelfMGNN) | AAAI'22 | Cross-manifold discrimination (InfoNCE) | Node classification | -- |
Dual Space Graph Contrastive Learning (DSGC) | WWW'22 | Cross-manifold discrimination (InfoNCE) | Graph classification | link |
Graph-level Representation Learning with Joint-Embedding Predictive Architectures (GraphJEPA) | arXiv:2309 | Hyperbolic angle prediction | Graph classification; graph regression | link |
Motif-aware Riemannian Graph Neural Network with Generative-Contrastive Learning (MotifRGC) | AAAI'24 | Cross-manifold discrimination (InfoNCE) | Node classification; link prediction | link |
Graph Representation Learning in Hyperbolic Space via Dual-Masked (HDM-GAE) | COLING'25 | Hyperbolic masked prediction | Node classification; link prediction | -- |
RiemannGFM: Learning a Graph Foundation Model from Structural Geometry | WWW'25 | Cross-manifold discrimination (InfoNCE) | Node classification; link prediction | link |
Multi-task pre-training
- Multi-task learning: to combine a set of different pre-training tasks with bespoke algorithms / architectures
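
The simplest way to combine pretext tasks is a weighted sum of their losses. The sketch below shows that baseline only; the papers in this section use more elaborate, bespoke strategies. The function name `combine_task_losses`, the softmax parameterization of the weights, and the loss values are illustrative assumptions.

```python
import numpy as np

def combine_task_losses(losses, weight_logits):
    """Weighted sum of pretext losses; weights are a softmax over the logits."""
    names = sorted(losses)
    logits = np.array([weight_logits[n] for n in names], dtype=float)
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    total = float(sum(w * losses[n] for w, n in zip(weights, names)))
    return total, dict(zip(names, weights))

# Hypothetical per-task loss values for one training step.
losses = {
    "feature_reconstruction": 0.8,
    "link_prediction": 0.5,
    "instance_discrimination": 1.2,
}
# Equal logits give uniform weights, i.e. a plain average of the losses.
weight_logits = {name: 0.0 for name in losses}
total, weights = combine_task_losses(losses, weight_logits)
```

In practice the logits would be learned or scheduled rather than fixed, which is where the bespoke algorithms come in.
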
Paper | Venue | Strategy | Downstream | Code |
---|---|---|---|---|
Adaptive Transfer Learning on Graph Neural Networks (AUX-TS) | KDD'21 | Multi-task learning | Node classification; link prediction | link |
Automated Self-Supervised Learning for Graphs (AutoSSL) | ICLR'22 | Multi-task learning | Node classification; node clustering | link |
Automated Graph Self-supervised Learning via Multi-teacher Knowledge Distillation (AGSSL) | arXiv:2210 | Multi-task learning | Node classification | -- |
Multi-task Self-supervised Graph Neural Networks Enable Stronger Task Generalization (ParetoGNN) | ICLR'23 | Multi-task learning | Node classification; node clustering; graph partition; link prediction | link |
ULTRA-DP: Unifying Graph Pre-training with Multi-task Graph Dual Prompt | arXiv:2310 | Multi-task learning | Node classification; link prediction | link |
Decoupling Weighing and Selecting for Integrating Multiple Graph Pre-training Tasks (WAS) | ICLR'24 | Multi-task learning | Node classification; graph classification | link |
MultiGPrompt for Multi-Task Pre-Training and Prompting on Graphs | WWW'24 | Multi-task learning | Node classification; graph classification | link |
Exploring Correlations of Self-Supervised Tasks for Graphs (GraphTCM) | ICML'24 | Multi-task learning | Node classification; link prediction | link |
UniGM: Unifying Multiple Pre-trained Graph Models via Adaptive Knowledge Aggregation | MM'24 | Multi-task learning | Graph classification | link |
Downstream tuning
- Fine-tuning: to jointly train the downstream branches together with the pre-trained model. Parameter-efficient fine-tuning (PEFT) updates only a small set of parameters, e.g. adapter layers or a pruned sub-network
- Prompting: to construct task-specific prompts as model input and adapt to the downstream task through them, typically while keeping the pre-trained model frozen
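
To make the difference concrete, the sketch below performs one prompt-tuning step: a single learnable vector is added to every node's features, and only that prompt and a small downstream head are updated while the pre-trained weights stay frozen. It is a toy NumPy sketch under stated assumptions: a linear "encoder", an MSE regression objective, and the hypothetical names `forward` and `prompt_tuning_step`. Full fine-tuning would update `W_frozen` as well.

```python
import numpy as np

def forward(X, prompt, W_frozen, head):
    """Linear 'encoder' on prompted features, followed by a linear head."""
    H = (X + prompt) @ W_frozen  # the prompt is broadcast to every node
    return H @ head

def prompt_tuning_step(X, y, prompt, W_frozen, head, lr=0.01):
    """One gradient step on the MSE loss; updates prompt and head only."""
    n = X.shape[0]
    H = (X + prompt) @ W_frozen
    grad_pred = 2.0 * (H @ head - y) / n       # dL/d(prediction), shape (n,)
    grad_head = H.T @ grad_pred                # dL/d(head), shape (h,)
    grad_H = np.outer(grad_pred, head)         # dL/dH, shape (n, h)
    grad_prompt = (grad_H @ W_frozen.T).sum(axis=0)  # summed over nodes
    return prompt - lr * grad_prompt, head - lr * grad_head

# Assumption: random data stands in for node features, targets and weights.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))          # 6 nodes, 4 features
y = rng.normal(size=6)               # toy regression targets
W_frozen = rng.normal(size=(4, 3))   # "pre-trained" weights, never updated
W_before = W_frozen.copy()
prompt = np.zeros(4)
head = rng.normal(size=3)

new_prompt, new_head = prompt_tuning_step(X, y, prompt, W_frozen, head)
```
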
❤️ Contributions to this list via issues and pull requests are always welcome! Feel free to start a discussion with me, or let me know if any papers or hyperlinks are missing or miscategorized.