Quantum‑Kernel‑Molecular‑Classification

Comparing classical graph kernels and quantum‑inspired embeddings for molecular classification — Team The Cat’s Cradle

QPoland Global Quantum Hackathon 2025 18–26 Oct 2025 Graph ML Quantum‑Inspired Ego‑Graph QW QURI Parts

Abstract

This report benchmarks classical graph kernels against quantum‑inspired embeddings for molecular classification on TU datasets loaded via PyTorch Geometric’s TUDataset, ensuring consistent preparation across AIDS, PROTEINS, NCI1, PTC‑MR, and MUTAG.

The proposed ego‑graph quantum‑walk (Ego‑QW) feature map caps qubit width by encoding local neighborhoods with compact, Trotterized circuits using QURI simulators; Feature Vector is extracted as expectation readout of list of simple observables.

A complementary full‑graph continuous‑time quantum walk (CTQW) embedding summarizes global transport using exact unitary evolution \( \psi(t)=e^{-iHt}\psi(0) \) with adjacency or Laplacian Hamiltonians, evaluated efficiently at multiple times via eigen‑decomposition.

Evaluation uses SVM with an RBF kernel and nested cross‑validation to report mean ± std of Accuracy and \(F1\), enabling a fair comparison between local quantum features, global walk summaries, and classical baselines (Shortest‑Path, WL‑subtree).

Innovation. Fix qubit‑count scaling while still capturing functional‑group topology via ego‑graphs, enabling efficient, accurate feature extraction and improved classification on molecular benchmarks.

1. Introduction

The objective is to design an original graph feature map \( \phi(G) \) that captures molecular structure for SVM classification with a chosen kernel, and to evaluate performance on TU datasets using a standard cross‑validation protocol; datasets are loaded via PyTorch Geometric’s TUDataset for consistent preparation and metadata.

Quantum‑walk encodings on full graphs scale qubits with nodes, motivating an ego‑graph approach that preserves discriminative local dynamics while capping circuit width for practical QURI‑based simulation and potential hardware alignment.

3. Methods

3.1 Ego‑QW (Proposed)

3.2 CTQW (Full Graph)

3.3 Classical Baselines

4. Datasets

AIDS, PROTEINS, NCI1, PTC‑MR, and MUTAG are accessed through PyTorch Geometric’s TUDataset, which automates download and processing; conversions to NetworkX with to_networkx provide integer node labels for pipelines that require label diagonals in \(H\) or observables.

Sample graphs from NCI1 dataset with node labels and edge structures
Representative NCI1 samples illustrating ring patterns and substituents that influence label‑distance statistics and WL color evolution.

5. Implementation Details

5.1 Data loading and label normalization (PyG → NetworkX)

TUDataset(root, name) provides consistent loaders; to_networkx converts to NetworkX; node labels are derived as argmax over one‑hot x when present (else zeros) and nodes are relabeled to 0..n‑1 for array‑aligned computation in QW routines.

5.2 Graph preprocessing for walks

For CTQW, adjacency \(A\), degrees \(D\), and \(L=D-A\) are built from NetworkX; for QURI ego‑QW, a radius‑\(r\) ego‑graph is extracted and truncated to max_nodes, preserving neighborhood topology while fitting qubit limits.

5.3 Ego sampling

A degree‑biased sampling strategy selects half the centers from top‑degree nodes and the rest from remaining nodes, improving coverage of hubs and typical vertices under a small ego budget for runtime stability.

5.4 QURI ego‑QW circuits

The register is initialized to \( |\!+\rangle^{\otimes n} \) via Hadamards; per Trotter slice, diagonal label phases use \(R_Z\) gates and edges use RXX/RYY composites implementing \(e^{-i\gamma \Delta t (X_iX_j+Y_iY_j)}\); statevectors are obtained via evaluate_state_to_vector and normalized before expectation calculations.

5.5 Trotter product formula (first‑order)

The first‑order product formula \( e^{-i(H_1+H_2)t} \approx \left(e^{-iH_1 \Delta t} e^{-iH_2 \Delta t}\right)^{t/\Delta t} \) is a standard tool for non‑commuting operator exponentials; higher‑order formulas reduce error with more gates.

5.6 State padding and slice‑safe expectations

Ego states are padded or truncated to max_nodes to match observable canvases; a slice‑safe \( \langle O \rangle=\psi^\dagger O \psi \) computes with \(k=\min(\lvert \psi\rvert, \lvert O\rvert)\), preventing dimension errors across graphs of varying ego sizes.

5.7 CTQW eigen‑decomposition

Multi‑time evolution uses the spectral decomposition \(H=V\Lambda V^\dagger\) with \( \psi(t)=V e^{-it\Lambda} V^\dagger \psi(0) \), avoiding repeated exponentials; CTQW mixing and spreading properties are analyzed via time‑averaged distributions rather than stationary convergence.

5.8 Baselines

Shortest‑Path builds labeled distance histograms; WL‑subtree performs 1‑WL color refinement and accumulates color counts for \(h\) iterations; both align with established graph kernel literature and strong TU baselines.

5.9 Classifier and nested CV

Features are standardized and fed to an SVM with RBF \(K(x,y)=\exp(-\gamma\lVert x-y\rVert^2)\); \(C,\gamma\) are selected with an inner grid in nested cross‑validation for unbiased outer‑fold metrics (Accuracy, \(F1\)).

5.10 Reproducibility and performance

Random seeds are fixed for Python/NumPy/PyTorch; max_graphs_per_dataset bounds wall‑time; small time grids and Trotter steps stabilize QURI ego‑QW runtime, while CTQW’s \(O(n^3)\) eigen‑decomposition amortizes over multiple times.

5.11 Failure modes and fixes

In QURI, always extract vectors via evaluate_state_to_vector(...).vector and normalize before expectations; ensure padding/slice‑safe expectations for ego states vs fixed‑size observable matrices to avoid matmul shape errors.

5.12 Project notebooks (src/)

The src directory contains the primary notebooks to explore data, reproduce models, and visualize circuits for this work.

Note: Ego‑graph decomposition and neighborhood‑wise quantum processing let fixed‑size circuits summarize large graphs by pooling local expectations into graph‑level features.

6. Results

Bar chart of mean accuracy across datasets for QURI, CTQW, SP, and WL
Mean Accuracy across datasets and methods (QURI, CTQW, SP, WL) with identical SVM‑RBF nested CV settings.
Bar chart of mean F1-score across datasets for QURI, CTQW, SP, and WL
Mean \( F1 \) across datasets and methods, illustrating complementary strengths of local quantum and global CTQW features.
MethodDatasetAccuracy (Mean ± Std)F1-score (Mean ± Std)
QURI Ego‑QWAIDS0.9650 ± 0.02550.9662 ± 0.0237
CTQWAIDS0.9950 ± 0.01000.9948 ± 0.0103
Shortest‑PathAIDS0.9600 ± 0.02000.9598 ± 0.0207
WL SubtreeAIDS0.9300 ± 0.02450.9193 ± 0.0309
QURI Ego‑QWMUTAG0.8772 ± 0.03360.8793 ± 0.0324
CTQWMUTAG0.8514 ± 0.02550.8527 ± 0.0269
Shortest‑PathMUTAG0.7872 ± 0.01690.7812 ± 0.0161
WL SubtreeMUTAG0.7818 ± 0.02730.7783 ± 0.0295

Replace placeholders with your computed nested‑CV statistics from runs to finalize this table.

7. Discussion

Ego‑QW captures local interference patterns and short‑hop structure under qubit caps, while CTQW captures global connectivity via spectral dynamics; SP and WL remain strong classical baselines encoding geodesic and refined neighborhood structure.

Future hybrid designs may combine local quantum features with global classical summaries, or leverage higher‑order Trotterization and compilation to improve fidelity‑vs‑depth trade‑offs in circuit‑based encodings.

8. Limitations and Future Work

Our innovation

Full‑graph quantum walk embeddings scale qubits with nodes, so we pivot to ego‑graphs that bound circuit width while preserving discriminative local walk dynamics around chemically meaningful centers.

Each ego‑subgraph is encoded by a compact Trotterized circuit using QURI Parts, features are extracted from observables, and aggregated over time points and centers to yield a fixed‑size vector per molecule.

This decouples hardware width from global graph size by capping qubits at the max ego size, enabling NISQ‑feasible circuits without losing salient local patterns.

9. Team

Team The Cat’s Cradle acknowledges the following contributors and contacts for this project.

NameDiscord Handle
Jajapuram Shiva Sai@frosty
Amon Koike@thedaemon_AK
Ramesh Makwana@Ramesh Makwana
Dr. Sushant Tapase@Dr-Sushant

10. Reproduce

Open the src folder, choose the required notebook (.ipynb), and click “Open in Colab” to run.

References

  1. X. Ai et al., “Towards Quantum Graph Neural Networks: An Ego‑Graph Learning Approach,” arXiv:2201.05158v3 (2024). HTML.
  2. D. Aharonov, A. Ambainis, J. Kempe, U. Vazirani, “Quantum Walks on Graphs,” STOC’01; arXiv:quant‑ph/0012090. PDF.
  3. C. Kluber, “Trotterization in Quantum Theory,” arXiv:2310.13296. PDF.
  4. N. Shervashidze et al., “Weisfeiler‑Lehman Graph Kernels,” JMLR 12, 2539–2561 (2011). PDF.
  5. K. M. Borgwardt, H.‑P. Kriegel, “Shortest‑path kernels on graphs,” ICDM 2005. PDF.
  6. PyTorch Geometric TUDataset. docs.
  7. PyTorch Geometric utils (to_networkx, conversions). utils.
  8. QURI Parts simulator and states. simulator, states.
  9. Continuous‑time quantum walk overview. overview.
  10. scikit‑learn SVC and RBF kernel. SVC, rbf_kernel.

Back to top