Graph neural networks and transportation
Background
In most of the work we have done so far, the workflow has been simple: from some feature vectors, predict either a number or a class for the underlying objects. But there are settings where additional structure can be used to make even better predictions.
Graphs can be directed or undirected and are natural models for relationships. For instance, the graph below models friendships in a social network. It has never been clear to me whether a friendship graph should be directed or undirected.
image by Gemini
Suppose that in addition to being given a feature vector for each node, we are given the graph of relationships between them. An active area of research is how best to use the graph structure to augment our predictions. There are numerous applications, including predicting shapes of proteins, supply chain logistics, stock price prediction, social recommenders, and modeling transportation networks. The goal of this project is to learn about a novel type of neural network which naturally understands the graph structure of your data.
GNNs
I will not be able to do graph neural networks justice in one or two paragraphs; this study has become a field that is as diverse as it is important. The key idea is to take advantage of the graph structure to inform how a prediction is made from the feature vector of each node in the graph. There are many ways to do this.
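To make the key idea concrete, here is a minimal sketch of one common way to use the graph: a single message-passing layer in which each node averages its own feature vector with those of its neighbors and then passes the result through a dense layer with a ReLU. This is only one of the many possible schemes, not the definitive one. The toy graph, the function names, and the weights are all hypothetical, and the code is plain Python with no dependencies so that the mechanics stay visible.

```python
def mean_aggregate(features, neighbors):
    """For each node, average its own feature vector with those of its neighbors."""
    aggregated = {}
    for node, own in features.items():
        group = [own] + [features[n] for n in neighbors.get(node, [])]
        dim = len(own)
        aggregated[node] = [sum(vec[i] for vec in group) / len(group) for i in range(dim)]
    return aggregated


def linear_relu(vector, weights, bias):
    """Apply a dense layer followed by ReLU: max(0, W x + b), one row of W at a time."""
    return [
        max(0.0, sum(w * x for w, x in zip(row, vector)) + b)
        for row, b in zip(weights, bias)
    ]


def message_passing_layer(features, neighbors, weights, bias):
    """One round of message passing: aggregate over neighbors, then transform."""
    aggregated = mean_aggregate(features, neighbors)
    return {node: linear_relu(vec, weights, bias) for node, vec in aggregated.items()}


# Hypothetical toy graph: an undirected path A - B - C with 2-dimensional features.
features = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [1.0, 1.0]}
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}

# Hypothetical weights. In a real model these are learned; the identity matrix is
# used here so that the output shows the effect of the aggregation step alone.
weights = [[1.0, 0.0], [0.0, 1.0]]
bias = [0.0, 0.0]

updated = message_passing_layer(features, neighbors, weights, bias)
print(updated)
```

After one layer, each node's vector reflects its immediate neighborhood; stacking k such layers lets information travel k hops. Swapping the mean for a sum, a maximum, or an attention-weighted average gives other members of the same family.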
- Begin by reading A Gentle Introduction to Graph Neural Networks. Additional resources include the proto-book Geometric Deep Learning and Stanford’s CS224w. Your goal is to understand the various ways message passing between nodes can occur and how one can take advantage of the underlying graph structure.
- You may choose any data set to study using GNNs, but I would love to see the analysis of some transportation network; things like roads and subways come with a very natural graph structure. Some examples include Boston T data, the Caltrans Performance Measurement System (PeMS), and Bluebikes data. The PeMS and METR-LA datasets are available via direct import in Python. City2Graph is a Python library bridging the gap between raw transit data and GNN-ready tensors. Your favorite large language model will help you find others.
image from r/boston
- There are a number of interesting questions your project can address. I don’t want to stifle your creativity, but here are some to think about:
  - Demand Forecasting: “How many bikes will be docked at the Kendall Square station in 30 minutes?”
  - Role Identification: “Based on usage patterns, which stations function as ‘Transit Hubs’ vs. ‘Residential Spokes’?”
  - Congestion Propagation: “If a ‘slow zone’ appears on the Red Line, which connected edges in the network are most likely to experience a ‘secondary delay’?”
- There are several software libraries you can use to build GNNs. PyG and DGL are standard, but I hear that Spektral is very simple to use. Feel free to try them all. If coding from scratch is not your thing, you should be able to generate fairly good code using an LLM of your choice; just make sure you know what it is doing!
- In your writeup, make sure to introduce the reader to graph neural networks and especially the particular flavor you chose for your project, and detail your experiment. What is the relationship of your work to existing research in this area?
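As you detail your experiment, it helps to be explicit about how the raw data became training examples. For a question like the demand forecasting one above, one possible framing is a sliding window: the model sees the last few observations at every station and predicts the value some number of steps ahead. The sketch below is one way to build such examples in plain Python. The function name `make_windows`, the 15-minute bin size, and the counts are all hypothetical and chosen only for illustration.

```python
def make_windows(series, window, horizon):
    """Turn per-node time series into (inputs, target) training pairs.

    series:  dict mapping node name -> list of observations at equal time steps
    window:  number of past steps the model is allowed to see
    horizon: how many steps past the last observed one to predict
    """
    nodes = sorted(series)
    length = min(len(series[n]) for n in nodes)
    samples = []
    # t is the index of the first step the model has NOT yet seen.
    for t in range(window, length - horizon + 1):
        inputs = {n: series[n][t - window:t] for n in nodes}
        target = {n: series[n][t + horizon - 1] for n in nodes}
        samples.append((inputs, target))
    return samples


# Hypothetical docked-bike counts in 15-minute bins (made-up numbers).
series = {
    "Kendall": [5, 7, 6, 4, 3, 8],
    "Central": [2, 2, 3, 5, 6, 4],
}

# With 15-minute bins, horizon=2 means "30 minutes after the last observation".
samples = make_windows(series, window=3, horizon=2)

for inputs, target in samples:
    print(inputs, "->", target)
```

Each sample pairs a window of node features with a per-node target, so the graph of stations can be supplied separately and shared across all samples. When splitting into training and test sets, splitting by time rather than at random avoids leaking future observations into training.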