Master's Thesis Research
This repository contains the data engineering pipeline required to convert the Waymo Open Motion Dataset into multi-dimensional, deep-learning-ready tensors optimized for Controllable Diffusion Models (MotionDiffuser).
- Raw Ingestion: Deserializes the Protocol Buffer records stored in `.tfrecord` files.
- Ego-Centric Normalization: Translates global coordinates into an ego-centric reference frame with the ego vehicle at the origin `(0,0)`, then applies rotation matrices to align agent trajectory vectors.
- Tensor Engineering: Pads variable-size traffic scenarios into uniform `[64, 91, 6]` tensors and adds binary valid masks for Transformer attention layers.
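
The normalization and padding steps can be illustrated with a minimal NumPy sketch. This is not the code in `build_tensors.py`: the function names `to_ego_frame` and `pad_scenario` are hypothetical, and reading `[64, 91, 6]` as (agents, timesteps, features) is an assumption, as is the per-timestep mask of shape `[64, 91]`.

```python
import numpy as np

# Shape taken from the README; the axis meaning (agents, timesteps, features) is assumed.
MAX_AGENTS, NUM_STEPS, NUM_FEATURES = 64, 91, 6


def to_ego_frame(xy, ego_xy, ego_heading):
    """Move the ego to (0, 0), then rotate by -ego_heading so the ego faces +x.

    xy: (..., 2) global positions, ego_xy: (2,), ego_heading: radians.
    """
    c, s = np.cos(-ego_heading), np.sin(-ego_heading)
    rot = np.array([[c, -s], [s, c]])
    # Row-vector convention: multiply by the transpose of the rotation matrix.
    return (xy - ego_xy) @ rot.T


def pad_scenario(agents, step_valid=None):
    """Pad (num_agents, NUM_STEPS, NUM_FEATURES) to MAX_AGENTS and build a valid mask.

    Agents beyond MAX_AGENTS are dropped; padded slots stay zero with mask False.
    step_valid: optional (num_agents, NUM_STEPS) boolean array of observed steps.
    """
    n = min(agents.shape[0], MAX_AGENTS)
    tensor = np.zeros((MAX_AGENTS, NUM_STEPS, NUM_FEATURES), dtype=np.float32)
    mask = np.zeros((MAX_AGENTS, NUM_STEPS), dtype=bool)
    tensor[:n] = agents[:n]
    mask[:n] = True if step_valid is None else step_valid[:n]
    return tensor, mask


# Usage: a point 2 m ahead of an ego that faces the global +y direction.
ahead = to_ego_frame(np.array([[10.0, 7.0]]), np.array([10.0, 5.0]), np.pi / 2)

# Usage: a scenario with only 3 agents is padded to 64 slots.
tensor, mask = pad_scenario(np.ones((3, NUM_STEPS, NUM_FEATURES), dtype=np.float32))
print(ahead.round(3), tensor.shape, mask.sum(axis=1)[:4])
```

An attention layer can then use the mask to ignore padded agents and unobserved timesteps, so the zero padding does not influence the model.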
To generate the final .npy tensors:
cd src/
python3 build_tensors.py