Implement NASA flight data loader by ThreeMonth03 · Pull Request #696 · solvcon/modmesh

ThreeMonth03 · 2026-03-17T16:08:56Z

I implement the data loader of NASA flight dataset to help us finish the future work in issue #685. As for the usage of data loader, you can see the following command and understand how to get data from a iterator.

(base) threemonth03@threemonthwin:~/Downloads/modmesh$ python3 modmesh/track/load.py
Loaded 120791 events.
Event 0:
  timestamp=1.60259601021904e+18
  source=ground_truth
  data=GroundTruthData(con_pos_in_ecef=array([-1387897.36558835, -5268929.31643981,  3306577.65409484]), con_vel_in_ecef=array([ 0.00220006, -0.0081095 ,  0.00414972]), quat_con_ecef=array([-0.45840088, -0.1767584 ,  0.51147502,  0.70499532]))

Event 1:
  timestamp=1.60259601022904e+18
  source=ground_truth
  data=GroundTruthData(con_pos_in_ecef=array([-1387897.36564722, -5268929.31643027,  3306577.65410693]), con_vel_in_ecef=array([ 0.00197762, -0.00815516,  0.00414352]), quat_con_ecef=array([-0.45840118, -0.17675819,  0.51147449,  0.70499557]))

Event 2:
  timestamp=1.6025960102293e+18
  source=imu
  data=IMUData(dvel_in_imu=array([-0.18792725, -0.00048828, -0.04785156]), dangle_in_imu=array([-1.90734863e-06,  3.81469727e-06,  1.90734863e-06]))

Event 3:
  timestamp=1.60259601023904e+18
  source=ground_truth
  data=GroundTruthData(con_pos_in_ecef=array([-1387897.36570593, -5268929.31642279,  3306577.65411988]), con_vel_in_ecef=array([ 0.00190321, -0.00814863,  0.00428433]), quat_con_ecef=array([-0.45840138, -0.17675799,  0.51147416,  0.70499573]))

Event 4:
  timestamp=1.60259601024904e+18
  source=ground_truth
  data=GroundTruthData(con_pos_in_ecef=array([-1387897.36576642, -5268929.31641798,  3306577.65413258]), con_vel_in_ecef=array([ 0.00201493, -0.00820102,  0.00423451]), quat_con_ecef=array([-0.45840157, -0.17675801,  0.51147413,  0.70499562]))

ThreeMonth03

@yungyuc Please review this pull request when you are available. Thanks.

ThreeMonth03 · 2026-03-17T16:10:05Z

modmesh/track/load.py

+@dataclass
+class GroundTruthData:
+    """
+    Data structure for ground truth data.
+    """
+    con_pos_in_ecef: np.ndarray
+    con_vel_in_ecef: np.ndarray
+    quat_con_ecef: np.ndarray
+
+
+@dataclass
+class IMUData:
+    """
+    Data structure for IMU data.
+    """
+    dvel_in_imu: np.ndarray
+    dangle_in_imu: np.ndarray
+
+
+@dataclass
+class LidarData:
+    """
+    Data structure for Lidar data.
+    """
+    range_m: np.ndarray
+    doppler_speed_mps: np.ndarray


Data structure from different sensors.

ThreeMonth03 · 2026-03-17T16:11:18Z

modmesh/track/load.py

+@dataclass(order=True)
+class SensorEvent:
+    """
+    A generic sensor event with a timestamp,
+    source identifier, and data payload.
+    """
+    timestamp: float
+    source: str = field(compare=False)
+    data: Any = field(compare=False)


A generic sensor event with a timestamp,
source identifier, and data payload.

ThreeMonth03 · 2026-03-17T16:13:33Z

modmesh/track/load.py

+    def load(self):
+        """
+        Load imu, lidar, and ground-truth data,
+        then sort events by timestamp.
+        """
+        self.load_imu()
+        self.load_lidar()
+        self.load_ground_truth()
+        self.events.sort()
+


When calling this function, this function will load data from different files, then sort the data based on the order of timestamps.

ThreeMonth03 · 2026-03-17T16:14:15Z

modmesh/track/load.py

+def main():
+    dataset = FlightDataset()
+    dataset.load()
+    print(f"Loaded {len(dataset)} events.")
+    it = dataset.iter_events()
+    for i in range(5):
+        event = next(it)
+        print(f"Event {i}:")
+        print(f"  timestamp={event.timestamp}")
+        print(f"  source={event.source}")
+        print(f"  data={event.data}")
+        print()


A simple usage to load dataset by iterator.

Why do you ever need to have this CLI in the first place? It looks like just an example. If you only need an example, why don't you just create unit tests?

yungyuc

There was already a script track/dataset.py, and this PR adds a new one. They are not integrated while they should be. It is also questionable that why do you need CLI for load.py in this PR.

Points to address:

Clarify why the CLI is ever needed.
Add unit tests. The code does not look like friendly to testing. You may need to think about how to restructure to make pretty testing.
Do not directly import module content.
Clarify if it is better to make FlightDataset an iterator.
Name mangling is not necessary. Use _load_text() instead.

The code quality is not very good. We could need more discussions.

yungyuc · 2026-03-18T23:33:47Z

modmesh/track/load.py

+from dataclasses import dataclass, field
+import numpy as np
+from pathlib import Path
+from typing import Any, Iterable, Optional


Do not directly import module content. That is, do not from ... import .... Use only import ... for all modules and entities.

import typing, and then consume the contents like typing.Any, instead of from typing import Any; Any. The former removes the module name at the consuming site.

yungyuc · 2026-03-18T23:37:27Z

modmesh/track/load.py

+    dataset = FlightDataset()
+    dataset.load()
+    print(f"Loaded {len(dataset)} events.")
+    it = dataset.iter_events()


What is preventing you from using enumerate()?

for i, event in enumerate(dataset): if i > 5: break

Making dataset an iterator is more straigh-forward.

There are around 130000 event data points, I'm not sure whether it is a good idea without iterator.

enumerate() uses the iterator protocol as well.

yungyuc · 2026-03-18T23:38:30Z

modmesh/track/load.py

+    data: Any = field(compare=False)
+
+
+class FlightDataset:


It is not a good idea for high-performance computing to rely on looping in Python. As a prototype, it's fine. But eventually you will need to get this thing into C++ for speed and saving memory.

yungyuc · 2026-03-18T23:39:37Z

modmesh/track/load.py

+        """
+        return self.events[idx]
+
+    def iter_events(self, sources: Optional[str | Iterable[str]] = None):


Typing information in Python will get in your way when moving the code into C++.

Let Python be dynamic. If you need typing information, it is a sign that the code needs to be in C++ soon.

yungyuc · 2026-03-18T23:40:21Z

modmesh/track/load.py

+        Iterate over events of one or more sources.
+
+        :param sources: Event source(s). If None, iterate over all events.
+        Sources can be a single source string or an iterable of source strings.


Sphinx docstring format is off.

yungyuc · 2026-03-18T23:41:01Z

modmesh/track/load.py

+        """
+        if sources is None:
+            yield from self.events
+            return


Why do you return after yeild? Is it ever reaching return?

yungyuc · 2026-03-18T23:42:12Z

modmesh/track/load.py

+        Load imu data.
+        """
+        data = self.__load_text(self.imu_csv)
+        for row in data:


May be very slow compare to C++. I would guess 20-100x.

yungyuc · 2026-03-18T23:42:52Z

modmesh/track/load.py

+            )
+            self.events.append(event)
+
+    def __load_text(self, path):


Name mangling is not necessary. Use _load_text() instead.

yungyuc · 2026-03-18T23:49:35Z

modmesh/track/load.py

+def main():
+    dataset = FlightDataset()
+    dataset.load()
+    print(f"Loaded {len(dataset)} events.")
+    it = dataset.iter_events()
+    for i in range(5):
+        event = next(it)
+        print(f"Event {i}:")
+        print(f"  timestamp={event.timestamp}")
+        print(f"  source={event.source}")
+        print(f"  data={event.data}")
+        print()


Why do you ever need to have this CLI in the first place? It looks like just an example. If you only need an example, why don't you just create unit tests?

yungyuc · 2026-03-20T12:21:19Z

@ThreeMonth03 update?

ThreeMonth03 commented Mar 17, 2026

View reviewed changes

yungyuc requested changes Mar 18, 2026

View reviewed changes

yungyuc changed the title ~~Implement NASA flight data loader.~~ Implement NASA flight data loader Mar 18, 2026

ThreeMonth03 force-pushed the data_processing branch 2 times, most recently from 6fb857f to 0ef7fc3 Compare March 20, 2026 16:09

Implement NASA flight data loader.

022c53d

ThreeMonth03 force-pushed the data_processing branch from 0ef7fc3 to 022c53d Compare March 20, 2026 16:10

Conversation

ThreeMonth03 commented Mar 17, 2026

Uh oh!

ThreeMonth03 left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

yungyuc left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

yungyuc commented Mar 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants