32 changes: 30 additions & 2 deletions README.md
@@ -55,8 +55,10 @@ Note: Apart from validation, **vehicle_type** and **vehicle_speed** information

# Important Resources

* Input format: [Link](assets/docs/Tracely%20I_O%20Structure%20-%20input.pdf)
* Output format: [Link](assets/docs/Tracely%20I_O%20Structure%20-%20output.pdf)
* Input format for CleanTrace Class: [Link](assets/docs/Tracely%20I_O%20Structure%20-%20input%20for%20CleanTrace%20class.pdf)
* Input format for calculate_trace_similarity method: [Link](assets/docs/Tracely%20I_O%20Structure%20-%20input%20for%20calculate_trace_similarity%20method.pdf)
* Output format of get_trace_cleaning_output method: [Link](assets/docs/Tracely%20I_O%20Structure%20-%20output%20of%20get_trace_cleaning_output%20method.pdf)
* Output format of calculate_trace_similarity method: [Link](assets/docs/Tracely%20I_O%20Structure%20-%20output%20of%20calculate_trace_similarity%20method.pdf)
* Exception handling document: [Link](assets/docs/Tracely%20I_O%20Structure%20-%20exception_handling.pdf)
* Function's documentation: [Link](assets/docs/functions_documentation.md)

@@ -71,6 +73,7 @@ Note: Apart from validation, **vehicle_type** and **vehicle_speed** information
>>> ./install_tracely.sh
>>> python -m examples.trace_cleaning_example
>>> python -m examples.stop_summary_example
>>> python -m examples.trace_similarity_example
```


@@ -135,6 +138,31 @@ Additionally, we have also provided a helper script `install_osrm.sh` using whic
* Example map illustration
![Dual Map for stop points](assets/images/stop_points_image.png)

* Users can calculate the similarity between two traces, and optionally visualize them on a map, using the `calculate_trace_similarity` function. Example usage:
```python
from tracely.trace_similarity import calculate_trace_similarity

# Define two traces with their respective latitudes, longitudes, and timestamps
trace_1 = [
    [28.6139, 77.2090, 1706874347094],
    [28.7041, 77.1025, 1706874348094],
    # Add more points as needed
]
trace_2 = [
    [28.6140, 77.2095, 1706874347094],
    [28.7030, 77.1030, 1706874348094],
    # Add more points as needed
]

# Define thresholds for distance and time
distance_threshold = 50 # meters
time_threshold = 1000 # milliseconds

# Calculate trace similarity
similarity_result = calculate_trace_similarity(trace_1, trace_2, distance_threshold, time_threshold, plot_map=True)
```
* Example map illustration
![Dual Map for traces](assets/images/trace_overlap_image.png)

# Contact
In case of any issues or suggestions, reach out at: tracely@delhivery.com
Binary file modified assets/docs/Tracely I_O Structure - exception_handling.pdf
Binary file removed assets/docs/Tracely I_O Structure - input.pdf
46 changes: 46 additions & 0 deletions assets/docs/functions_documentation.md
@@ -205,3 +205,49 @@ clean_trace = clean_output["cleaned_trace"]
clean_trace_obj.plot_cleaning_comparison_map(raw_trace, clean_trace)
```


## 4. Tracely has the following function for calculating the similarity of two traces:

### 4.1 Calculate Trace Similarity

#### Description
The `calculate_trace_similarity` function measures the spatial and temporal similarity between two traces. It identifies overlapping pings within the specified distance and time thresholds, computes a similarity percentage for each trace with respect to the other, and optionally visualizes the traces and their overlap on a map. Distances between pings are calculated with the Haversine formula.
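
The description above can be illustrated with a minimal, brute-force sketch. This is not Tracely's implementation: the helper names `haversine_m` and `sketch_similarity`, the Earth radius constant, and the assumption that the similarity percentage is the share of pings in one trace that have an overlapping ping in the other are all hypothetical and chosen only for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (assumed value)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def sketch_similarity(trace_a, trace_b, distance_threshold, time_threshold):
    """Hypothetical one-directional similarity of trace_a with respect to trace_b.

    Returns (percentage, index_pairs). Assumption: a ping in trace_a overlaps if
    some ping in trace_b lies within both thresholds; it is paired with the
    closest such ping.
    """
    pairs = []
    for i, (lat_a, lon_a, ts_a) in enumerate(trace_a):
        best_j, best_d = None, None
        for j, (lat_b, lon_b, ts_b) in enumerate(trace_b):
            # Skip candidates outside the time window (milliseconds)
            if abs(ts_a - ts_b) > time_threshold:
                continue
            d = haversine_m(lat_a, lon_a, lat_b, lon_b)
            # Keep the closest candidate that is within the distance threshold
            if d <= distance_threshold and (best_d is None or d < best_d):
                best_j, best_d = j, d
        if best_j is not None:
            pairs.append([i, best_j])
    percentage = 100.0 * len(pairs) / len(trace_a) if trace_a else 0.0
    return percentage, pairs
```

The nested loop is quadratic in the number of pings, which keeps the sketch easy to read; a spatial index would be the usual choice for large traces.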

#### Parameters
- **trace_1** *(`list`)*: The first trace, where each ping is represented as `[latitude, longitude, timestamp]`.
- **trace_2** *(`list`)*: The second trace, where each ping is represented as `[latitude, longitude, timestamp]`.
- **distance_threshold** *(`float`)*: Maximum distance (in meters) for two pings to be considered overlapping.
- **time_threshold** *(`int`)*: Maximum allowable time difference (in milliseconds) for two pings to overlap.
- **plot_map** *(`bool`, optional)*: If `True`, a folium map is generated to visualize the traces. Defaults to `False`.


#### Behavior
The function returns a dictionary:
- **`max_similarity_percentage`** *(`float`)*: The larger of the two directional similarity percentages reported under `metadata`.
- **`metadata`** *(`dict`)*:
  - **`similarity_info_trace_1_to_2`** *(`dict`)*:
    - **`similarity_percentage`** *(`float`)*: Similarity percentage of `trace_1` with respect to `trace_2`.
    - **`overlapping_pings_indices`** *(`list`)*: List of index pairs of overlapping pings, with the `trace_1` index first and the `trace_2` index second. In each pair, the `trace_2` ping is the one closest to the `trace_1` ping among those that satisfy the time threshold.
  - **`similarity_info_trace_2_to_1`** *(`dict`)*:
    - **`similarity_percentage`** *(`float`)*: Similarity percentage of `trace_2` with respect to `trace_1`.
    - **`overlapping_pings_indices`** *(`list`)*: List of index pairs of overlapping pings, with the `trace_2` index first and the `trace_1` index second. In each pair, the `trace_1` ping is the one closest to the `trace_2` ping among those that satisfy the time threshold.
- **`plot`** *(`folium.plugins.DualMap` or `None`)*: Visualization of the traces if `plot_map=True`, otherwise `None`.
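
To make the nesting of the returned dictionary concrete, here is a small structural check. It is a sketch only: `check_similarity_result` and `example_result` are hypothetical names, the values are made up, and the checks merely mirror the documented keys and the constraints suggested by the validation messages added in this change. It is not Tracely's own validator.

```python
import math


def check_similarity_result(result, len_trace_1, len_trace_2):
    """Hypothetical structural check of the documented return value."""
    expected_keys = {"max_similarity_percentage", "metadata", "plot"}
    if set(result) != expected_keys:
        raise ValueError(f"unexpected top-level keys: {sorted(result)}")

    # For each direction: (length of the source trace, length of the other trace)
    directions = {
        "similarity_info_trace_1_to_2": (len_trace_1, len_trace_2),
        "similarity_info_trace_2_to_1": (len_trace_2, len_trace_1),
    }
    percentages = []
    for key, (len_source, len_target) in directions.items():
        info = result["metadata"][key]
        percentage = info["similarity_percentage"]
        if not 0 <= percentage <= 100:
            raise ValueError(f"{key}: percentage {percentage} outside [0, 100]")
        for pair in info["overlapping_pings_indices"]:
            if len(pair) != 2:
                raise ValueError(f"{key}: index pair {pair} must have length 2")
            source_index, target_index = pair
            if not (0 <= source_index < len_source and 0 <= target_index < len_target):
                raise ValueError(f"{key}: index pair {pair} out of range")
        percentages.append(percentage)

    # The top-level value is documented as the maximum of the two directions
    if not math.isclose(result["max_similarity_percentage"], max(percentages)):
        raise ValueError("max_similarity_percentage does not match metadata")
    return True


# Illustrative values only; not output from a real run
example_result = {
    "max_similarity_percentage": 50.0,
    "metadata": {
        "similarity_info_trace_1_to_2": {
            "similarity_percentage": 50.0,
            "overlapping_pings_indices": [[0, 1]],
        },
        "similarity_info_trace_2_to_1": {
            "similarity_percentage": 50.0,
            "overlapping_pings_indices": [[1, 0]],
        },
    },
    "plot": None,
}
check_similarity_result(example_result, len_trace_1=2, len_trace_2=2)
```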


#### Example
```python
from tracely.trace_similarity import calculate_trace_similarity

# Example traces (list of [latitude, longitude, timestamp])
trace_1 = [[19.1, 73.0, 1700000000000], [19.2, 73.1, 1700000005000]]
trace_2 = [[19.15, 73.05, 1700000003000], [19.25, 73.15, 1700000010000]]

# Calculate similarity
result = calculate_trace_similarity(trace_1, trace_2, distance_threshold=50, time_threshold=2000, plot_map=False)

# Output
max_similarity_percentage = result["max_similarity_percentage"]
metadata = result["metadata"]
plot = result["plot"]
```
Binary file added assets/images/trace_overlap_image.png
45 changes: 45 additions & 0 deletions examples/trace_similarity_example.py
@@ -0,0 +1,45 @@
import json

from src.tracely import constants
from src.tracely.trace_similarity import calculate_trace_similarity
from src.tracely.utils.utils import create_path
from tests.testing_utils import load_calculate_trace_similarity_payloads


# Example usage of calculate_trace_similarity for trace similarity calculation
if __name__ == "__main__":

    # Load existing traces
    payload = load_calculate_trace_similarity_payloads("large_trace_pair")

    trace_1 = payload["trace_1"]
    trace_2 = payload["trace_2"]

    similarity_result = calculate_trace_similarity(trace_1,
                                                   trace_2,
                                                   distance_threshold=100,
                                                   time_threshold=10000,
                                                   plot_map=True)

    # Get max_similarity_percentage and metadata from result
    similarity_result_stats = {"max_similarity_percentage": similarity_result["max_similarity_percentage"],
                               "metadata": similarity_result["metadata"]}

    # Get map plot from result
    similarity_result_plot = similarity_result["plot"]

    # Define paths
    results_base_path = constants.BASE_PATH + "example_results/"
    similarity_result_path = results_base_path + "similarity_result.json"
    similarity_map_path = results_base_path + "similarity_map.html"
    create_path(results_base_path)

    print(f"Saving results at {results_base_path}")

    # Dump similarity result statistics as json
    dump_json_file_path = similarity_result_path
    with open(dump_json_file_path, "w", encoding="utf-8") as json_file:
        json.dump(similarity_result_stats, json_file, indent=4)

    # Save map
    similarity_result_plot.save(similarity_map_path)
3 changes: 2 additions & 1 deletion pyproject.toml
@@ -25,5 +25,6 @@ dependencies = [
"pandas==2.0.3",
"polyline==2.0.2",
"pytest==8.3.2",
"requests==2.32.3"
"requests==2.32.3",
"scipy==1.15.1"
]
3 changes: 1 addition & 2 deletions src/tracely/clean_trace.py
@@ -11,8 +11,7 @@
from .utils.utils import get_haversine_distance, \
calculate_change_in_direction, \
convert_unix_timestamp_to_human_readable, \
convert_time_interval_to_human_readable, \
calculate_trace_distance
convert_time_interval_to_human_readable

from .utils.plotting_utils import plot_raw_trace_from_trace_output, \
plot_clean_trace_from_trace_output, \
19 changes: 19 additions & 0 deletions src/tracely/constants.py
@@ -221,6 +221,25 @@
]


TRACE_SIMILARITY_OUTPUT_KEYS = [
"max_similarity_percentage",
"metadata",
"plot"
]


TRACE_SIMILARITY_OUTPUT_METADATA_KEYS = [
"similarity_info_trace_1_to_2",
"similarity_info_trace_2_to_1"
]


TRACE_SIMILARITY_OUTPUT_METADATA_INDIVIDUAL_TRACE_SIM_INFO_KEYS = [
"similarity_percentage",
"overlapping_pings_indices"
]


#########################
# Interpolation constants
#########################
5 changes: 5 additions & 0 deletions src/tracely/exceptions/error_messages.py
@@ -43,6 +43,7 @@ class ValidationErrorMessage:
INVALID_PING_BATCH_SIZE_FOR_MAP_MATCHING = "ping_batch_size cannot be less than 2"
INVALID_MIN_SIZE_FOR_STOP_EVENTS_DETECTION = "min_size cannot be less than 2"
INVALID_MAX_DIST_RATIO_FOR_IMPUTATION = "max_dist_ratio can not be less than 0, but got {}"
INVALID_LENGTH_OF_PING_FOR_OVERLAP_ESTIMATION = "Each ping in input traces must strictly contain 3 elements corresponding to latitude, longitude and timestamp."

FOUND_DUPLICATE_VALUES = "Expected values for '{}' to be unique, but found duplicate values"
MISSING_MANDATORY_KEY_IN_PING = "Expected key: {} missing from a ping dictionary"
@@ -55,6 +56,10 @@ class ValidationErrorMessage:
INCORRECT_INPUT_PINGS_COUNT = "total_non_null_pings_in_input cannot be greater than total_pings_in_input"
INCORRECT_STATUS_PERCENTAGES_SUM = "Sum of percentages of various update statuses should be at least 99.9"
INCORRECT_PINGS_COUNT_IN_CLEANING_SUMMARY = "total_pings_in_input in cleaning_summary must be equal to number of pings in input payload"
INCORRECT_SIMILARITY_PERCENTAGE = "similarity percentage for similarity between 2 traces must be in range [0, 100]"
INVALID_OVERLAPPING_INDICES_LIST_LENGTH = "Length of 'overlapping_pings_indices' must be non-negative and less than the length of its corresponding trace."
INVALID_OVERLAPPING_INDEX_VALUE = "Value of index in 'overlapping_pings_indices' must be non-negative and less than the length of its corresponding trace."
INVALID_INDEX_PAIR = "Each index pair in 'overlapping_pings_indices' must be a list of length 2."


class OSRMErrorCode: