To run the experiments, you need to have LibTorch installed. Download it from the following link:
The datasets required for the experiments can be downloaded from the following Dropbox link:
After downloading, follow these steps:
- Create a `data` folder in the root directory of your project (`./`).
- Move all downloaded files to `./data/`.
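The two steps above can be scripted as follows. The download location (`~/Downloads`) and file pattern are assumptions; point them at wherever the Dropbox archive was actually saved:

```shell
# Create the data folder in the project root (run from the repository root).
mkdir -p ./data
# Move the downloaded dataset files into it; the source path and pattern
# are assumptions -- adjust them to your actual download location.
mv ~/Downloads/*.csv ./data/ 2>/dev/null || true
```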
Here are some visualizations of the dataset distributions used in the experiments:
Ensure that the experiment configurations are correctly set up by checking the `exp_config` folder, and adjust them as necessary for your experiments. For example:
```json
{
  "experiments": [
    {
      "available": true,
      "data": {
        "size": 100000000,
        "dimensions": 2,
        "distribution": "us",
        "skewness": 1,
        "bounds": [
          [0, 1],
          [0, 1]
        ]
      },
      "workloads": [
        "point_query_only.json",
        "range_query_only.json",
        "knn_query_only.json"
      ],
      "baseline": [
        {
          "name": "rankspace",
          "available": false,
          "config": {
            "fill_factor": 1.0,
            "page_size": 100,
            "bit_num": 32
          }
        },
        {
          "name": "kdgreedy",
          "available": true,
          "config": {
            "page_size": 100
          }
        }
      ]
    }
  ]
}
```

Explanation:
- `experiments`: An array containing experiment configurations.
  - `available`: A boolean indicating whether the experiment is available to run.
  - `data`: Describes the dataset used in the experiment.
    - `size`: The number of data points in the dataset.
    - `dimensions`: The number of dimensions (features) in the dataset.
    - `distribution`: The distribution type of the dataset (e.g., "us" for U.S. region-based distribution).
    - `skewness`: The skewness level of the data distribution, with `1` indicating a specific skewness degree.
    - `bounds`: The range of values for each dimension in the dataset, given as an array of min-max pairs.
  - `workloads`: A list of workload files specifying the types of queries to be executed (e.g., point, range, k-NN queries).
  - `baseline`: An array of baseline methods used for comparison in the experiment.
    - `name`: The name of the baseline method.
    - `available`: A boolean indicating whether the baseline method is available for the experiment.
    - `config`: Configuration parameters specific to the baseline method.
      - `fill_factor`: (For `rankspace`) The fill factor of the index structure.
      - `page_size`: The size of each page (node) in the index.
      - `bit_num`: (For `rankspace`) The number of bits used in the rank space method.
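As a sanity check on the fields explained above, the sketch below parses the example configuration and selects only the experiments and baselines flagged as `available`. The config text is the README's own example, inlined so the snippet is self-contained:

```python
import json

# The example configuration from this README, inlined as a string.
config_text = """
{
  "experiments": [
    {
      "available": true,
      "data": {"size": 100000000, "dimensions": 2, "distribution": "us",
               "skewness": 1, "bounds": [[0, 1], [0, 1]]},
      "workloads": ["point_query_only.json", "range_query_only.json",
                    "knn_query_only.json"],
      "baseline": [
        {"name": "rankspace", "available": false,
         "config": {"fill_factor": 1.0, "page_size": 100, "bit_num": 32}},
        {"name": "kdgreedy", "available": true, "config": {"page_size": 100}}
      ]
    }
  ]
}
"""

config = json.loads(config_text)

# Keep only experiments flagged as available, and within each experiment,
# only the baseline methods flagged as available.
runnable = [
    (exp["data"]["distribution"],
     [b["name"] for b in exp["baseline"] if b["available"]])
    for exp in config["experiments"]
    if exp["available"]
]
print(runnable)  # [('us', ['kdgreedy'])]
```

Here `rankspace` is skipped because its `available` flag is `false`, so only `kdgreedy` would run against the `us` dataset.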
- Install Extended Libspatialindex:
  - Follow the instructions in the Installation Guide to install the extended version of `libspatialindex`.
- Verify Installation:
  - Run `check_env.sh` to verify that `libspatialindex` is correctly installed.
- Update Environment Variables:
  - Replace the following line in your environment setup with the path to your own installed `libtorch` library:
    `export LD_LIBRARY_PATH=/home/liuguanli/Documents/libtorch/lib:$LD_LIBRARY_PATH`
- Configure and Run Experiments:
  - In `run_exp_from_config.py`, set `RUN_EXAMPLE = True` if you want to run the example configurations.
  - To run experiments:
    - Use `point_range_knn_queries` for all query-only workloads.
    - Use `["write_only", "read_heavy_only", "write_heavy_only"]` for insertion-related workloads.
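The verification and environment steps above can be approximated with the snippet below. The library search paths and the `libtorch` path are placeholders, and the probe is only a stand-in for whatever `check_env.sh` actually does:

```shell
# Stand-in for check_env.sh: look for an installed libspatialindex shared
# library in common install locations (paths are assumptions).
found=no
for dir in /usr/local/lib /usr/lib; do
    if ls "$dir"/libspatialindex* >/dev/null 2>&1; then
        found=yes
    fi
done
echo "libspatialindex present: $found"

# Point the dynamic loader at your own libtorch install (path is a placeholder).
export LD_LIBRARY_PATH=/path/to/your/libtorch/lib:$LD_LIBRARY_PATH
```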
```python
def main():
    global logger
    configs = []
    if RUN_EXAMPLE:
        if RUN_ALL_BASELINE_EXAMPLE:
            configs = ["example_config_all_baselines.json",
                       "example_config_all_baselines_insert.json",
                       "example_config_all_baselines_read_heavy.json",
                       "example_config_all_baselines_write_heavy.json"]
            configs = ["example_config_all_baselines_point_rank_space_100m.json"]
        else:  # for debugging a specific index
            configs = ["example_config_debug_bmtree.json"]
    else:
        directory = CONFIG_DIR
        # First run point_range_knn_queries to make sure queries are
        # generated first for the RL-based methods.
        special_candidate = "point_range_knn_queries"
        for root, dirs, files in os.walk(directory):
            if root.split("/")[-1] == special_candidate:
                for file in files:
                    if file.endswith(".json"):
                        config_file_path = os.path.join(root, file)
                        configs.append(config_file_path)
        candidates = ["write_only", "read_heavy_only", "write_heavy_only"]
        for root, dirs, files in os.walk(directory):
            if root.split("/")[-1] not in candidates:
                continue
            for file in files:
                if file.endswith(".json"):
                    config_file_path = os.path.join(root, file)
                    configs.append(config_file_path)
```

To run all the experiments, simply execute the following command in your terminal:
```bash
bash run_all.sh
```