docs/user_guide/01_Reading_data.md: 8 additions & 6 deletions
@@ -2,11 +2,12 @@
 
 ## The DeepForest data model
 
-The DeepForest data model has three components
+The DeepForest data model has four components:
 
-1. Annotations are stored as dataframes. Each row is an annotation with a single geometry and label. Each annotation dataframe must contain a 'image_path', which is the relative, not full path to the image, and a 'label' column.
+1. Annotations are stored as dataframes. Each row is an annotation with a single geometry and label. Each annotation dataframe must contain a 'image_path', which is the basename, not full path to the image, and a 'label' column.
 2. Annotation geometry is stored as a shapely object, allowing the easy movement among Point, Polygon and Box representations.
 3. Annotations are expressed in image coordinates, not geographic coordinates. There are utilities to convert geospatial data (.shp, .gpkg) to DeepForest data formats.
+4. A root_dir attribute that specifies where the images are stored. A Dee
 
 ## The read_file function
 DeepForest has collated many use cases into a single `read_file` function that will read many common data formats, both projected and unprojected, and create a dataframe ready for DeepForest functions that fits the DeepForest data model.
@@ -16,18 +17,19 @@ DeepForest has collated many use cases into a single `read_file` function that w
 ```
 from deepforest import utilities
 
-df = utilities.read_file("annotations.csv", image_path="<full path to the image>", label="Tree")
+df = utilities.read_file("annotations.csv", root_dir="directory containing images", image_path="<relative path to the image>", label="Tree")
 ```
 
-For files that lack an `image_path` or `label` column, pass the `image_path` or `label` argument.
+For files that lack an `image_path` or `label` column, pass the `image_path` or `label` argument. This applies the same image_path and label for the entire file, and is not appropriate for multi-image files.
 
 ```python
 from deepforest import utilities
 
 gdf = utilities.read_file(
     input="/path/to/annotations.shp",
-    image_path="/path/to/OSBS_029.tif",  # required if no image_path column
-    label="Tree"  # optional: used if no 'label' column in the shapefile
+    image_path="OSBS_029.tif",  # required if no image_path column
+    root_dir="path/to/images/",  # required if image_path argument is used
+    label="Tree"  # optional: used if no 'label' column in the shapefile

-            "No image_path column found in dataframe and image_path argument not specified, please specify full path to image file in image_path argument: read_file(input=df, image_path='/path/to/image.tif', ...)"
+            "No image_path column found in GeoDataframe and image_path argument not specified, please specify the root_dir and image_path arguments: read_file(input=df, root_dir='path/to/images/', image_path='image.tif', ...)"
-            f"Image file {full_image_path} not found, please check the image_path argument, it should be the full path: read_file(input=df, image_path='/path/to/image.tif', ...)"
-        )
+    if "image_path" in gdf.columns:
+        existing_image_path = gdf.image_path.unique()[0]
+        if len(existing_image_path) > 1:
+            warnings.warn(
+                f"Multiple image_paths found in dataframe: {existing_image_path}, overriding and assigning {image_path} to all rows!",
+                stacklevel=2,
+            )
+        if existing_image_path != image_path:
+            warnings.warn(
+                f"Image path {existing_image_path} found in dataframe, overriding and assigning {image_path} to all rows!",
+                stacklevel=2,
+            )
+        gdf["image_path"] = image_path
+    else:
+        gdf["image_path"] = image_path
 
-    return full_image_path
+    return gdf
 
 
 def __shapefile_to_annotations__(
-    gdf: str | gpd.GeoDataFrame,
-    root_dir: str | None = None,
-    image_path: str | None = None,
+    gdf: gpd.GeoDataFrame,
 ) -> gpd.GeoDataFrame:
     """Convert geospatial annotations to DeepForest format.
 
     Args:
-        gdf: A GeoDataFrame with a geometry column and an image_path column. If the image_path column is not present, it will be added using the image_path argument.
-        image_path: Full path to the image file.
-        root_dir: Root directory of the image files. If not provided, it will be inferred from the image_path column.
+        gdf: A GeoDataFrame with a geometry column and an image_path column.
 
     Returns:
         GeoDataFrame with annotations in DeepForest format.
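
The `image_path` override added earlier in this hunk can be sketched in isolation. This is a minimal, hypothetical analogue and not DeepForest's implementation: plain dictionaries stand in for GeoDataFrame rows so that it runs without geopandas, the name `assign_image_path` is invented for illustration, and it assumes the first warning is meant to fire when the input holds more than one distinct path, so it counts distinct values.

```python
import warnings


def assign_image_path(rows, image_path):
    """Return copies of `rows` with every 'image_path' set to `image_path`.

    Hypothetical stand-in: `rows` is a list of dicts, not a GeoDataFrame.
    """
    # Distinct paths already present in the input (empty if no row has one)
    existing = {row["image_path"] for row in rows if "image_path" in row}

    if len(existing) > 1:
        warnings.warn(
            f"Multiple image_paths found: {sorted(existing)}, "
            f"overriding and assigning {image_path} to all rows!",
            stacklevel=2,
        )
    elif existing and existing != {image_path}:
        warnings.warn(
            f"Image path {next(iter(existing))} found, "
            f"overriding and assigning {image_path} to all rows!",
            stacklevel=2,
        )

    # The argument always wins, whether or not a path was already present
    return [{**row, "image_path": image_path} for row in rows]
```

Calling `assign_image_path(rows, "OSBS_029.tif")` on rows that reference two different images emits one warning and returns new rows that all point at `OSBS_029.tif`; the input rows are left unmodified.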
-        image_path: Assign image_path column to all rows. The full path to the image file.
-        label: Assign a single label column to all rows.
+        image_path: Path relative to root_dir to a single image that will be assigned as the image_path column for all annotations. The full path will be constructed by joining the root_dir and the image_path. Overrides any image_path column in input.
+        label: Single label to be assigned as the label for all annotations. Overrides any label column in input.
 
-    Notes:
-        The image_path and label arguments are applied to all rows in the dataframe or shapefile and therefore should only be used in cases where all rows have the same image_path and label.
     Returns:
         GeoDataFrame with geometry, image_path, and label columns
     """
+    # Check arguments
+    if image_path is not None and root_dir is None:
+        raise ValueError(
+            "root_dir argument must be specified if image_path argument is used"
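
The argument check that closes the hunk above, together with the path construction described in the updated docstring, can be sketched as a small standalone function. This is a minimal sketch under assumptions: `resolve_image_path` is a hypothetical helper name that does not appear in the diff, and joining with `os.path.join` is assumed from the docstring's wording ("joining the root_dir and the image_path").

```python
import os


def resolve_image_path(image_path, root_dir):
    """Validate the two arguments and build the full path to the image.

    Hypothetical helper for illustration; not part of DeepForest.
    """
    # Mirror the check in the diff: image_path is only usable with root_dir
    if image_path is not None and root_dir is None:
        raise ValueError(
            "root_dir argument must be specified if image_path argument is used"
        )
    # Nothing to resolve when no image_path override is given
    if image_path is None:
        return None
    # image_path is relative to root_dir, so the full path is their join
    return os.path.join(root_dir, image_path)
```

With `image_path="OSBS_029.tif"` and `root_dir="path/to/images/"`, the function returns the joined path; passing `image_path` without `root_dir` raises `ValueError`.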