OccPy tutorial notebook for Terrestrial Laser Scanning (TLS) data¶
This notebook gives a brief overview of TLS occlusion mapping using OccPy.
The data used for this analysis were acquired with a Riegl VZ400i laser scanner in the Rameren forest close to the WSL in Birmensdorf, Switzerland. The plot is a mixed temperate forest, and the data were acquired under leaf-on conditions. The data have been heavily filtered to reduce storage needs and processing time. The tutorial data are the same as used for the manuscript currently under revision in RSE and available as a pre-print here: (TODO!)
First, we download the test data and configure the paths.
import json
import os
import shutil
from pathlib import Path
import pooch
from occpy.OccPy import OccPy
# Download test data once to cache and mirror it into the repository for stable relative paths
repo_root = Path('..', '..').resolve()
data_notebooks = repo_root / 'data_notebooks'
tls_data_dir = data_notebooks / 'TLS_demo'
data_notebooks.mkdir(parents=True, exist_ok=True)
p = pooch.create(
path=pooch.os_cache('occpy_test_data'),
base_url='https://zenodo.org/records/17750604/files/',
registry={'TLS_demo.zip': 'md5:ff15ef6b1b6e655e33c50020089dad56'},
)
p.fetch('TLS_demo.zip', processor=pooch.Unzip(members=['TLS_demo']), progressbar=True)
cache_data_dir = Path(p.path) / 'TLS_demo.zip.unzip' / 'TLS_demo'
if not tls_data_dir.exists():
shutil.copytree(cache_data_dir, tls_data_dir)
print(f'Notebook data folder: {tls_data_dir}')
config_file = str(repo_root / 'config' / 'settings_TLS_tutorial.JSON')
Updating file 'TLS_demo.zip' from 'https://zenodo.org/records/17750604/files/TLS_demo.zip' to '/home/wcherlet/.cache/occpy_test_data'. 100%|████████████████████████████████████████| 350M/350M [00:00<00:00, 417GB/s] Extracting 'TLS_demo' from '/home/wcherlet/.cache/occpy_test_data/TLS_demo.zip' to '/home/wcherlet/.cache/occpy_test_data/TLS_demo.zip.unzip'
Notebook data folder: /home/wcherlet/repos/occpy_tests/OccPy/data_notebooks/TLS_demo
OccPy configurations are saved in JSON files.
A run has a few minimal requirements:
- laz_in: input LAZ file, or a directory containing LAZ files. Given multiple LAZ files, OccPy assumes multi-position TLS; given a single LAZ file, it assumes single-position TLS unless is_mobile is set.
- plot_dim: extent of the grid for occlusion mapping: [minX, minY, minZ, maxX, maxY, maxZ]
- vox_dim: voxel size; currently this applies to the x, y, and z dimensions. Non-cubic voxels are on the list of features to be added.
Optional arguments:
- is_mobile: whether the acquisition is mobile (MLS/ULS) or static (TLS) (default: False)
- single_return: whether the data are single-return or multi-return (default: False)
- out_dir: output directory (default: ./output)
- verbose: enable verbose logging (default: False)
- debug: enable debug logging (default: False)
- lower_threshold: lower threshold above ground, in voxels, below which voxels are excluded from occlusion mapping (default: 0)
- points_per_iter: number of points read from the LAZ file in each iteration (default: 10000000)
- delimiter: CSV delimiter for the scan position file (default: ",")
- root_folder: if given, the other paths are assumed to be relative to this root folder, which is prepended to them (default: None)
- str_idxs_ScanPosID: string indices marking where the scan position identifier is written in the LAZ file name. If not given, the file name (without extension) is used as ID (default: None)
- output_voxels: whether to export .ply voxel grids (large files, slow) (default: False)
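Putting this together, a minimal configuration can be assembled as a plain Python dict and saved to JSON. This is a sketch with placeholder paths and an arbitrary plot extent, not the tutorial values:

```python
import json

# Minimal OccPy configuration (placeholder paths and plot extent, for illustration only)
minimal_config = {
    "laz_in": "data/scans/",            # directory of LAZ files -> multi-position TLS assumed
    "plot_dim": [0, 0, 0, 10, 10, 10],  # [minX, minY, minZ, maxX, maxY, maxZ]
    "vox_dim": 0.25,                    # cubic voxel edge length in metres
    "out_dir": "output/demo",           # optional: defaults to ./output
    "verbose": True,                    # optional: more detailed logging
}

with open("settings_demo.JSON", "w") as f:
    json.dump(minimal_config, f, indent=4)

print(json.dumps(minimal_config, indent=4))
```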
Additionally, we use the config file to save paths to other files, such as the positions file, DSM and DTM.
# Load config and resolve all relative paths once for downstream plotting/normalization calls
with open(config_file, 'r') as file:
config = json.load(file)
config
{'root_folder': '../..',
'laz_in': 'data_notebooks/TLS_demo/LAZ/',
'out_dir': 'output/TLS',
'vox_dim': 0.1,
'lower_threshold': 1,
'points_per_iter': 1000000,
'plot_dim': [2676515, 1246063, 545, 2676525, 1246113, 590],
'output_voxels': False,
'single_return': False,
'is_mobile': False,
'str_idxs_ScanPosID': [0, 10],
'debug': False,
'verbose': True,
'tif_in': {'DTM': 'data_notebooks/TLS_demo/Grids/Ramerenwald_DTM_20250305.tif',
'DSM': 'data_notebooks/TLS_demo/Grids/Ramerenwald_DSM_20250305.tif'},
'ScanPos': 'data_notebooks/TLS_demo/ScanPos/ScanPositions.txt'}
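As a quick sanity check on the configuration above, the voxel grid dimensions implied by plot_dim and vox_dim can be computed directly. This is a rough sketch; OccPy's internal gridding may differ, e.g. in rounding:

```python
# Values from the tutorial config
plot_dim = [2676515, 1246063, 545, 2676525, 1246113, 590]
vox_dim = 0.1

extent = [plot_dim[i + 3] - plot_dim[i] for i in range(3)]  # metres in x, y, z
n_vox = [round(e / vox_dim) for e in extent]                # voxels per axis

print(f"Plot extent (m): {extent}")  # [10, 50, 45]
print(f"Grid shape: {n_vox}")        # [100, 500, 450]
print(f"Total voxels: {n_vox[0] * n_vox[1] * n_vox[2]:,}")  # 22,500,000
```

Even this heavily reduced tutorial plot therefore spans 22.5 million voxels at 0.1 m resolution, which is why output_voxels is disabled by default.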
Running OccPy¶
The first step is to create an OccPy object from the loaded config.
test = OccPy(config=config)
2026-04-02 15:14:29,278 - INFO - Prepending root folder /home/wcherlet/repos/occpy_tests/OccPy to input and output paths.
INFO: optional arguments: ['out_dir', 'output_voxels', 'verbose', 'debug', 'lower_threshold', 'points_per_iter', 'delimiter', 'root_folder', 'single_return', 'str_idxs_ScanPosID']
Then, define the sensor positions.
The linking between IDs in the sensor position file and the LAZ files can be controlled with str_idxs_ScanPosID in the config. If this is not given, the scan file name (without extension) is used for linking.
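To illustrate, the tutorial config sets str_idxs_ScanPosID to [0, 10], which takes the first ten characters of each LAZ file name as the scan position ID. This is a sketch of the presumed behaviour; the exact matching logic lives inside OccPy:

```python
from pathlib import Path

str_idxs = [0, 10]  # str_idxs_ScanPosID from the tutorial config

# File names as used in this tutorial
laz_names = ["ScanPos033_sort_filtered.laz", "ScanPos035_sort_filtered.laz"]

# Slice the ID out of the file name (without extension)
scan_ids = [Path(name).stem[str_idxs[0]:str_idxs[1]] for name in laz_names]

print(scan_ids)  # ['ScanPos033', 'ScanPos035']
```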
test.define_sensor_pos(path2file=config['ScanPos'],  # path to the csv file with scanner positions
                       delimiter=',',  # delimiter used in the csv file
                       hdr_scanpos_id='#Name',  # header of the scan position ID column (a substring or integer that can be linked to the laz file name)
                       hdr_x='X',  # header for the x coordinate
                       hdr_y='Y',  # header for the y coordinate
                       hdr_z='Z',  # header for the z coordinate
                       sens_pos_id_offset=0  # for the specific case where the scan position ID in the file name is offset from the ID in the position file; only works if scan position names are integers
                       )
2026-04-02 15:14:29,391 - INFO - Provided path to sensor position file data_notebooks/TLS_demo/ScanPos/ScanPositions.txt is not absolute. Resolving relative to root folder /home/wcherlet/repos/occpy_tests/OccPy.
2026-04-02 15:14:29,392 - INFO - Defining sensor positions for static acquisition.
Now we can start the raytracing and save the outputs.
Note: An alternative approach is shown in TLS_notebook_individualLAZ.ipynb, where OccPy is run separately for each scan. This can be useful to gain insight into each scan's contribution to the overall occlusion.
import time
tic = time.time()
test.do_raytracing()
toc = time.time()
print(f"Raytracing took {toc - tic:.2f} seconds.")
2026-04-02 15:14:29,403 - INFO - Linked 5 TLS LAZ files to scan positions.
############################### ##### Processing ScanPos033_sort_filtered.laz... ###############################
2026-04-02 15:14:30,684 - INFO - Do raytracing with stored pulses 2026-04-02 15:14:31,118 - INFO - Time elapsed for raytracing batch: 0.43 seconds
100% [====================]
2026-04-02 15:14:32,453 - INFO - Do raytracing with stored pulses 2026-04-02 15:14:33,096 - INFO - Time elapsed for raytracing batch: 0.64 seconds
100% [====================]
2026-04-02 15:14:34,500 - INFO - Do raytracing with stored pulses 2026-04-02 15:14:35,721 - INFO - Time elapsed for raytracing batch: 1.22 seconds
100% [====================]
2026-04-02 15:14:37,189 - INFO - Do raytracing with stored pulses 2026-04-02 15:14:38,224 - INFO - Time elapsed for raytracing batch: 1.03 seconds
100% [====================]
2026-04-02 15:14:39,366 - INFO - Do raytracing with stored pulses 2026-04-02 15:14:40,008 - INFO - Time elapsed for raytracing batch: 0.64 seconds
############################### ##### Processing ScanPos035_sort_filtered.laz... ###############################
100% [====================]
2026-04-02 15:14:41,625 - INFO - Do raytracing with stored pulses 2026-04-02 15:14:42,431 - INFO - Time elapsed for raytracing batch: 0.80 seconds
100% [====================]
2026-04-02 15:14:44,056 - INFO - Do raytracing with stored pulses 2026-04-02 15:14:45,016 - INFO - Time elapsed for raytracing batch: 0.96 seconds
100% [====================]
2026-04-02 15:14:46,636 - INFO - Do raytracing with stored pulses 2026-04-02 15:14:47,308 - INFO - Time elapsed for raytracing batch: 0.67 seconds
100% [====================]
2026-04-02 15:14:48,995 - INFO - Do raytracing with stored pulses 2026-04-02 15:14:49,474 - INFO - Time elapsed for raytracing batch: 0.48 seconds
100% [====================]
2026-04-02 15:14:50,838 - INFO - Do raytracing with stored pulses 2026-04-02 15:14:51,160 - INFO - Time elapsed for raytracing batch: 0.32 seconds
############################### ##### Processing ScanPos045_sort_filtered.laz... ###############################
2026-04-02 15:14:52,862 - INFO - Do raytracing with stored pulses 2026-04-02 15:14:53,420 - INFO - Time elapsed for raytracing batch: 0.56 seconds
100% [====================]
2026-04-02 15:14:55,081 - INFO - Do raytracing with stored pulses 2026-04-02 15:14:56,074 - INFO - Time elapsed for raytracing batch: 0.99 seconds
100% [====================]
2026-04-02 15:14:57,765 - INFO - Do raytracing with stored pulses 2026-04-02 15:14:58,711 - INFO - Time elapsed for raytracing batch: 0.95 seconds
100% [====================]
2026-04-02 15:15:00,396 - INFO - Do raytracing with stored pulses 2026-04-02 15:15:01,504 - INFO - Time elapsed for raytracing batch: 1.11 seconds
100% [====================]
2026-04-02 15:15:03,180 - INFO - Do raytracing with stored pulses 2026-04-02 15:15:03,852 - INFO - Time elapsed for raytracing batch: 0.67 seconds
100% [====================]
2026-04-02 15:15:04,825 - INFO - Do raytracing with stored pulses 2026-04-02 15:15:05,095 - INFO - Time elapsed for raytracing batch: 0.27 seconds
############################### ##### Processing ScanPos047_sort_filtered.laz... ###############################
2026-04-02 15:15:06,661 - INFO - Do raytracing with stored pulses 2026-04-02 15:15:07,968 - INFO - Time elapsed for raytracing batch: 1.31 seconds
100% [====================]
2026-04-02 15:15:09,659 - INFO - Do raytracing with stored pulses 2026-04-02 15:15:10,269 - INFO - Time elapsed for raytracing batch: 0.61 seconds
100% [====================]
2026-04-02 15:15:11,968 - INFO - Do raytracing with stored pulses 2026-04-02 15:15:12,502 - INFO - Time elapsed for raytracing batch: 0.53 seconds
100% [====================]
2026-04-02 15:15:14,243 - INFO - Do raytracing with stored pulses 2026-04-02 15:15:15,252 - INFO - Time elapsed for raytracing batch: 1.01 seconds
100% [====================]
2026-04-02 15:15:16,831 - INFO - Do raytracing with stored pulses 2026-04-02 15:15:17,763 - INFO - Time elapsed for raytracing batch: 0.93 seconds
############################### ##### Processing ScanPos061_sort_filtered.laz... ###############################
2026-04-02 15:15:19,524 - INFO - Do raytracing with stored pulses 2026-04-02 15:15:20,276 - INFO - Time elapsed for raytracing batch: 0.75 seconds
100% [====================]
2026-04-02 15:15:22,772 - INFO - Do raytracing with stored pulses 2026-04-02 15:15:24,468 - INFO - Time elapsed for raytracing batch: 1.69 seconds
100% [====================]
2026-04-02 15:15:27,801 - INFO - Do raytracing with stored pulses 2026-04-02 15:15:29,378 - INFO - Time elapsed for raytracing batch: 1.58 seconds
100% [====================]
2026-04-02 15:15:32,130 - INFO - Do raytracing with stored pulses 2026-04-02 15:15:34,351 - INFO - Time elapsed for raytracing batch: 2.22 seconds
100% [====================]
2026-04-02 15:15:37,208 - INFO - Do raytracing with stored pulses 2026-04-02 15:15:38,190 - INFO - Time elapsed for raytracing batch: 0.98 seconds
100% [====================]
2026-04-02 15:15:38,598 - INFO - convert incomplete pulses to complete ones - be cautious with that!
2026-04-02 15:15:40,794 - INFO - Run raytracing for incomplete pulses
2026-04-02 15:15:46,606 - INFO - Time elapsed for raytracing incomplete pulses: 5.77 seconds
2026-04-02 15:15:46,634 - INFO - Time elapsed for reading and raytracing entire data: 28.62 seconds
2026-04-02 15:15:46,636 - INFO - Extracting Nhit
#### 16948337 Pulses were traversed of possible 18056952 Pulses
#### 8672978 Returns have been registered by the algorithm
#### 14374613 Returns were found outside the voxel grid
#### 6203 Returns were missed during the traversal!
#### 1108615 Pulses did not intersect voxel grid!
2026-04-02 15:15:47,378 - INFO - Elapsed Time: 0.74 seconds
2026-04-02 15:15:47,379 - INFO - Extracting Nocc
2026-04-02 15:15:48,210 - INFO - Elapsed Time: 0.83 seconds
2026-04-02 15:15:48,212 - INFO - Extracting Nmiss
2026-04-02 15:15:49,063 - INFO - Elapsed Time: 0.85 seconds
2026-04-02 15:15:49,065 - INFO - Saving Occlusion Outputs As .npy
2026-04-02 15:15:49,173 - INFO - Elapsed Time: 0.11 seconds
2026-04-02 15:15:49,174 - INFO - Classify Grid
2026-04-02 15:15:50,171 - INFO - Elapsed Time: 0.9972751140594482 seconds
Raytracing took 80.78 seconds.
We can now height-normalize the outputs using the provided DTM and DSM files. The DSM is only used to define the upper bound of the canopy. The function 'normalize_occlusion_output' also stores the normalized voxel grids as .npy files in the specified output folder.
from occpy.util import normalize_occlusion_output
Nhit_norm, Nmiss_norm, Nocc_norm, Classification_norm, chm = normalize_occlusion_output(input_folder=os.path.join(config["root_folder"], config['out_dir']),
PlotDim=config['plot_dim'],
vox_dim=config['vox_dim'],
dtm_file=os.path.join(config["root_folder"], config['tif_in']['DTM']),
dsm_file=os.path.join(config["root_folder"], config['tif_in']['DSM']),
lower_threshold=config['lower_threshold'],
output_voxels=config['output_voxels'])
# If, instead of a binary voxel grid classification, you are interested in the fraction of occluded pulses per voxel, you can compute it as follows:
EPS = 1e-6 # small value to avoid division by zero
OcclFrac_norm = Nocc_norm.astype(float) / (Nhit_norm.astype(float) + Nmiss_norm.astype(float) + Nocc_norm.astype(float) + EPS)
Saving normalized output files into directory as .npy...
We can also visualize the results. Note that the amount of occlusion shown here is very high. This is due to the heavy filtering of the input data, which retains only 20% of the original data. The same figure based on the original data can be seen in the paper Kükenbrink et al. (under revision) [preprint available here: https://doi.org/10.31223/X5N16X]
from occpy.visualization import get_Occl_TransectFigure_BinaryOcclusion, get_Occlusion_ProfileFigure
# first define figure properties
fig_prop = dict(fig_size=(3.5, 3.2), # figure size in inches
label_size=8, # label size in pts. for e.g. axis labels
label_size_ticks=6, # label size in pts. for e.g. axis ticks
label_size_tiny=5, # label size in pts. for e.g. ticks in colorbar
out_format='png',) # output format of the figure, can be 'png', 'pdf', 'svg', etc.
%matplotlib inline
get_Occl_TransectFigure_BinaryOcclusion(Nhit_norm, Classification_norm, plot_dim=config['plot_dim'], vox_dim=config['vox_dim'],
out_dir=os.path.join(config["root_folder"], config['out_dir']), axis=0, start_ind=0, end_ind=100, chm=chm, vertBuffer=10, fig_prop=fig_prop, show_plots=True)
# define other figure properties for the smaller figure
fig_prop = dict(fig_size=(1.75, 3.2),
label_size=8,
label_size_ticks=6,
label_size_tiny=5,
out_format='png', )
get_Occlusion_ProfileFigure(Classification_norm, plot_dim=config['plot_dim'], vox_dim=config['vox_dim'], out_dir=os.path.join(config["root_folder"], config['out_dir']), low_thresh=0, vertBuffer=10, max_percentage=100, fig_prop=fig_prop, show_plots=True)
The missing part in the middle of the transect close to the top of the canopy is unobserved volume. It is caused by the small number of scan stations selected for this tutorial, all of which were set up as vertical scans with the Riegl scanner (no tilted scans). This can lead to unobserved volumes in the canopy. Please see Figure 4 in Kükenbrink et al. (under revision) [preprint available here: https://doi.org/10.31223/X5N16X] to see how the occlusion map looks when all scan positions are used.