EVERY TREE, COUNTED — 3D SEGMENTATION FROM A CAMERA DRONE

TIM & SHUBHAM
COOLANT TEAM
3D instance segmentation of a pine forest — each tree colored by its predicted instance label

A 651-tree validation study across 16 stands.

Knowing how many trees you have and where each one starts and ends is the foundation of every inventory decision. Stocking density, thinning prescriptions, growth projections — all of it starts with a count. Manual delineation doesn't scale. Automated methods have historically required LiDAR.

We took the same Gaussian Splat reconstruction described in our height validation — a standard camera drone, no LiDAR — and ran it through a transformer-based segmentation pipeline. Every splat receives two labels: a semantic class (ground, trunk, foliage, low vegetation, or unclassified) and an instance ID identifying which tree it belongs to.

651 trees. 8 of 16 stands with a perfect count. F1 of 0.991 on canopy trees. Near-perfect semantic segmentation at 0.9775 mIoU. Almost every error is an undercount — the model doesn't hallucinate trees, it occasionally fails to separate them.


The Setup

Input. The same Gaussian Splat reconstruction from our height study. A DJI Matrice 4E flown at 90 m AGL with 80/80 overlap — the same camera drone and flight plan you'd use for any mapping job.

Pipeline. The splat is preprocessed, passed through a transformer-based architecture, and postprocessed. Two outputs per splat: a semantic class label and an instance ID.
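To make the two per-splat outputs concrete, here's a minimal sketch of what they might look like downstream. The array layout and integer class encoding are assumptions for illustration, not the pipeline's actual format:

```python
import numpy as np

# Toy per-splat outputs for a 10-splat scene.
# Assumed class encoding: 0=ground, 1=trunk, 2=foliage, 3=low vegetation, 4=unclassified
semantic = np.array([0, 0, 1, 2, 2, 1, 2, 3, 0, 2])
# Instance ID per splat; -1 marks splats that belong to no tree
instance = np.array([-1, -1, 7, 7, 7, 12, 12, -1, -1, 12])

# The stand-level tree count is just the number of distinct tree instance IDs
tree_ids = np.unique(instance[instance >= 0])
print(len(tree_ids))  # 2 trees in this toy scene
```

Every downstream metric in this post — counts, F1, mIoU — is computable from arrays of this shape.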

Validation. Hand-collected ground truth from our Forestry and UAS Specialist in the field. Every tree walked, identified, and counted across 16 stands.

Semantic Segmentation

The semantic output assigns every splat to one of five classes: ground, trunk, tree foliage, low vegetation, and unclassified. The binary separation between tree and non-tree is near-perfect: 0.9775 mIoU on out-of-sample stands.

The five-class mIoU of 0.7936 is pulled down by boundary ambiguity between low vegetation and ground — the operationally important classes (trunk, foliage, ground) are tight. What matters visually: trunks are cleanly isolated from foliage and ground. That separation is the prerequisite for diameter measurement.
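For readers who want the metric pinned down: mIoU is the per-class intersection-over-union averaged across classes. A minimal sketch on toy labels (not our encoding or data):

```python
import numpy as np

def per_class_iou(pred, gt, num_classes):
    """IoU per class from flat label arrays: |pred ∩ gt| / |pred ∪ gt|."""
    ious = []
    for c in range(num_classes):
        inter = np.sum((pred == c) & (gt == c))
        union = np.sum((pred == c) | (gt == c))
        ious.append(inter / union if union else float("nan"))
    return ious

gt   = np.array([0, 0, 1, 1, 2, 2, 2, 3])
pred = np.array([0, 0, 1, 2, 2, 2, 2, 0])
ious = per_class_iou(pred, gt, num_classes=4)
miou = float(np.nanmean(ious))  # mean over classes present, ignoring empty ones
```

A confusable class pair (like low vegetation vs. ground) drags the mean down even when the other classes are near 1.0 — which is exactly the gap between the 0.9775 binary score and the 0.7936 five-class score.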

Semantic render — sparse stand
Semantic render — dense stand

Instance Segmentation — The Count

The simplest, most auditable test of instance segmentation: did you get the right number of trees?

Half the stands are exact. 8 of 16 stands returned the correct tree count. Only 6 overcounts across all stands — nearly every error is a missed tree, not a hallucinated one.

Predicted vs. Ground Truth — Per Stand

Bar chart: per-stand predicted vs. ground-truth tree counts (0–80 trees) for SS Plots 1–7 and plots within stands 144742, 144824, 145696, and 145869. Legend: Ground Truth, Predicted (exact), Predicted (off), Out-of-Sample.

The chart above shows every plot. Green bars are exact matches. The single visible outlier is 145869 Plot 4 — a densely stocked site with 71 ground-truth trees where 10 pairs of adjacent trees were counted as one. That failure mode dominates the error budget.

On canopy trees, the F1 is 0.991. On all trees above 5 m including subcanopy, F1 is 0.973. The two out-of-sample stands (85 trees the model never trained on) score an F1 of 0.966 — with one of those stands returning a perfect count.
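The F1 numbers above follow the standard detection formulation: each predicted tree either matches a ground-truth tree (true positive), is a hallucination (false positive), or a ground-truth tree goes unmatched (false negative). A sketch with made-up counts, not the study's actual tallies:

```python
def detection_f1(tp: int, fp: int, fn: int) -> float:
    """F1 from true positives, false positives (hallucinated trees),
    and false negatives (missed trees)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical stand: 100 ground-truth trees, 97 predictions, 95 correct matches
print(detection_f1(tp=95, fp=2, fn=5))  # ≈ 0.964
```

Because the error budget here is dominated by misses rather than hallucinations, precision sits above recall, and F1 lands between them.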

Where It Breaks

Almost every error is an undercount. Across all 16 stands the model predicted 628 trees against 651 ground truth — a net deficit of 23 trees. Only 6 overcounts in the entire dataset.

One failure mode dominates: two adjacent trees predicted as a single instance. Of the 30 missed trees across all stands, 29 involved close-together crowns that the model failed to separate. This is most pronounced in densely stocked stands — 145869 Plot 4 alone accounts for 10 of those cases.

The remaining misses are a mix: small, thin subcanopy stems that the model sees but doesn't isolate; occasional hardwoods in predominantly pine stands; and rare reconstruction artifacts where the underlying splat is too degraded to segment.

Canopy-level errors (trees large enough to be merchantable timber) are rare: 12 across all 651 trees, concentrated in the two densest stands. For the 14 other stands, canopy tree counts are either perfect or off by one.

The dominant error — merged adjacent trees — is what we're actively working to improve. Every other failure mode is either infrequent or confined to trees below merchantable size.

Summary

Trees Validated: 651 (628 predicted)
Perfect Stands: 8 / 16 (exact count match)
Canopy F1: 0.991
All Trees F1 (>5 m): 0.973

What This Means

Combined with the per-tree heights we published last month, each tree now has an identity, a location, and a height — all from a single camera-drone flight. The building blocks of a per-tree inventory.

Count accuracy at the stand level is tighter than the per-tree numbers suggest. The slight undercount bias (the model misses more than it hallucinates) means stems-per-hectare tracks ground truth closely, and the errors that remain are concentrated in subcanopy and dense adjacency — conditions where even field crews disagree.

When per-flight cost drops by two orders of magnitude, you stop inventorying stands once every five years and start tracking individual trees over time. Growth, mortality, storm damage, thinning response — tree by tree, flight by flight.

What's Next

The same 3D reconstruction that produces these instance labels and heights contains the geometry needed for diameter, taper, and crown structure — and we're working toward extracting all of it.

If you manage timberland or run a consulting forestry practice, we'd like to show you what this looks like on your stands.

michael@coolant.earth · Schedule a call