---
title: "Coverage and topology"
author: "Gilles Colling"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Coverage and topology}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
has_sf <- requireNamespace("sf", quietly = TRUE)
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = has_sf
)
```

The per-feature verbs in
[Streaming spatial operations](streaming-spatial.html) treat each geometry on
its own. A second family of verbs reads a whole feature set at once: the
constructions and topology operations that need every feature in a group
together, not one row at a time. These ride a shared partition tier. A `by`
argument routes the layer into one shard per group, each shard is processed as
an independent coverage, and the shards stream so the resident budget is one
group rather than the whole layer.

This vignette covers building polygons from lines, cleaning a polygon coverage,
decomposing geometry into its parts, set-wise constructions, and linear
referencing. Every block runs on the optional
[sf](https://r-spatial.github.io/sf/) package.

```{r libs, message = FALSE}
library(vectra)
library(sf)
```

## From lines to polygons

`spatial_polygonize()` builds the polygonal faces enclosed by a line network,
the inverse of taking polygon boundaries. A group's lines are unioned and noded,
then the faces of that arrangement are returned, one per row.

```{r polygonize}
grid <- st_sfc(
  st_linestring(rbind(c(0, 0), c(2, 0))),
  st_linestring(rbind(c(0, 1), c(2, 1))),
  st_linestring(rbind(c(0, 2), c(2, 2))),
  st_linestring(rbind(c(0, 0), c(0, 2))),
  st_linestring(rbind(c(1, 0), c(1, 2))),
  st_linestring(rbind(c(2, 0), c(2, 2))))

f <- tempfile(fileext = ".vtr")
write_vtr(data.frame(geometry = st_as_binary(grid, hex = TRUE)), f)

tbl(f) |> spatial_polygonize() |> collect_sf()
```

`spatial_line_merge()` is the line counterpart of a dissolve: it sews segments
that meet end to end into maximal linestrings, one row per chain. Segments
meeting at a junction of degree greater than two stay separate.

```{r line-merge}
seg <- st_sfc(
  st_linestring(rbind(c(0, 0), c(1, 0))),
  st_linestring(rbind(c(1, 0), c(2, 0))),
  st_linestring(rbind(c(2, 0), c(3, 0))))

g <- tempfile(fileext = ".vtr")
write_vtr(data.frame(geometry = st_as_binary(seg, hex = TRUE)), g)

tbl(g) |> spatial_line_merge() |> collect_sf()
```

## Cleaning a coverage

Real coverages arrive with jittered shared borders, sub-pixel slivers, and more
detail than a map needs. Three verbs clean them while keeping adjacent polygons
edge-matched.

`spatial_snap_grid()` rounds coordinates to a regular grid and repairs the
result, one batch at a time. It is the fixed-precision snap-rounding that
`spatial_overlay()` applies internally, exposed as a standalone verb, so a layer
can be pre-noded to a common precision without running a full overlay.

```{r snap-grid}
p <- st_polygon(list(rbind(c(0.04, 0.03), c(1.02, 0.01),
                           c(0.98, 1.03), c(0.01, 0.97), c(0.04, 0.03))))
h <- tempfile(fileext = ".vtr")
write_vtr(data.frame(id = 1L, geometry = st_as_binary(st_sfc(p), hex = TRUE)), h)

tbl(h) |> spatial_snap_grid(0.1) |> collect_sf()
```

`spatial_snap()` instead pulls a streamed layer onto a resident reference layer:
vertices and edges within `tolerance` of `y` are moved onto it, so two layers
that should share a boundary line up exactly.

```{r snap}
ref  <- st_sfc(st_linestring(rbind(c(0, 0), c(10, 0))))
line <- st_linestring(rbind(c(0, 0.2), c(5, 0.1), c(10, 0.2)))

sn <- tempfile(fileext = ".vtr")
write_vtr(data.frame(id = 1L, geometry = st_as_binary(st_sfc(line), hex = TRUE)), sn)

tbl(sn) |> spatial_snap(ref, tolerance = 0.5) |> collect_sf()
```

`spatial_eliminate()` cleans a polygon coverage by absorbing every feature
smaller than `max_area` into a neighbour, the QGIS "Eliminate". Each sliver joins
the neighbour it shares the longest border with, or the largest-area neighbour
with `into = "largest_area"`. An area-rooted union-find collapses chains of
slivers so a connected run flows to its single largest member, whose attributes
survive.

```{r eliminate}
big    <- st_polygon(list(rbind(
  c(0, 0), c(10, 0), c(10, 10), c(0, 10), c(0, 0))))
sliver <- st_polygon(list(rbind(
  c(10, 0), c(10.3, 0), c(10.3, 10), c(10, 10), c(10, 0))))

e <- tempfile(fileext = ".vtr")
write_vtr(data.frame(
  id = c("keep", "sliver"),
  geometry = st_as_binary(st_sfc(big, sliver), hex = TRUE)), e)

tbl(e) |> spatial_eliminate(max_area = 5) |> collect_sf()
```

The thin sliver is gone, absorbed into the square it bordered, and the survivor
keeps the `keep` attributes.

`spatial_simplify()` simplifies a polygon coverage without tearing shared edges.
Boundaries are unioned so a shared border is one line, noded into arcs, each arc
simplified once with its junction endpoints pinned, then re-polygonized. Adjacent
polygons stay edge-matched with no slivers. This is the topology-preserving
simplification a per-feature `spatial_map(~ sf::st_simplify())` cannot give,
because that simplifies each polygon's copy of a shared border independently.

```{r simplify}
p1 <- st_polygon(list(rbind(
  c(0, 0), c(1, 0), c(1, 0.5), c(1, 1), c(0, 1), c(0, 0))))
p2 <- st_polygon(list(rbind(
  c(1, 0), c(2, 0), c(2, 1), c(1, 1), c(1, 0.5), c(1, 0))))

s <- tempfile(fileext = ".vtr")
write_vtr(data.frame(
  id = c("a", "b"),
  geometry = st_as_binary(st_sfc(p1, p2), hex = TRUE)), s)

tbl(s) |> spatial_simplify(tolerance = 0.6) |> collect_sf()
```

## Decomposing geometry

`spatial_explode()` splits every multipart geometry into its single-part
components, a `MULTIPOLYGON` into one row per polygon and so on, copying the
source attributes onto each part. An optional `part` argument names a 1-based
part-index column. It is the streaming "multipart to singleparts" tool and the
inverse of `spatial_dissolve()`.

```{r explode}
mp <- st_multipolygon(list(
  list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0))),
  list(rbind(c(2, 2), c(3, 2), c(3, 3), c(2, 3), c(2, 2)))))

m <- tempfile(fileext = ".vtr")
write_vtr(data.frame(
  id = 1L, geometry = st_as_binary(st_sfc(mp), hex = TRUE)), m)

tbl(m) |> spatial_explode(part = "part_id") |> collect_sf()
```

`spatial_topology()` decomposes a polygon coverage into the arcs of its planar
topology. The unioned boundaries are noded so a shared border becomes one arc,
tagged with the identifiers of the (up to two) polygons on either side: two for
an internal shared edge, one for an outer edge. It is the inverse of
`spatial_polygonize()`.

```{r topology}
q1 <- st_polygon(list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0))))
q2 <- st_polygon(list(rbind(c(1, 0), c(2, 0), c(2, 1), c(1, 1), c(1, 0))))

tp <- tempfile(fileext = ".vtr")
write_vtr(data.frame(
  id = c("a", "b"),
  geometry = st_as_binary(st_sfc(q1, q2), hex = TRUE)), tp)

tbl(tp) |> spatial_topology(id = "id") |> collect()
```

The shared edge between `a` and `b` appears once, with both face columns filled;
each outer edge appears once with the second face `NA`.

`spatial_centerline()` traces the centerline (medial axis) of each polygon from
the Voronoi diagram of its densified boundary: the Voronoi edges that fall inside
the polygon are its skeleton, merged into maximal lines. `density` sets the
boundary sampling and `prune` drops the short spurs the skeleton grows toward
convex corners. It is the usual approximation for a river or road centerline from
a filled shape; non-polygon geometry passes through unchanged.

```{r centerline}
strip <- st_polygon(list(rbind(
  c(0, 0), c(10, 0), c(10, 2), c(0, 2), c(0, 0))))

cl <- tempfile(fileext = ".vtr")
write_vtr(data.frame(geometry = st_as_binary(st_sfc(strip), hex = TRUE)), cl)

tbl(cl) |> spatial_centerline(density = 0.25, prune = 0.5) |> collect_sf()
```

## Set-wise constructions

`spatial_construct()` builds a geometry construction from a whole feature set,
the constructions a per-feature `spatial_map()` cannot express. A `kind`
argument selects it: `"convex_hull"`, `"concave_hull"`, `"envelope"`,
`"oriented_box"`, `"enclosing_circle"`, `"inscribed_circle"`, `"pole"` (the pole
of inaccessibility, the centre of the maximum inscribed circle), `"voronoi"`,
and `"delaunay"`. A `by` argument builds one construction per group; the
enclosing kinds emit one feature per group, the tessellations one polygon per
cell.

```{r construct-hull}
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
nc$band <- nc$SID74 > 5

n <- tempfile(fileext = ".vtr")
write_vtr(data.frame(
  band = nc$band,
  geometry = st_as_binary(st_geometry(nc), hex = TRUE)), n)

tbl(n) |>
  spatial_construct("convex_hull", by = "band", crs = st_crs(nc)) |>
  collect_sf()
```

A tessellation kind returns one polygon per cell, each carrying the group's `by`
values:

```{r construct-voronoi}
xy <- rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0.4, 0.6))
cloud <- st_sfc(lapply(seq_len(nrow(xy)), function(i) st_point(xy[i, ])))

v <- tempfile(fileext = ".vtr")
write_vtr(data.frame(geometry = st_as_binary(cloud, hex = TRUE)), v)

tbl(v) |> spatial_construct("voronoi") |> collect_sf()
```

## Linear referencing

`spatial_locate()` places streamed points along a resident line layer: each
point gets its nearest line's identifier, the measure (distance along that line
to the point's projection), and the perpendicular offset, with an optional `snap`
onto the line. The inverse, a measure back to a point, is
`sf::st_line_interpolate()` through `spatial_map()`.

```{r locate}
roads <- st_sf(road = c("main", "side"), geometry = st_sfc(
  st_linestring(rbind(c(0, 0), c(10, 0))),
  st_linestring(rbind(c(0, 5), c(0, 15)))))

pts <- data.frame(id = 1:2, x = c(3, 1), y = c(1, 9))
l <- tempfile(fileext = ".vtr")
write_vtr(pts, l)

tbl(l) |>
  spatial_locate(roads, coords = c("x", "y"), y_id = "road") |>
  collect()
```

Point 1 is one unit off `main` at measure 3; point 2 is one unit off `side` at
measure 4. With `snap = TRUE` each point's geometry is replaced by its
projection onto the matched line.

```{r cleanup, include = FALSE}
unlink(c(f, g, h, sn, e, s, m, tp, cl, n, v, l))
```
