---
title: "Introduction to otTensor"
author:
- name: Koki Tsuyuzaki
  affiliation: Laboratory for Bioinformatics Research,
    RIKEN Center for Biosystems Dynamics Research
  email: k.t.the-answer@hotmail.co.jp
date: "`r Sys.Date()`"
bibliography: bibliography.bib
package: otTensor
output: rmarkdown::html_vignette
vignette: |
  %\VignetteIndexEntry{Introduction to otTensor}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, fig.width = 6, fig.height = 4)
```

# What is Optimal Transport?

Imagine you have two piles of sand with different shapes, and you want to reshape one pile to match the other. **Optimal transport (OT)** finds the cheapest way to move the sand --- that is, the plan that minimizes the total cost of moving mass from one distribution to another.

In the simplest case, the two distributions are one-dimensional histograms. The "transport plan" is a matrix that tells you how much mass to move from each bin of the source to each bin of the target.

```{r ot-intuition, fig.height=5}
# Two simple 1D distributions
source_dist <- c(0.4, 0.1, 0.4, 0.1)
target_dist <- c(0.1, 0.3, 0.1, 0.3, 0.2)

oldpar <- par(mfrow = c(1, 2), mar = c(4, 4, 3, 1))
barplot(source_dist, col = "steelblue", main = "Source distribution",
        names.arg = seq_along(source_dist), ylim = c(0, 0.5),
        xlab = "Bin", ylab = "Mass")
barplot(target_dist, col = "tomato", main = "Target distribution",
        names.arg = seq_along(target_dist), ylim = c(0, 0.5),
        xlab = "Bin", ylab = "Mass")
par(oldpar)
```

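OT finds the transport plan that moves the source onto the target at minimum total cost. Entry $(i, j)$ of the plan is the mass moved from source bin $i$ to target bin $j$, so each row sums to the mass in that source bin and each column sums to the mass in that target bin.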
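
To make the idea of a transport plan concrete, the sketch below builds one feasible plan for the two histograms above using the north-west corner rule. This is a textbook construction in base R, not a function from otTensor, and the helper name `northwest_corner` is hypothetical. It only guarantees that the marginal constraints hold; it makes no claim that the plan is the cheapest one for any particular cost.

```r
source_dist <- c(0.4, 0.1, 0.4, 0.1)
target_dist <- c(0.1, 0.3, 0.1, 0.3, 0.2)

# Hypothetical helper: greedily fill the plan from the top-left corner,
# moving as much mass as possible before advancing to the next bin.
northwest_corner <- function(p, q, tol = 1e-12) {
    plan <- matrix(0, nrow = length(p), ncol = length(q))
    i <- 1
    j <- 1
    while (i <= length(p) && j <= length(q)) {
        moved <- min(p[i], q[j])
        plan[i, j] <- moved
        p[i] <- p[i] - moved
        q[j] <- q[j] - moved
        # Advance whichever bin has been exhausted
        if (p[i] <= tol) i <- i + 1 else j <- j + 1
    }
    plan
}

plan <- northwest_corner(source_dist, target_dist)
round(plan, 2)

# Feasibility: rows reproduce the source, columns reproduce the target
all.equal(rowSums(plan), source_dist)
all.equal(colSums(plan), target_dist)
```

Many different matrices satisfy these two marginal constraints; OT selects among them by cost.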

# What is a Tensor?

A **tensor** is simply a generalization of familiar data structures:

- **Order 1 (vector)**: a list of numbers, e.g., temperature readings over time
- **Order 2 (matrix)**: a table of numbers, e.g., pixels in a grayscale image
- **Order 3 and higher**: a "cube" or higher-dimensional array, e.g., a color image (height × width × RGB channels)

```{r tensor-illustration, fig.height=4}
oldpar <- par(mfrow = c(1, 3), mar = c(2, 2, 3, 1))

# Vector (order 1)
barplot(c(3, 1, 4, 1, 5), col = "steelblue", main = "Order 1: Vector")

# Matrix (order 2)
mat <- matrix(c(1, 2, 3, 4, 2, 3, 4, 5, 3, 4, 5, 6), nrow = 3)
image(mat, col = gray((0:255) / 255), axes = FALSE, main = "Order 2: Matrix")

# Order-3 tensor (only the first slice is shown)
arr <- array(0, dim = c(3, 4, 2))
arr[,,1] <- matrix(c(1, 2, 3, 4, 2, 3, 4, 5, 3, 4, 5, 6), nrow = 3)
arr[,,2] <- matrix(c(6, 5, 4, 3, 5, 4, 3, 2, 4, 3, 2, 1), nrow = 3)
image(arr[,,1], col = gray((0:255) / 255), axes = FALSE,
      main = "Order 3: Tensor\n(slice 1)")

par(oldpar)
```
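
In base R terms, the order of a tensor is simply the number of dimensions of the array that stores it. The short sketch below illustrates this with a hypothetical helper, `tensor_order`, which is not part of otTensor or rTensor.

```r
# Hypothetical helper: order = number of dimensions ("modes") of the data.
# A plain vector has no dim attribute and is treated as order 1.
tensor_order <- function(x) {
    d <- dim(x)
    if (is.null(d)) 1L else length(d)
}

vec <- c(3, 1, 4, 1, 5)
mat <- matrix(seq_len(12), nrow = 3)
arr <- array(seq_len(24), dim = c(3, 4, 2))

orders <- c(vector = tensor_order(vec),
            matrix = tensor_order(mat),
            array  = tensor_order(arr))
print(orders)

# Fixing the last index of an order-3 array leaves an order-2 slice
dim(arr[, , 1])
```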

# The Problem OTT Solves

Standard OT works well for vectors and matrices, but what if your data is a higher-order tensor?

**Optimal Tensor Transport (OTT)** [@ott] extends OT to tensors of any order. Given two tensors $X$ and $Y$ of the same order, OTT finds transport plans --- one or more matrices that describe how to map each dimension of $X$ to the corresponding dimension of $Y$.

# The Key Concept: the `f` Parameter

The `f` parameter is the core idea that makes OTT flexible. It is a vector that assigns each dimension to a **transport plan group**. This controls how dimensions share transport plans.

| Setting | Meaning | Analogy |
|---------|---------|---------|
| `f = c(1, 2)` | Each dimension gets its own transport plan | Co-Optimal Transport |
| `f = c(1, 1)` | Both dimensions share the same plan | Gromov-Wasserstein-like |
| `f = c(1, 1, 2)` | Dims 1 & 2 share a plan; dim 3 has its own | GW collections |
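
As a sketch of the objective behind these settings, following the formulation in [@ott] (the symbols $D$, $E$, $L$, $T_e$, $p_e$ and $q_e$ are notation assumed here, not identifiers from the package): for order-$D$ tensors $X$ and $Y$, `f` acts as a map $f : \{1, \dots, D\} \to \{1, \dots, E\}$, and OTT looks for $E$ transport plans $T_1, \dots, T_E$ that minimize

$$
\sum_{i_1, \dots, i_D} \sum_{j_1, \dots, j_D} L\left(X_{i_1 \dots i_D},\, Y_{j_1 \dots j_D}\right) \prod_{d=1}^{D} \left(T_{f(d)}\right)_{i_d j_d},
$$

subject to each $T_e$ having nonnegative entries, row sums $p_e$ and column sums $q_e$. Dimension $d$ is matched through plan $T_{f(d)}$, so two dimensions with the same value of `f` are matched through the same matrix.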

For example, with an order-3 tensor (e.g., subjects × genes × time):

- `f = c(1, 2, 3)` learns separate transport plans for subjects, genes, and time
- `f = c(1, 1, 2)` forces subjects and genes to share one plan, while time has its own; because a shared plan is a single matrix, the dimensions that share it must have the same size
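
The effect of `f` on the number and shape of the transport plans can be sketched in base R. The helper `plan_shapes` below is hypothetical and not part of otTensor; it assumes that dimensions sharing a plan must have equal sizes, which follows from the plan being a single matrix.

```r
# Hypothetical helper: for each group in `f`, report the shape
# (source size, target size) of the transport plan that group implies.
plan_shapes <- function(dim_x, dim_y, f) {
    stopifnot(length(dim_x) == length(f), length(dim_y) == length(f))
    groups <- sort(unique(f))
    shapes <- lapply(groups, function(g) {
        n <- unique(dim_x[f == g])
        m <- unique(dim_y[f == g])
        if (length(n) != 1 || length(m) != 1) {
            stop("dimensions in group ", g,
                 " must have equal sizes to share a plan")
        }
        c(n, m)
    })
    names(shapes) <- paste0("plan_", groups)
    shapes
}

# Two independent plans: 4 x 6 for rows, 5 x 7 for columns
plan_shapes(dim_x = c(4, 5), dim_y = c(6, 7), f = c(1, 2))

# Dimensions 1 and 2 share a plan (10 x 8); dimension 3 has its own (3 x 5)
plan_shapes(dim_x = c(10, 10, 3), dim_y = c(8, 8, 5), f = c(1, 1, 2))
```

The number of plans is `length(unique(f))`, while the order of the tensors is `length(f)`.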

# Quick Start Example

Here we walk through a minimal example step by step.

```{r quickstart, message=FALSE}
library("otTensor")
library("rTensor")
```

## Step 1: Create two tensors

We create two small matrices (order-2 tensors) as source and target.

```{r create-tensors}
# Source: a 4 x 5 matrix
arrX <- matrix(0, nrow = 4, ncol = 5)
for (i in 1:4) {
    for (j in 1:5) {
        arrX[i, j] <- i + j
    }
}

# Target: a 6 x 7 matrix (different size is OK)
arrY <- matrix(0, nrow = 6, ncol = 7)
for (i in 1:6) {
    for (j in 1:7) {
        arrY[i, j] <- i + j
    }
}

# Convert to Tensor objects
X <- as.tensor(arrX)
Y <- as.tensor(arrY)
```

## Step 2: Choose the `f` parameter

Both tensors are of order 2, so `f` has two entries. We set `f = c(1, 2)` so that rows and columns each get their own transport plan.

```{r set-f}
f <- c(1, 2)
```

## Step 3: Run OTT

```{r run-ott}
result <- OTT(X = X, Y = Y, f = f,
              num.sample = 500, num.iter = 100)
```
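
To make concrete what is being optimized in this two-plan setting, the sketch below evaluates the objective by brute force for a given pair of plans. It assumes the package follows the objective described in [@ott] with the absolute-error loss listed as the default in the parameter table; the function `two_plan_cost` is a hypothetical illustration in base R and is unrelated to how `OTT()` is implemented, which relies on sampling rather than full enumeration.

```r
# Hypothetical helper: total cost of matching matrix `src` to matrix `tgt`
# when rows are matched by plan T1 and columns by plan T2 (f = c(1, 2)).
two_plan_cost <- function(src, tgt, T1, T2,
                          loss = function(x, y) abs(x - y)) {
    total <- 0
    for (i in seq_len(nrow(src))) {
        for (j in seq_len(ncol(src))) {
            # weights[k, l] = T1[i, k] * T2[j, l]
            weights <- outer(T1[i, ], T2[j, ])
            total <- total + sum(loss(src[i, j], tgt) * weights)
        }
    }
    total
}

src <- outer(1:4, 1:5, "+")   # same values as arrX
tgt <- outer(1:6, 1:7, "+")   # same values as arrY

# Baseline: independent couplings of uniform marginals
T1_indep <- matrix(1 / (4 * 6), nrow = 4, ncol = 6)
T2_indep <- matrix(1 / (5 * 7), nrow = 5, ncol = 7)
two_plan_cost(src, tgt, T1_indep, T2_indep)

# Sanity check: matching a matrix to itself with identity-like plans costs 0
two_plan_cost(src, src, diag(4) / 4, diag(5) / 5)
```

The independent couplings give a baseline cost; plans that align similar rows and columns score lower.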

## Step 4: Inspect the results

The result contains a list of transport plan matrices `Ts`. Since `f = c(1, 2)`, there are two plans:

- `Ts[[1]]`: maps rows of `X` (size 4) to rows of `Y` (size 6)
- `Ts[[2]]`: maps columns of `X` (size 5) to columns of `Y` (size 7)

```{r inspect-results}
# Transport plan dimensions
cat("Transport plan 1:", dim(result$Ts[[1]]), "\n")
cat("Transport plan 2:", dim(result$Ts[[2]]), "\n")
```

```{r visualize-results, fig.height=5, fig.width=6}
.show_matrix <- function(mat, main = "") {
    mat_rev <- t(apply(mat, 2, rev))
    image(mat_rev, col = gray((0:255) / 255),
          xaxt = "n", yaxt = "n",
          xlab = "", ylab = "", axes = FALSE, main = main)
}

oldpar <- par(mfrow = c(2, 2), mar = c(2, 2, 3, 1))
.show_matrix(arrX, main = "Source (X)")
.show_matrix(arrY, main = "Target (Y)")
.show_matrix(result$Ts[[1]], main = "Transport Plan 1\n(rows)")
.show_matrix(result$Ts[[2]], main = "Transport Plan 2\n(columns)")
par(oldpar)
```

Each transport plan is a matrix where brighter cells indicate more mass being transported between the corresponding indices.
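
A useful numerical check on any transport plan is that its row and column sums match the intended marginals. The sketch below uses a hypothetical helper, `check_marginals`, which is not part of otTensor and is demonstrated on a toy plan. How closely the plans returned by `OTT()` satisfy the marginals is not stated here, so the tolerance is an assumption to adjust.

```r
# Hypothetical helper: compare the row and column sums of a plan with the
# target marginals (uniform by default, matching the documented defaults).
check_marginals <- function(plan,
                            p = rep(1 / nrow(plan), nrow(plan)),
                            q = rep(1 / ncol(plan), ncol(plan)),
                            tol = 1e-6) {
    c(rows = max(abs(rowSums(plan) - p)) < tol,
      cols = max(abs(colSums(plan) - q)) < tol)
}

# Toy 4 x 6 plan: the independent coupling of two uniform marginals
toy_plan <- matrix(1 / (4 * 6), nrow = 4, ncol = 6)
check_marginals(toy_plan)

# With the objects from this vignette one could call, for example:
# check_marginals(result$Ts[[1]], tol = 1e-3)
```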

# Parameter Reference

| Parameter | Description | Default |
|-----------|-------------|---------|
| `X` | Source tensor (`rTensor::Tensor` object) | (required) |
| `Y` | Target tensor (same order as X, sizes may differ) | (required) |
| `f` | Integer vector assigning each dimension to a transport plan group | (required) |
| `ps` | List of source marginal distributions (one per unique value in `f`) | Uniform |
| `qs` | List of target marginal distributions | Uniform |
| `loss` | Loss function for computing costs | Absolute error |
| `num.sample` | Number of Monte Carlo samples for gradient estimation | 1000 |
| `num.iter` | Number of optimization iterations | 200 |
| `epsilon` | Convergence threshold | 1e-10 |

**Tips:**

- A larger `num.sample` gives more accurate gradient estimates at a higher computational cost
- Start with a smaller `num.iter` (e.g., 50) while exploring, then increase it for final results
- Custom loss functions can be passed, e.g., `loss = function(x, y) (x - y)^2` for squared error
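
The sketch below shows how a custom loss and explicit marginals might be prepared for the 4 x 5 source and 6 x 7 target used in the quick start. The parameter names come from the table above; the assumption that the entries of `ps` and `qs` follow the order of the groups in `f` is not confirmed here, so the `OTT()` call is left commented out.

```r
# Two candidate loss functions; both are vectorised over x and y
abs_loss <- function(x, y) abs(x - y)
sq_loss  <- function(x, y) (x - y)^2

x <- c(0, 1, 3)
y <- c(0, 2, 1)
abs_loss(x, y)
sq_loss(x, y)

# Explicit uniform marginals for f = c(1, 2):
# one vector per plan, each summing to 1
ps <- list(rep(1 / 4, 4), rep(1 / 5, 5))   # source rows, source columns
qs <- list(rep(1 / 6, 6), rep(1 / 7, 7))   # target rows, target columns
vapply(ps, sum, numeric(1))
vapply(qs, sum, numeric(1))

# Assumed usage, with X, Y and f as defined in the quick start:
# result <- OTT(X = X, Y = Y, f = f, ps = ps, qs = qs,
#               loss = sq_loss, num.sample = 500, num.iter = 50)
```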

# What's Next?

The next vignette (**otTensor-2: Optimal Tensor Transport**) reproduces the experiments from the original paper [@ott], demonstrating OTT under all six `f` configurations:

- OTT_1 --- standard OT (order-1 tensors)
- OTT_12 --- Co-OT (each dimension independent)
- OTT_11 --- Gromov-Wasserstein-like (shared plan)
- OTT_111 --- Triplets (order-3, single shared plan)
- OTT_123 --- tri-Co-OT (order-3, all independent)
- OTT_112 --- GW collections (mixed sharing)
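
The configuration names above appear to encode `f` directly in their suffix, which is consistent with the table in the section on the `f` parameter. Under that assumption, the sketch below recovers each `f` vector from its name and tabulates the tensor order and the number of transport plans. The helper `f_from_name` is hypothetical and not part of otTensor.

```r
config_names <- c("OTT_1", "OTT_12", "OTT_11",
                  "OTT_111", "OTT_123", "OTT_112")

# Hypothetical helper: "OTT_112" -> c(1L, 1L, 2L)
f_from_name <- function(name) {
    digits <- strsplit(sub("^OTT_", "", name), "")[[1]]
    as.integer(digits)
}

configs <- lapply(config_names, f_from_name)
names(configs) <- config_names

summary_table <- data.frame(
    config  = config_names,
    order   = vapply(configs, length, integer(1)),
    n_plans = vapply(configs, function(f) length(unique(f)), integer(1)),
    row.names = NULL
)
print(summary_table)
```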

# Session Information {.unnumbered}

```{r sessionInfo, echo=FALSE}
sessionInfo()
```

# References