% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/connect.R
\name{spod_connect}
\alias{spod_connect}
\title{Connect to data converted to \code{DuckDB} or hive-style \code{parquet} files}
\usage{
spod_connect(
  data_path,
  target_table_name = NULL,
  quiet = FALSE,
  max_mem_gb = NULL,
  max_n_cpu = max(1, parallelly::availableCores() - 1),
  temp_path = spod_get_temp_dir()
)
}
\arguments{
\item{data_path}{A path to the \code{DuckDB} database file with '.duckdb' extension, or a path to the folder with \code{parquet} files. Either one should have been created with the \link{spod_convert} function.}

\item{target_table_name}{Default is \code{NULL}. When connecting to a folder of \code{parquet} files, this argument is ignored. When connecting to a \code{DuckDB} database, this is a \code{character} vector of length 1 with the name of the table to open from the database file. If not specified, it is guessed from the \code{data_path} argument and from the table names available in the database. Unless you have manually modified the database, the table name is guessed automatically and you do not need to specify it.}

\item{quiet}{A \code{logical} value indicating whether to suppress messages. Default is \code{FALSE}.}

\item{max_mem_gb}{\code{integer} value of the maximum operating memory to use, in GB. \code{NULL} by default, which delegates the choice to the \code{DuckDB} engine; it usually sets the limit to 80\% of available memory. Caution: on HPC systems, the \code{DuckDB} engine may incorrectly determine the amount of memory available to your job, so it is recommended to set this parameter explicitly according to your job's memory limits.}

\item{max_n_cpu}{The maximum number of threads to use. Defaults to the number of available cores minus 1, but never less than 1.}

\item{temp_path}{The path to the temporary folder used by \code{DuckDB} for \href{https://duckdb.org/2024/07/09/memory-management.html#intermediate-spilling}{intermediate spilling} in case the set memory limit and/or the physical memory of the computer is too low to perform the query. By default, this is set to the \code{temp} directory in the data folder defined by the \code{SPANISH_OD_DATA_DIR} environment variable (set by \code{\link[=spod_set_data_dir]{spod_set_data_dir()}}). Otherwise, for queries on folders of CSV or parquet files, the temporary path would be set to the current R working directory, which is probably undesirable, as the current working directory may be on slow storage or on storage with limited space compared to the data folder.}
}
\value{
A \code{DuckDB} table connection object.
}
\description{
\ifelse{html}{\href{https://lifecycle.r-lib.org/articles/stages.html#stable}{\figure{lifecycle-stable.svg}{options: alt='[Stable]'}}}{\strong{[Stable]}}

This function lets the user quickly connect to data converted to \code{DuckDB} or hive-style \code{parquet} files with the \link{spod_convert} function, simplifying the connection process. Alternatively, the user is free to use the \code{DBI} and \code{DuckDB} packages to connect to the data manually, or to use the \code{arrow} package to connect to the folder of \code{parquet} files.
}
\examples{
\dontshow{if (interactive()) withAutoprint(\{ # examplesIf}
\donttest{
# Set data dir for file downloads
spod_set_data_dir(tempdir())

# download and convert data
dates_1 <- c(start = "2020-02-17", end = "2020-02-18")
db_2 <- spod_convert(
 type = "number_of_trips",
 zones = "distr",
 dates = dates_1,
 overwrite = TRUE
)

# now connect to the converted data
my_od_data_2 <- spod_connect(db_2)

# disconnect from the database
spod_disconnect(my_od_data_2)
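
# A minimal sketch of reconnecting with explicit resource limits, as
# recommended above for HPC jobs. The values below (4 GB of memory and
# 2 threads) are illustrative assumptions, not recommendations; adjust
# them to the limits of your own job or machine.
my_od_data_limited <- spod_connect(
  data_path = db_2,
  max_mem_gb = 4,
  max_n_cpu = 2
)

# disconnect again once done
spod_disconnect(my_od_data_limited)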
}
\dontshow{\}) # examplesIf}
}
