% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/identify_usage_segments.R
\name{identify_usage_segments}
\alias{identify_usage_segments}
\title{Identify Usage Segments based on a metric}
\usage{
identify_usage_segments(
  data,
  metric = NULL,
  metric_str = NULL,
  version = "12w",
  threshold = NULL,
  width = NULL,
  max_window = NULL,
  power_thres = 15,
  return = "data"
)
}
\arguments{
\item{data}{A Person query dataset in the form of a data frame, containing
the metric to be classified. The data frame must include a \code{PersonId}
column and a \code{MetricDate} column.}

\item{metric}{A string representing the name of the metric column to be
classified. This parameter is used when a single column represents the
metric.}

\item{metric_str}{A character vector representing the names of multiple
columns to be aggregated for calculating a target metric, using row sum for
aggregation. This is used when \code{metric} is not provided.}

\item{version}{A string indicating the version of the classification to be
used. Valid options are \code{"12w"} for a 12-week rolling average, \code{"4w"}
for a 4-week rolling average, or \code{NULL} when using custom parameters. Defaults to \code{"12w"}.}

\item{threshold}{Numeric value specifying the minimum value the metric must
reach in a week for that week to be a valid count. A 'greater than or equal
to' logic is used. Only used when \code{version} is \code{NULL}.}

\item{width}{Integer specifying the number of qualifying counts required
within the window for a habit. Only used when \code{version} is \code{NULL}.}

\item{max_window}{Integer specifying the maximum number of date units (weeks,
for weekly data) in the window considered for a habit. Only used when
\code{version} is \code{NULL}.}

\item{power_thres}{Numeric value specifying the minimum weekly average
actions required to be classified as a 'Power User'. Defaults to 15.}

\item{return}{A string indicating what to return from the function. Valid
options are:
\itemize{
\item \code{"data"}: Returns the data frame with usage segments.
\item \code{"plot"}: Returns a plot of the usage segments.
\item \code{"table"}: Returns a summary table with usage segments as columns.
}}
}
\value{
Depending on the \code{return} parameter, a data frame with usage segments,
a plot visualizing the segments over time, or a summary table. If
\code{"data"} is passed to \code{return}, the following additional columns
are appended:
\itemize{
\item When \code{version} is \code{"12w"} or \code{"4w"}:
\itemize{
\item \code{IsHabit12w}: Indicates whether the user has a habit based on the 12-week
rolling average.
\item \code{IsHabit4w}: Indicates whether the user has a habit based on the 4-week
rolling average.
\item \code{UsageSegments_12w}: The usage segment classification based on the
12-week rolling average.
\item \code{UsageSegments_4w}: The usage segment classification based on the 4-week
rolling average.
}
\item When \code{version} is \code{NULL}:
\itemize{
\item \code{IsHabit}: Indicates whether the user has a habit based on the provided
parameters.
\item \code{UsageSegments}: The usage segment classification based on the provided
parameters.
}
}

If \code{"table"} is passed to \code{return}, a summary table is returned with one row
per \code{MetricDate} and usage segments as columns containing percentages. The
table includes:
\itemize{
\item \code{MetricDate}: The date of the metric
\item Segment columns (in order): \code{Non-user}, \verb{Low User}, \verb{Novice User},
\verb{Habitual User}, \verb{Power User} (only segments present in the data are included)
\item \code{n}: The total number of distinct persons for that date
}

}
\description{
\ifelse{html}{\href{https://lifecycle.r-lib.org/articles/stages.html#experimental}{\figure{lifecycle-experimental.svg}{options: alt='[Experimental]'}}}{\strong{[Experimental]}}

This function classifies users into usage segments based on their usage
volume and consistency. The segments are 'Power Users', 'Habitual Users',
'Novice Users', 'Low Users', and 'Non-users'. Two predefined versions are
available, one based on a rolling 12-week average (\code{version = "12w"}) and
the other on a rolling 4-week average (\code{version = "4w"}); custom
parameters can be supplied with \code{version = NULL}. While the main use case
is Copilot metrics, e.g. 'Total_Copilot_actions', this function can be applied
to other metrics, such as 'Chats_sent'.
}
\details{
There are three ways to use this function for usage segments classification:
\enumerate{
\item \strong{12-week version} (\code{version = "12w"}): Based on a rolling 12-week period
\item \strong{4-week version} (\code{version = "4w"}): Based on a rolling 4-week period
\item \strong{Custom parameters} (\code{version = NULL}): Based on user-defined parameters
}

This function assumes that the input dataset is grouped at the weekly level
by the \code{MetricDate} column.

The definitions of the segments as per the 12-week definition are
as follows:
\itemize{
\item \strong{Power User}: Averaging 15+ weekly actions (customizable via
\code{power_thres}) and any action in at least 9 out of the past 12 weeks
\item \strong{Habitual User}: Any action in at least 9 out of the past 12 weeks
\item \strong{Novice User}: Averaging at least one weekly action over the past 12 weeks
\item \strong{Low User}: Any action in the past 12 weeks
\item \strong{Non-user}: No actions in the past 12 weeks
}

The definitions of the segments as per the 4-week definition are
as follows:
\itemize{
\item \strong{Power User}: Averaging 15+ weekly actions (customizable via
\code{power_thres}) and any action in each of the past 4 weeks
\item \strong{Habitual User}: Any action in each of the past 4 weeks
\item \strong{Novice User}: Averaging at least one weekly action over the past 4 weeks
\item \strong{Low User}: Any action in the past 4 weeks
\item \strong{Non-user}: No actions in the past 4 weeks
}

When using custom parameters (\code{version = NULL}), you must provide values
for \code{threshold}, \code{width}, and \code{max_window}; \code{power_thres}
is optional. The segment definitions become:
\itemize{
\item \strong{Power User}: Minimum of \code{threshold} actions per week in at least \code{width}
out of the past \code{max_window} weeks, with 15+ average weekly actions (customizable via \code{power_thres})
\item \strong{Habitual User}: Minimum of \code{threshold} actions per week in at least
\code{width} out of the past \code{max_window} weeks
\item \strong{Novice User}: Average of at least one weekly action over the past \code{max_window} weeks
\item \strong{Low User}: Any action in the past \code{max_window} weeks
\item \strong{Non-user}: No actions in the past \code{max_window} weeks
}
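
The segment rules above can be illustrated with a minimal base R sketch. This
is not the package's implementation: \code{classify_latest_week()} is a
hypothetical helper that classifies only the most recent week from one
person's weekly action counts. It assumes that the 12-week definition
corresponds to \code{threshold = 1}, \code{width = 9}, and
\code{max_window = 12}, that 'any action' means a weekly value of at least 1,
and that a history shorter than \code{max_window} weeks is averaged over the
weeks available.

\preformatted{
# Hypothetical helper, not part of the package: classify the latest week
# from one person's weekly action counts (most recent week last).
classify_latest_week <- function(weekly_actions,
                                 threshold = 1,
                                 width = 9,
                                 max_window = 12,
                                 power_thres = 15) {
  # Keep only the most recent max_window weeks (assumption: shorter
  # histories are used as-is)
  recent <- utils::tail(weekly_actions, max_window)
  avg_actions <- mean(recent)

  # Habit: at least `width` weeks that reach `threshold` actions
  is_habit <- sum(recent >= threshold) >= width

  # Rules are checked from the most to the least demanding segment
  if (is_habit && avg_actions >= power_thres) {
    "Power User"
  } else if (is_habit) {
    "Habitual User"
  } else if (avg_actions >= 1) {
    "Novice User"
  } else if (sum(recent) > 0) {
    "Low User"
  } else {
    "Non-user"
  }
}

classify_latest_week(rep(20, 12))                # "Power User"
classify_latest_week(rep(1, 12))                 # "Habitual User"
classify_latest_week(c(rep(0, 6), rep(5, 6)))    # "Novice User"
classify_latest_week(c(rep(0, 11), 3))           # "Low User"
classify_latest_week(rep(0, 12))                 # "Non-user"
}

Checking the rules in this order means each person falls into exactly one
segment, the most demanding one they qualify for.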
}
\examples{
# Example usage with a single metric column
identify_usage_segments(
  data = pq_data,
  metric = "Emails_sent",
  version = "12w",
  return = "plot"
)

# Example usage with multiple metric columns
identify_usage_segments(
  data = pq_data,
  metric_str = c(
    "Copilot_actions_taken_in_Teams",
    "Copilot_actions_taken_in_Outlook",
    "Copilot_actions_taken_in_Excel",
    "Copilot_actions_taken_in_Word",
    "Copilot_actions_taken_in_Powerpoint"
  ),
  version = "4w",
  return = "plot"
)

# Example usage with custom parameters
identify_usage_segments(
  data = pq_data,
  metric = "Emails_sent",
  version = NULL,
  threshold = 2,
  width = 5,
  max_window = 8,
  return = "plot"
)

# Example usage with custom power user threshold
identify_usage_segments(
  data = pq_data,
  metric = "Emails_sent",
  version = "12w",
  power_thres = 20,
  return = "plot"
)

# Return summary table
identify_usage_segments(
  data = pq_data,
  metric = "Emails_sent",
  version = "12w",
  return = "table"
)
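
# Return the data and tabulate the 12-week segments.
# A sketch only: it assumes the `UsageSegments_12w` column described in the
# Value section is present in the returned data frame.
seg_data <- identify_usage_segments(
  data = pq_data,
  metric = "Emails_sent",
  version = "12w",
  return = "data"
)
table(seg_data$UsageSegments_12w)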
}
