# voice

General tools for voice analysis.
The `voice` package is being developed as a free, easy-to-use toolkit
for audio analysis in R, enabling researchers to extract, tag, and
analyze voice data efficiently. It supports the extraction of audio
features, the enrichment of structured datasets with audio summaries,
and the automatic identification of spoken segments, and it introduces
novel features. It also supports audio analysis based on music theory,
associating frequencies with musical notes arranged in a score via the
`gm` package.
The package has been tested extensively since 2019, including:
If you want to contribute, report bugs, or request new features, use the 'Issues' tab on GitHub.
```{r, eval=FALSE}
# Development version from GitHub
install.packages(c('devtools', 'tidyverse'))
devtools::install_github('filipezabala/voice')

# Or the released version from CRAN
install.packages('voice')
```
If you wish to perform a full installation, proceed to Section 4.
### 0.1 For Windows Users
If you're compiling R packages from source, you may need to install [RTools](https://cran.r-project.org/bin/windows/Rtools/), a collection of Windows-specific build tools for R.
### 0.2 For macOS Users
If you're compiling packages, ensure you have [Xcode Command Line Tools](https://mac.install.guide/commandlinetools/) installed. You also may need [macOS tools](https://cran.r-project.org/bin/macosx/tools/).
```{bash, eval=FALSE}
# Install the Xcode Command Line Tools on macOS
xcode-select --install
```
More details may be found at https://filipezabala.com/voicegnette/.
```{r, message=FALSE, warning=FALSE}
# packs
library(voice)
library(tidyverse)

# example audio files shipped with the wrassp package
wavDir <- list.files(system.file('extdata', package = 'wrassp'),
                     pattern = glob2rx('*.wav'), full.names = TRUE)
```
### 1.2 Extract features
```{r, message=FALSE, warning=FALSE}
# minimal usage
M <- voice::extract_features(wavDir)
glimpse(M)
```
```{r, message=FALSE, warning=FALSE}
# creating Extended synthetic data
E <- dplyr::tibble(subject_id = c(1,1,1,2,2,2,3,3,3), wav_path = wavDir)
E

# tagging E with audio summaries
voice::tag(E)

# same, grouped by subject_id
voice::tag(E, groupBy = 'subject_id')
```
## 3. Visualization
### 3.1 Get audio
```{r, message=FALSE, warning=FALSE}
url0 <- 'https://github.com/filipezabala/voiceAudios/raw/refs/heads/main/wav/doremi.wav'
download.file(url0, paste0(tempdir(), '/doremi.wav'), mode = 'wb')
```
You may use `voice::embed_audio(url0)` if you wish to show a play
button when compiling an .Rmd file. See
https://github.com/mccarthy-m-g/embedr for more details about
`embed_audio()` and related functions.
```{r, message=FALSE, warning=FALSE}
M <- voice::extract_features(tempdir())
summary(M)
```

```{r, message=FALSE, warning=FALSE, fig.width=7.5, fig.height=4}
voice::piano_plot(M, 0)   # f0
voice::piano_plot(M, 0:1) # f0 + f1
```

```{r, message=FALSE, warning=FALSE}
(f0_spn <- voice::assign_notes(M, fmt = 0, min_points = 22, min_percentile = .85)) # f0
(f1_spn <- voice::assign_notes(M, fmt = 1, min_points = 22, min_percentile = .85)) # f1
```

```{r, message=FALSE, warning=FALSE}
library(gm)
line_0 <- gm::Line(as.character(f0_spn))
m0 <- gm::Music() + gm::Meter(4, 4) + line_0
gm::show(m0, to = c('score', 'audio'))
```

```{r, message=FALSE, warning=FALSE}
line_0 <- gm::Line(as.character(f0_spn))
line_1 <- gm::Line(as.character(f1_spn))
m1 <- gm::Music() + gm::Meter(4, 4) + line_0 + line_1
gm::show(m1, to = c('score', 'audio'))
```
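To make the idea of associating frequencies with notes concrete, here is a minimal base-R sketch that maps a frequency in Hz to the nearest note in scientific pitch notation. It assumes twelve-tone equal temperament with A4 = 440 Hz. The helper name `freq_to_spn` is hypothetical, and this is not meant to reproduce how `voice::assign_notes()` works internally.

```r
# Hypothetical helper (not part of voice): nearest equal-tempered note
# for a frequency in Hz, assuming A4 = 440 Hz.
freq_to_spn <- function(freq, a4 = 440) {
  note_names <- c("C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B")
  # MIDI note number: A4 is 69, and one semitone is a factor of 2^(1/12)
  midi <- round(69 + 12 * log2(freq / a4))
  octave <- midi %/% 12 - 1
  spn <- paste0(note_names[midi %% 12 + 1], octave)
  spn[!is.finite(midi)] <- NA_character_  # missing or non-positive input stays missing
  spn
}

freq_to_spn(c(261.63, 440, 880, NA))
# "C4" "A4" "A5" NA
```

Rounding to the nearest semitone is the simplest choice; a real pipeline would typically also summarise many frames before assigning a single note.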
The Python-based functions `diarize` and `extract_features` (the
latter only when computing the `f0_praat` and `fmt_praat` features)
require a configured Python environment.

The following steps fully configure `voice` on Ubuntu 24.04 LTS
(Noble Numbat). Reports of inconsistencies are welcome.
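Before working through the steps, it can help to see which external tools the current R session can already find. The following is a minimal base-R sketch. The function name `tool_status` and the default list of tools are assumptions made for illustration, not part of `voice`.

```r
# Hypothetical helper: report which command-line tools are on the PATH
# seen by the current R session.
tool_status <- function(tools = c("curl", "ffmpeg", "mscore", "conda", "python3")) {
  paths <- Sys.which(tools)  # returns "" for tools that are not found
  data.frame(tool = tools,
             path = unname(paths),
             found = unname(nzchar(paths)),
             stringsAsFactors = FALSE)
}

tool_status()
```

Rows with `found = FALSE` point to the installation steps that still need to be run.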
curl is a command line tool and library for transferring data with URLs.
```{bash, eval=FALSE}
# installing dependencies
sudo apt-get update
sudo apt-get install -y libssl-dev autoconf libtool make
# installing curl
sudo apt install curl
# verify installation
curl --version
```

ffmpeg is a cross-platform solution to record, convert and stream audio and video.
```{bash, eval=FALSE}
# installing ffmpeg
sudo apt-get update
sudo apt-get install ffmpeg

# installing additional system libraries
sudo apt-get update
sudo apt-get install portaudio19-dev libasound2-dev libfontconfig1-dev libmagick++-dev libxml2-dev libharfbuzz-dev libfribidi-dev libgdal-dev cmake cmake-doc ninja-build
```

MuseScore is an open source notation software.
```{bash, eval=FALSE}
sudo add-apt-repository ppa:mscore-ubuntu/mscore-stable
sudo apt-get update
sudo apt-get install musescore
```

R is a free software environment for statistical computing and
graphics. To find out your Ubuntu distribution, run
`lsb_release -a` at the terminal.
```{bash, eval=FALSE}
# 'noble' is the codename for Ubuntu 24.04; replace it with the codename reported by lsb_release -a
sudo sh -c 'echo "deb https://cloud.r-project.org/bin/linux/ubuntu noble-cran40/" >> /etc/apt/sources.list.d/cran.list'
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E084DAB9
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 51716619E084DAB9
gpg -a --export E084DAB9 | sudo apt-key add -
sudo add-apt-repository ppa:c2d4u.team/c2d4u4.0+
sudo apt-get update && sudo apt-get upgrade
sudo apt-get install r-base r-base-dev
```

RStudio is an Integrated Development Environment (IDE) for R. Check the RStudio download page for newer versions.
```{bash, eval=FALSE}
sudo apt-get update
sudo apt-get install gdebi-core
wget https://download1.rstudio.org/electron/jammy/amd64/rstudio-2025.05.0-496-amd64.deb
sudo gdebi rstudio-2025.05.0-496-amd64.deb
```

“Packages are the fundamental units of reproducible R code.” Hadley Wickham and Jennifer Bryan. The installation may take several minutes. At the terminal run:

```{bash, eval=FALSE}
sudo R
```

Running R as superuser, paste the following, line by line:
```{r, eval=FALSE}
# devtools is included because install_github() is called below
packs <- c('audio','devtools','reticulate','R.utils','seewave','tidyverse','tuneR','wrassp')
install.packages(packs, dependencies = TRUE)
update.packages(ask = FALSE)
devtools::install_github('egenn/music')
devtools::install_github('flujoo/gm')
```

To configure the `gm` package, open the `.Renviron` file:

```{r, eval=FALSE}
usethis::edit_r_environ()
```

Add the line `MUSESCORE_PATH=/usr/bin/mscore` to the
`/root/.Renviron` file. If the file opens in vi, type `:wq` to save and exit.
Then restart the R/RStudio session.
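After restarting, a quick way to confirm that the setting was picked up is to read the environment variable back from R. This is a minimal base-R sketch. The helper name `musescore_configured` is hypothetical, and the check only verifies that the variable is set and points to an existing file.

```r
# Hypothetical helper: TRUE when MUSESCORE_PATH is set and the file exists.
musescore_configured <- function(path = Sys.getenv("MUSESCORE_PATH")) {
  nzchar(path) && file.exists(path)
}

musescore_configured()
```

If this returns `FALSE`, check that the line was added to the `.Renviron` file of the user who is actually running R.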
Miniconda is a free minimal installer for conda, an open source package,
dependency and environment management system for any language (Python, R,
Ruby, Lua, Scala, Java, JavaScript, C/C++, FORTRAN and more) that runs
on Windows, macOS and Linux.
Follow the instructions at
https://docs.conda.io/en/latest/miniconda.html.
At the terminal:

```{bash, eval=FALSE}
cd ~/Downloads/
wget -r -np -k https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
cd repo.anaconda.com/miniconda/
bash Miniconda3-latest-Linux-x86_64.sh
```

Answer the installer prompts as follows:

- Do you accept the license terms? [yes|no] `yes`
- Miniconda3 will now be installed into this location: /home/user/miniconda3. Press ENTER.
- You can undo this by running `conda init --reverse $SHELL`? `yes`
- Do you wish the installer to initialize Miniconda3 by running conda init? `yes`

Close and reopen the terminal.
```{bash, eval=FALSE}
conda update -n base -c defaults conda
```

The following packages will be INSTALLED/REMOVED/UPDATED/DOWNGRADED:… Proceed ([y]/n)? `y`

```{bash, eval=FALSE}
conda create -n pyvoice python=3.12
```

The following (NEW) packages will be downloaded/INSTALLED:… Proceed ([y]/n)? `y`

```{bash, eval=FALSE}
conda activate pyvoice
pip install -r https://raw.githubusercontent.com/filipezabala/voice/master/requirements.txt
```

The following steps fully configure `voice` on macOS Sonoma and Tahoe.
Reports of inconsistencies are welcome.
Install Homebrew, 'The Missing Package Manager for macOS (or Linux)',
and remember to run `brew doctor` occasionally. At the terminal
(command + space 'terminal') run:

```{bash, eval=FALSE}
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
```

GNU Wget is a free software package for retrieving files using HTTP, HTTPS, FTP and FTPS, the most widely used Internet protocols. It is a non-interactive command-line tool, so it may easily be called from scripts, cron jobs, terminals without X-Windows support, etc.

```{bash, eval=FALSE}
brew install wget
```

Python is a programming language that lets you integrate systems. According to this post, it is recommended to install Python 3.8 and 3.9 and keep them consistent. The command below installs Python 3.12, the version used for the conda environment in this guide.
```{bash, eval=FALSE}
brew install python@3.12
python3 --version
pip3 --version
```

ffmpeg is a cross-platform solution to record, convert and stream audio and video. The installation may take several minutes.

```{bash, eval=FALSE}
brew install ffmpeg
```

The XQuartz project is an open-source effort to develop a version of the X.Org X Window System that runs on macOS.
To install MacPorts, follow the instructions at https://guide.macports.org/chunked/installing.macports.html.

```{bash, eval=FALSE}
sudo port selfupdate && sudo port upgrade tcllib
sudo port install tcllib
```

MuseScore is an open source notation software.
R is a free software environment for statistical computing and graphics.
RStudio is an Integrated Development Environment (IDE) for R.
To open RStudio, press command + space and type 'rstudio'.

“Packages are the fundamental units of reproducible R code.” Hadley Wickham and Jennifer Bryan. Press command + space, type 'terminal', and run:

```{bash, eval=FALSE}
sudo R
```

Running R as superuser, paste the following, one line at a time.
```{r, eval=FALSE}
# devtools is included because install_github() is called below
packs <- c('audio','devtools','reticulate','R.utils','seewave','tidyverse','tuneR','wrassp')
install.packages(packs, dependencies = TRUE)
update.packages(ask = FALSE)
devtools::install_github('egenn/music')
devtools::install_github('flujoo/gm')
```

Miniconda is a free minimal installer for conda, an open source package, dependency and environment management system for any language (Python, R, Ruby, Lua, Scala, Java, JavaScript, C/C++, FORTRAN and more) that runs on Windows, macOS and Linux.
For Intel (x86_64) Macs use

```{bash, eval=FALSE}
cd ~/Downloads
wget -r -np -k https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
cd repo.anaconda.com/miniconda/
bash Miniconda3-latest-MacOSX-x86_64.sh
```

For Apple silicon (M1 or later) Macs use

```{bash, eval=FALSE}
cd ~/Downloads
wget -r -np -k https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-arm64.sh
cd repo.anaconda.com/miniconda/
bash Miniconda3-latest-MacOSX-arm64.sh
```

Answer the installer prompts as follows:

- In order to continue the installation process, please review the license agreement. Please, press ENTER to continue. Press ENTER.
- You can undo this by running `conda init --reverse $SHELL`? `yes`

Close and reopen the terminal.
```{bash, eval=FALSE}
# use $HOME rather than ~, which is not expanded inside double quotes
export PATH="$HOME/miniconda3/bin:$PATH"
conda update -n base -c defaults conda
```

The following packages will be INSTALLED/REMOVED/UPDATED/DOWNGRADED:… Proceed ([y]/n)? `y`

```{bash, eval=FALSE}
conda create -n pyvoice python=3.12
```

The following (NEW) packages will be downloaded/INSTALLED:… Proceed ([y]/n)? `y`
Close and reopen the terminal.

```{bash, eval=FALSE}
conda activate base
conda activate pyvoice
pip install -r https://raw.githubusercontent.com/filipezabala/voice/master/requirements.txt
```

```{r, eval=FALSE}
# download
url0 <- 'https://github.com/filipezabala/voiceAudios/raw/main/wav/sherlock0.wav'
wavDir <- normalizePath(tempdir())
download.file(url0, paste0(wavDir, '/sherlock0.wav'), mode = 'wb')
```
Diarization can be performed to detect speaker segments (i.e., ‘who spoke when’).
```{r, eval=FALSE}
# diarize (replace 'YOUR_TOKEN' with your own access token)
voice::diarize(fromWav = wavDir, toRttm = wavDir, token = 'YOUR_TOKEN')
```
The `voice::diarize()` function creates Rich Transcription
Time Marked (RTTM)¹ files, space-delimited text files
containing one turn per line, in a format defined by the National
Institute of Standards and Technology (NIST). The RTTM files can be read using
`voice::read_rttm()`.
```{r, eval=FALSE}
# read_rttm
(rttm <- voice::read_rttm(wavDir))
```
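To show what 'one turn per line' looks like in practice, here is a minimal base-R sketch that parses RTTM-style text with `read.table()`. The two sample lines are made up for illustration, and the column names are labels chosen here for the ten RTTM fields. They are assumptions and may differ from what `voice::read_rttm()` returns.

```r
# Illustrative labels for the ten space-delimited RTTM fields (an assumption).
rttm_cols <- c("type", "file", "chnl", "tbeg", "tdur",
               "ortho", "stype", "name", "conf", "slat")

# Made-up example turns: onset (tbeg) and duration (tdur) are in seconds.
rttm_text <- c(
  "SPEAKER sherlock0 1 0.50 2.25 <NA> <NA> SPEAKER_00 <NA> <NA>",
  "SPEAKER sherlock0 1 3.00 1.50 <NA> <NA> SPEAKER_01 <NA> <NA>"
)

turns <- read.table(text = rttm_text, col.names = rttm_cols,
                    na.strings = "<NA>", stringsAsFactors = FALSE)
turns$tend <- turns$tbeg + turns$tdur  # end time of each turn

# total speaking time per speaker label
talk_time <- aggregate(tdur ~ name, data = turns, FUN = sum)
talk_time
```

Each row is one speaker turn, so 'who spoke when' reduces to the `name`, `tbeg` and `tdur` columns.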
Finally, the audio waves can be automatically segmented.
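```{r, eval=FALSE}
# split audio wave
voice::splitw(fromWav = wavDir, fromRttm = wavDir, to = wavDir)
# list the resulting WAV files (the dot is escaped so it matches literally)
dir(wavDir, pattern = '\\.[Ww][Aa][Vv]$')
```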
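The idea behind segmenting a wave by turns can be sketched in base R: turn onsets and durations in seconds are converted to sample indices, which are then used to slice the signal. The synthetic tone, the turn times and the helper name `turn_to_samples` are all assumptions for illustration. This is not the implementation of `voice::splitw()`.

```r
sample_rate <- 8000
# one second of a synthetic 440 Hz tone standing in for a real recording
wave <- sin(2 * pi * 440 * seq_len(sample_rate) / sample_rate)

# made-up turns: onset and duration in seconds
turns <- data.frame(tbeg = c(0, 0.5), tdur = c(0.25, 0.5))

# Hypothetical helper: first and last sample index of each turn (1-based).
turn_to_samples <- function(tbeg, tdur, sample_rate) {
  data.frame(start = floor(tbeg * sample_rate) + 1,
             end   = floor((tbeg + tdur) * sample_rate))
}

idx <- turn_to_samples(turns$tbeg, turns$tdur, sample_rate)
segments <- Map(function(s, e) wave[s:e], idx$start, idx$end)
lengths(segments)
# 2000 4000
```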
¹ See Appendix C at https://www.nist.gov/system/files/documents/itl/iad/mig/KWS15-evalplan-v05.pdf.