| Package | Command | What it does | First introduced |
|---|---|---|---|
| base R | $ | Accesses or creates columns within a data frame. | Introduction to R |
| base R | : | Creates integer sequences (e.g., 1:10). | Introduction to R |
| base R | <- | Assigns values to objects for later use. | Introduction to R |
| base R | ?function_name | Accesses built-in help documentation for a function. | Introduction to R |
| base R | [] | Indexes and subsets elements from vectors or data frames. | Introduction to R |
| base R | c() | Combines multiple values into a single vector. | Introduction to R |
| base R | class() | Identifies the data type (class) of an object. | Introduction to R |
| base R | colnames() | Displays or modifies column names of a data frame. | Introduction to R |
| base R | data.frame() | Combines vectors into a tabular data structure. | Introduction to R |
| base R | factor() | Converts character data into categorical (factor) variables. | Introduction to R |
| base R | getwd() / setwd() | Gets or sets the current working directory. | Introduction to R |
| base R | head() / tail() | Displays the first or last rows of a dataset. | Introduction to R |
| base R | install.packages() | Installs packages from CRAN. | Introduction to R |
| base R | length() | Returns the number of elements in a vector. | Introduction to R |
| base R | levels() | Displays the levels associated with a factor. | Introduction to R |
| base R | library() | Loads an installed package into the current R session. | Introduction to R |
| base R | log() / log10() | Computes natural and base-10 logarithms. | Introduction to R |
| base R | nrow() / ncol() | Returns the number of rows or columns in a dataset. | Introduction to R |
| base R | read.csv() | Imports CSV files into R as data frames. | Introduction to R |
| base R | round() | Rounds numeric values to a specified number of digits. | Introduction to R |
| base R | sqrt() | Computes square roots. | Introduction to R |
| base R | str() | Displays the internal structure and data types of a dataset. | Introduction to R |
| base R | summary() | Produces descriptive summaries of variables or model results. | Introduction to R |
| base R | table() | Creates frequency tables for categorical data. | Introduction to R |
| base R | is.na() | Identifies missing (NA) values. | Introduction to tidyverse |
| base R | sessionInfo() | Displays information about the current R session, including loaded packages. | Introduction to tidyverse |
| base R | \|> | The native pipe (R 4.1+); passes the result of one operation into the next. | Introduction to tidyverse |
| dplyr | arrange() | Orders rows based on column values. | Introduction to tidyverse |
| dplyr | case_when() | Applies multiple conditional rules. | Introduction to tidyverse |
| dplyr | count() | Counts observations by group. | Introduction to tidyverse |
| dplyr | distinct() | Returns unique rows or value combinations. | Introduction to tidyverse |
| dplyr | filter() | Keeps rows that meet logical conditions. | Introduction to tidyverse |
| dplyr | group_by() | Groups data for grouped operations. | Introduction to tidyverse |
| dplyr | if_else() | Creates values based on a binary condition. | Introduction to tidyverse |
| dplyr | mutate() | Creates or modifies columns. | Introduction to tidyverse |
| dplyr | n() | Returns the number of rows in the current group, typically inside summarise(). | Introduction to tidyverse |
| dplyr | rename() | Renames columns using new_name = old_name. | Introduction to tidyverse |
| dplyr | select() | Chooses specific columns from a dataset. | Introduction to tidyverse |
| dplyr | summarise() | Computes summary statistics for groups. | Introduction to tidyverse |
| magrittr | %>% | The magrittr pipe; passes the result of one operation into the next. | Introduction to tidyverse |
| nycOpenData | nyc_311() | Downloads NYC 311 Service Request data from NYC Open Data. | Introduction to tidyverse |
| tidyr | drop_na() | Removes rows containing missing values. | Introduction to tidyverse |
| base R | mean() | Calculates the average of numeric values. | Comparing Two Groups |
| base R | merge() | Joins two data frames together based on a shared key variable. | Comparing Two Groups |
| base R | rbind() | Combines multiple data frames by binding rows together. | Comparing Two Groups |
| dplyr | inner_join() | Performs a SQL-style inner join, keeping only rows that match in both datasets. | Comparing Two Groups |
| stats | t.test() | Tests whether two group means differ significantly. | Comparing Two Groups |
| tidyr | pivot_longer() | Converts data from wide format to long format. | Comparing Two Groups |
| tidyr | pivot_wider() | Converts data from long format back to wide format. | Comparing Two Groups |
| base R | plot() | Generic plotting function; used here to visualize post-hoc comparison results (e.g., a TukeyHSD() object). | Comparing Multiple Means |
| base R | set.seed() | Ensures reproducibility when generating random data. | Comparing Multiple Means |
| stats | TukeyHSD() | Performs post-hoc pairwise comparisons. | Comparing Multiple Means |
| stats | aov() | Fits ANOVA models. | Comparing Multiple Means |
| stats | rnorm() | Generates random values from a normal distribution. | Comparing Multiple Means |
| supernova | supernova() | Displays ANOVA results in structured tables. | Comparing Multiple Means |
| base R | xtabs() | Constructs contingency tables using a formula interface. | Analyzing Categorical Data |
| gmodels | CrossTable() | Produces detailed cross-tabulations. | Analyzing Categorical Data |
| janitor | adorn_ns() | Displays counts alongside percentages. | Analyzing Categorical Data |
| janitor | adorn_percentages() | Converts counts to proportions (by row, column, or overall). | Analyzing Categorical Data |
| janitor | tabyl() | Creates clean contingency tables. | Analyzing Categorical Data |
| pheatmap | pheatmap() | Draws a heatmap; used here to visualize residuals or contributions. | Analyzing Categorical Data |
| rcompanion | cramerV() | Measures association strength between categorical variables (Cramér's V). | Analyzing Categorical Data |
| stats | chisq.test() | Performs Chi-Square tests of independence. | Analyzing Categorical Data |
| GGally | ggpairs() | Creates an enhanced scatterplot matrix with correlations. | Correlation Analysis |
| base R | ifelse() | Recodes variables conditionally. | Correlation Analysis |
| base R | pairs() | Creates a scatterplot matrix. | Correlation Analysis |
| janitor | clean_names() | Standardizes the column names of a data frame (snake_case by default). | Correlation Analysis |
| ppcor | pcor.test() | Computes partial correlations. | Correlation Analysis |
| stats | cor() | Computes correlation coefficients (Pearson by default). | Correlation Analysis |
| stats | cor.test() | Computes and tests correlations. | Correlation Analysis |
| broom | glance() | Extracts model-level statistics. | Linear Regression |
| broom | tidy() | Returns model coefficients as a tidy data frame. | Linear Regression |
| lmtest | bptest() | Runs the Breusch-Pagan test for heteroscedasticity. | Linear Regression |
| stats | AIC() | Computes the Akaike information criterion for comparing regression models. | Linear Regression |
| stats | lm() | Fits linear regression models. | Linear Regression |
| stats | step() | Performs stepwise model selection. | Linear Regression |
| base R | exp() | Computes the exponential; used here to convert log-odds to odds ratios. | Logistic Regression |
| caTools | sample.split() | Splits data into training/testing sets. | Logistic Regression |
| car | vif() | Computes variance inflation factors to detect multicollinearity. | Logistic Regression |
| caret | confusionMatrix() | Evaluates classification performance. | Logistic Regression |
| caret | varImp() | Assesses predictor importance. | Logistic Regression |
| pROC | auc() | Computes area under the ROC curve (AUC). | Logistic Regression |
| pROC | roc() | Builds ROC curves. | Logistic Regression |
| pscl | pR2() | Computes pseudo R² values. | Logistic Regression |
| stats | glm() | Fits generalized linear models, including logistic regression (binomial family). | Logistic Regression |
| knitr | kable() | Creates formatted tables for reports. | Reproducible Reporting |
| knitr | knitr::opts_chunk$set() | Sets global chunk options in R Markdown. | Reproducible Reporting |
D Functions Reference
This table consolidates the packages and commands used throughout the book, what each command does, and where it is first introduced.
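
As a quick illustration of how a few of the base R and stats entries fit together, the sketch below chains several of them on a small toy data frame. The object names (`scores`, `mean_a`, `result`) and all data values are hypothetical and made up for this example; they do not come from the book's datasets.

```r
# Hypothetical toy data; the values are invented purely for illustration.
scores <- data.frame(
  group = c("a", "a", "a", "b", "b", "b"),
  value = c(4.1, 5.0, 4.6, 6.2, 5.9, 6.8)
)

# $ accesses a column; factor() turns character data into a categorical variable.
scores$group <- factor(scores$group)
levels(scores$group)

# nrow() counts rows; table() tabulates the categories.
nrow(scores)
table(scores$group)

# [] subsets the values for group "a"; mean() and round() summarize them.
mean_a <- mean(scores$value[scores$group == "a"])
round(mean_a, 2)

# t.test() with a formula compares the two group means (Welch test by default).
result <- t.test(value ~ group, data = scores)
result$p.value

# exp() turns a log-odds value into an odds ratio; a log-odds of 0 gives 1.
exp(0)
```

The same steps could be written with the dplyr verbs listed in the table (for example `group_by()` followed by `summarise()`); base R is used here only so the sketch runs without installing any packages.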