## Welcome to SHADE

### Brief overview

SHADE is a Shiny application that supports researchers in the design of experiments. It includes a tool to perform power analysis for common statistical tests, as well as a reporting tool to draft the ethics committee form for animal experimentation (CETEA), all in a user-friendly interface. Although initially created to design animal experiments, SHADE can be used to design other kinds of experiments (e.g. cell culture).

The overall workflow of SHADE is detailed below:

### What's new in SHADE

**Upcoming release**

**SHADE version 4.0**

A future version is being developed to improve the user interface and facilitate navigation.

**October 2023**

**New version of SHADE is online!**

The user interface was improved with an advanced tab for exploring parameters and a language option for the final report. Cage effect visualisation was also improved.

**February 2021**

**Version 2.0 released**

The user interface was improved with validation buttons and tabs for exploring parameters. Bugs were fixed, notably in external data loading and numerical errors in power analysis.

**March 2019**

**SHADE is online**

First release, implementing power analysis for frequently used statistical tests, a power graph and a report for the CETEA form.

### Authors

#### Aim:

SHADE is an application to help scientists refine their experimental design, notably when the design is nested (e.g. animals are housed together in cages). The application proceeds step by step:

- choosing the type of statistical test you want to perform (e.g. paired t-test);
- exploring and setting the parameters of the analysis, such as power or sample size (preliminary data may be used to estimate effect sizes; see the “optional” sub-step);
- generating a report. Note that some advanced options are available on a separate page.

Depending on the amount of information you have in hand, you might want to use the application in different ways:

- Basic approach: no information is available;
- Partially informed approach: some information is available, i.e. the standard deviation for at least one experimental group;
- Refined approach: you have the quantities needed to estimate the effect size, i.e. means and standard deviations for at least 2 experimental groups.

#### How:

**Basic approach**

→ *Level of information: very poor (no preliminary data)*

It is not rare for a scientist to start from scratch! Imagine you have some work constraints (budget, time, space, etc.) and no guess about the strength of the experimental effect you are studying. In this case, we advise you to estimate the strength of the experimental effect you will be able to detect with a fixed number of animals. For instance, to study the effect of a factor of interest (group A vs. group B), suppose that your constraints allow no more than \(N\) mice per experimental group (and \(M\) mice per cage). The best option is then to crudely assess the resolution of your experiment by estimating the effect size, that is, the minimal size of the experimental effect you may be able to show given the number of observations per group.

Input parameters involved:

- alpha - the probability of wrongly concluding that the experimental effect is significant when it is not. Also known as the Type I error rate or false positive rate. It is common to set this parameter at alpha = 0.05.
- power (1 - beta) - the probability of detecting a genuine effect with the statistical test; beta is the probability of missing it (Type II error). If you are not very familiar with this parameter, we advise setting power = 0.8 (i.e. an 80% chance of not missing a genuine experimental effect).
- n - the number of individuals per group. Given your own constraints, you may have a guess about the number of observations you can reasonably handle.
- if your experimental design is nested (observations are nested within experimental units, e.g. cages, aquariums, petri dishes, etc.), you can make it explicit by ticking the corresponding box and gauge the impact of technical effects. Quantifying these technical effects might be necessary to design your experiment correctly; they may affect the sample size when their impact is substantial.

Output parameter:

- the effect size - a standardized estimate of the strength of the experimental factor under study. See the glossary for the formulas used in SHADE.
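To give a feel for the basic approach, here is a minimal Python sketch that turns a fixed group size into a minimal detectable effect size. It assumes a two-sided comparison of two unpaired means and a normal approximation; the function name `minimal_detectable_effect` is hypothetical, and the formulas actually used by SHADE (see the glossary) may differ.

```python
from math import sqrt
from statistics import NormalDist


def minimal_detectable_effect(n_per_group, alpha=0.05, power=0.8):
    """Smallest standardized difference between two group means that a
    two-sided test can detect (normal approximation, unpaired groups)."""
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value of the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return (z_alpha + z_power) * sqrt(2 / n_per_group)


# Illustrative group sizes only: more animals per group means a finer resolution.
for n in (5, 10, 20):
    print(n, round(minimal_detectable_effect(n), 2))
```

The detectable effect shrinks as the number of individuals per group grows, which is the trade-off the basic approach is meant to expose.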

**Partially informed approach**

→ *Level of information: medium (standard deviation in at least one experimental group)*

WIP - Coming soon

**Refined approach**

→ *Level of information: good (means and standard deviations in every experimental group)*

In other contexts, some information may be available. For example, you may be able to guess the strength of the experimental effect you are interested in, (i) based on preliminary data (means and standard deviations) you have in hand or (ii) based on published experiments on similar study systems. The key estimate here is an “effect size” that reflects the amount of signal in your data (the variation among experimental groups) relative to the amount of noise (technical and/or biological variation within experimental groups). In this case, the aim is to (i) estimate the appropriate effect size index and then (ii) calculate a number of individuals from this effect size index, a desired power (resolution, usually ≥ 80%), and a chosen risk of a false positive outcome (it is common to set alpha = 0.05). Note that the effect size is not estimated with the same metric for every analysis, e.g. “comparing two groups” versus “comparing k (k > 2) experimental groups”. See the glossary hereafter for the formulas used in SHADE.
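As an illustration of why the metric depends on the analysis, the sketch below computes two classical effect size indices from summary statistics: Cohen's d for two groups and Cohen's f for k groups. These are standard textbook definitions used here as an assumption; the glossary remains the reference for what SHADE computes. The function names and all numbers are hypothetical.

```python
from math import sqrt
from statistics import pstdev


def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Standardized difference between two group means, using the pooled SD."""
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / sqrt(pooled_var)


def cohens_f(group_means, within_sd):
    """Spread of the k group means relative to the within-group SD
    (assumes equal group sizes)."""
    return pstdev(group_means) / within_sd


# Hypothetical preliminary data: body weight (g) in two, then three groups.
print(cohens_d(25.0, 2.0, 8, 22.0, 2.0, 8))  # signal of 3 g over noise of 2 g
print(cohens_f([20.0, 22.0, 24.0], 2.0))
```

With two equally sized groups, f equals d / 2, so the two indices carry the same information on different scales.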

Input parameters involved:

- alpha - the probability of wrongly concluding that the experimental effect is significant when it is not. It is common to set this parameter at alpha = 0.05.
- power (1 - beta) - the probability of detecting a genuine effect with the statistical test; beta is the probability of missing it (Type II error). If you are not very familiar with this parameter, we advise setting power = 0.8 (i.e. an 80% chance of not missing a genuine experimental effect).
- the effect size - a standardized estimate of the strength of the experimental factor under study. Based on preliminary data, this parameter value may be estimated with SHADE (see the section “Optional: estimate effect size”).
- if your experimental design is nested (observations are nested within experimental units, e.g. cages, aquariums, petri dishes, etc.), you can make it explicit by ticking the corresponding box and gauge the impact of technical effects. Quantifying these technical effects might be necessary to design your experiment correctly; they may affect the sample size when their impact is substantial.

Output parameter:

- n - the number of individuals per group. By considering technical effects explicitly, you can see their impact on the initial estimate (i.e. the estimate obtained if there were no technical effect).
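The refined approach can be sketched in Python as follows. The sketch assumes a two-sided comparison of two unpaired means under a normal approximation, and it models the technical (cage) effect with the classical design effect 1 + (m - 1) × ICC, where ICC is the intra-cage correlation and m the number of individuals per cage. Both the model and the function names are assumptions made for illustration; SHADE may quantify the cage effect differently.

```python
from math import ceil
from statistics import NormalDist


def raw_n_per_group(effect_size, alpha=0.05, power=0.8):
    """Unrounded group size for a two-sided, two-sample comparison."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return 2 * ((z_alpha + z_power) / effect_size) ** 2


def n_per_group(effect_size, alpha=0.05, power=0.8):
    """Group size when observations are independent (no cage effect)."""
    return ceil(raw_n_per_group(effect_size, alpha, power))


def n_per_group_nested(effect_size, icc, cage_size, alpha=0.05, power=0.8):
    """Group size inflated by the design effect of a nested design."""
    design_effect = 1 + (cage_size - 1) * icc  # assumed model of the cage effect
    return ceil(raw_n_per_group(effect_size, alpha, power) * design_effect)


# Hypothetical inputs: effect size 1.0, intra-cage correlation 0.2, 4 mice per cage.
print(n_per_group(1.0))
print(n_per_group_nested(1.0, icc=0.2, cage_size=4))
```

Comparing the two printed values shows how a technical effect inflates the initial estimate. Because of the normal approximation, an exact t-test calculation typically asks for slightly more individuals.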

Should you need help or guidance with an experimental plan, do not hesitate to contact us.

#### Aim:

The mathematical functions used for power analysis are specific to each test statistic. Therefore, after clicking on the **Step1: set test family** tab, you have to choose the type of statistical test you will perform on your data.

#### How:

**(1a)** Six statistical tests are available in SHADE. The default choice is the comparison of two means for unpaired data (e.g. comparing the body weight of treated and control animals).

**(1b)** If you choose the comparison of more than two means (Analysis of Variance), you may specify the number of groups to compare. The default is 3 (e.g. 3 doses or 3 timepoints).

**(2)** Note that a graphical representation of synthetic data appears on the right panel for each test to help you decide. Each data representation matches the test statistic associated with it.

**(3)** Next, you may specify how the statistical significance of your test is computed, i.e. the type of alternative hypothesis of your test.

By default, a power analysis for a two-sided (or two-tailed) test is performed. This option is appropriate if you have no prior knowledge of the direction of the change.

A one-sided (or one-tailed) test is appropriate if the biomarker may depart from the reference (control) value in only one direction, left or right, but not both. For example, animals are fed a high-fat diet and you expect that body weight can only increase compared to the control diet; otherwise the diet has no effect. Note that, for the same alpha and sample size, a one-sided test is more powerful than a two-sided test when the effect lies in the expected direction.
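The gain in power from a one-sided test can be illustrated with a short Python sketch. It assumes a comparison of two unpaired means under a normal approximation and an effect in the expected (positive) direction; the function name `approx_power` and the numbers are hypothetical, and this is not necessarily how SHADE computes power.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size, n_per_group, alpha=0.05, two_sided=True):
    """Approximate power of a two-sample comparison (normal approximation)."""
    z = NormalDist()
    shift = effect_size * sqrt(n_per_group / 2)  # expected value of the test statistic
    if two_sided:
        crit = z.inv_cdf(1 - alpha / 2)  # alpha is split between both tails
        return z.cdf(shift - crit) + z.cdf(-shift - crit)
    crit = z.inv_cdf(1 - alpha)  # all of alpha sits in the expected tail
    return z.cdf(shift - crit)


# Hypothetical design: effect size 1.0 with 10 individuals per group.
print(round(approx_power(1.0, 10, two_sided=True), 3))
print(round(approx_power(1.0, 10, two_sided=False), 3))
```

The one-sided test uses a lower critical value, hence the higher power, but it cannot detect a change in the opposite direction.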

**(4)** Finally, a summary of your choices is displayed. You may validate your choices or restore the default settings (comparison of two means, unpaired, two-sided) before going to the next step, **Step2: run power analysis**.

#### Note:

- buttons contain more information about the analysis options; do not hesitate to consult them.

#### Aim:

The second step helps you define the parameters of the power analysis itself. Three main parameters must be defined in order to estimate the necessary number of individuals per group:

- Type I error
- Power
- Expected experimental effect (magnitude)

Additionally, you can take into account the hierarchical structure of your data by including a cage effect. You will then have to define the magnitude of this effect and the number of individuals in each cage. Three tabs are available to guide you through the process and to help you visualize the link between all parameters. If you are unsure which values to use, start by exploring the parameters in the second tab. Once you are confident in the values, go back to the first tab to confirm your choices.

#### How:

- Validate parameters:

In the first box **(1a)**, you have three sliders corresponding to the three parameters you have to define. Default values are provided, but they are not necessarily suited to your analysis.

When you tick the **Account for cage effect** checkbox, two more sliders appear to define the magnitude of the cage effect and the number of individuals per cage.

Once you have set the values according to your needs, the second box **(1b)** displays a short summary for you to check.
You can confirm your choices and go to the next step, or restore the default values.

- Explore parameters:

This tab helps you visualize the meaning of the different parameters you need to define in the first tab.

A first box **(2a)** allows you to select two variables that will be plotted in the interactive power graph **(2b)**.
The four variables of a power analysis are available: type I error (alpha), power, effect size and number of individuals per group.
With the curve, you can see, for example, the minimum number of individuals required to reach a given level of power.
Two curves are displayed if you chose to account for the cage effect: the estimates are affected by the magnitude of the effect of the hierarchical structure.
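To make the idea of the two curves concrete, the Python sketch below tabulates power against the number of individuals per group, with and without a cage effect. It assumes a two-sided comparison of two unpaired means, a normal approximation, and a cage effect modelled by the design effect 1 + (m - 1) × ICC; all names and numbers are hypothetical, and the curves drawn by SHADE may be computed differently.

```python
from math import sqrt
from statistics import NormalDist


def power_curve(effect_size, sample_sizes, alpha=0.05, icc=0.0, cage_size=1):
    """Return (n, power) pairs for a two-sided, two-sample comparison."""
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)
    design_effect = 1 + (cage_size - 1) * icc  # equals 1 without cage effect
    curve = []
    for n in sample_sizes:
        n_effective = n / design_effect  # clustering reduces the information per animal
        shift = effect_size * sqrt(n_effective / 2)
        curve.append((n, z.cdf(shift - crit) + z.cdf(-shift - crit)))
    return curve


def smallest_n_reaching(curve, target_power):
    """First n on the curve whose power reaches the target, or None."""
    for n, power in curve:
        if power >= target_power:
            return n
    return None


sizes = range(2, 41)
without_cage = power_curve(1.0, sizes)
with_cage = power_curve(1.0, sizes, icc=0.2, cage_size=4)
print(smallest_n_reaching(without_cage, 0.8), smallest_n_reaching(with_cage, 0.8))
```

Reading the smallest n that crosses the target power on each curve mirrors what the interactive graph lets you do visually.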

The second graph **(2d)** is a simulation adapted to the choices made in **Step 1 - set test family**.

This graph reacts to the values you set in the box next to it **(2c)**, which contains the same sliders as the validation step; these values also affect the interactive power graph.

Feel free to move the sliders and see how they modify the simulation and the power graph. You can also re-run the simulation with the **New simulation** button.

- Statistical analysis (to go further):