In toxicology, the dose of a pathogen, radiation, or toxin required to kill half the members of a tested population of animals or cells is called the lethal dose, 50%, or LD_{50}. When exposure is expressed as a concentration, the measure is known as the lethal concentration, LC_{50}, or, more properly when a test duration is specified, the LCt_{50}, which indicates both the lethal concentration and the time of exposure. LD_{50} figures are frequently used as a general indicator of a substance’s acute toxicity. A lower LD_{50} indicates greater toxicity.

More generally, the methods described in this chapter can be used to estimate the dose at which 50% of the studied organisms respond across a range of doses of a substance (e.g., agonist, antagonist, inhibitor, etc.), for any response from a change in behavior or life history characteristics up to and including death. The procedures outlined below assume that there is only one inflection point, i.e., an “s-shaped” curve, either increasing or decreasing; if there is more than one inflection point, then the logistic equations described will not fit the data well and other choices need to be made (see Di Veroli et al 2015). We will use the `drc` package (Ritz et al 2015).

#### Example

After starting R, load the `drc` library.

Consider some hypothetical 24-hour survival data for yeast exposed to salt solutions. Let `resp` equal the variable for frequency of survival (e.g., estimated from OD_{600} readings) and `NaCl` equal the millimolar (mM) salt concentrations or doses.

At the R prompt type

NaCl=seq(0,1000,100)

#Check that the sequence was created correctly; alternatively, just enter the values directly.

NaCl

[1] 0 100 200 300 400 500 600 700 800 900 1000

#Make a plot

plot(NaCl, resp, pch=19, cex=1.2, col="blue", xlab="NaCl [mM]", ylab="Survival frequency")

And here is the plot of the simulated data (Figure 1).

Note the sigmoidal shape; we’ll need a logistic equation to describe the relationship between survival of yeast and NaCl doses.

The equation for the four parameter logistic curve, also called the Hill-Slope model, is (Figure 2)

where *c* is the parameter for the lower limit of the response, *d* is the parameter for the upper limit of the response, *e* is the relative EC_{50}, the dose fitted half-way between the limits *c* and *d*, and *b* is the relative slope around the EC_{50}. The slope, *b*, is also known as the Hill slope. Because this experiment included a dose of zero, a three parameter logistic curve would be appropriate. The equation simplifies to (Figure 3)
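For reference, the two curves can be written out as below. This is a sketch that assumes the log-logistic parameterization used by `LL.4()` and `LL.3()` in `drc`, with *x* denoting the dose and *f(x)* the response; *b*, *c*, *d*, and *e* are the parameters defined above (here *e* is the EC_{50} parameter, not Euler's number).

```latex
% Four parameter log-logistic curve
f(x) = c + \frac{d - c}{1 + \exp\left(b\,[\log(x) - \log(e)]\right)}

% Three parameter curve: lower limit fixed at c = 0
f(x) = \frac{d}{1 + \exp\left(b\,[\log(x) - \log(e)]\right)}
```

At *x* = *e* the exponential term equals 1, so the four parameter curve gives *f(e)* = (*c* + *d*)/2, which is why *e* is described as the dose half-way between the limits.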

#### EC_{50} from 4 parameter model

First, make a data frame

Next, call the `drm` function from the `drc` library and specify the four parameter logistic equation (Figure 2) with `LL.4()`. We follow with a call to the `summary` command to retrieve output from the `drm` function.

summary(model.dose1)

And here is the output from R.

Model fitted: Log-logistic (ED50 as parameter) (4 parms)

Parameter estimates:

Estimate Std. Error t-value p-value

b:(Intercept) 3.753415 1.074050 3.494636 0.0101

c:(Intercept) -0.084487 0.127962 -0.660251 0.5302

d:(Intercept) 1.017592 0.052460 19.397441 0.0000

e:(Intercept) 492.645128 47.679765 10.332373 0.0000

Residual standard error:

0.0845254 (7 degrees of freedom)

The EC_{50} (technically the LD_{50}, because the data were for survival) is *e* = 492.65 mM NaCl, where *e*, again, is the dose fitted half-way between the limits *c* and *d*.
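As a quick check on this interpretation of *e*, the fitted curve can be evaluated at the estimated EC_{50}. The sketch below uses base R only. The helper `ll4` is a hypothetical name, it assumes the log-logistic form used by `LL.4()`, and the parameter estimates are copied from the summary output above.

```r
# Four parameter log-logistic curve (assumed form of drc's LL.4()).
# `ll4` is a hypothetical helper, not a drc function.
ll4 <- function(x, b, c, d, e) {
  c + (d - c) / (1 + exp(b * (log(x) - log(e))))
}

# Estimates copied from the summary(model.dose1) output
b_hat <- 3.753415
c_hat <- -0.084487
d_hat <- 1.017592
e_hat <- 492.645128

# Predicted response at the estimated EC50
at_e <- ll4(e_hat, b_hat, c_hat, d_hat, e_hat)

# Midpoint between the fitted lower and upper limits
midpoint <- (c_hat + d_hat) / 2

print(c(at_e = at_e, midpoint = midpoint))
```

The two values agree (about 0.467), which shows that the relative EC_{50} marks the dose half-way between the fitted limits, not necessarily the dose at which survival is exactly 0.5.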

You should always plot the predicted line from your model against the real data and inspect the fit.

At the R prompt type

As long as the plot you made in earlier steps is still available, R will add the line specified in the lines command. Here is the plot with the predicted logistic line displayed (Figure 4).

While there are additional steps we can take to decide whether the logistic curve fit the data well, visual inspection suggests that the curve does fit the data reasonably well.

#### More work to do

Because the EC_{50} is an estimate, we should also obtain confidence intervals. The `drc` library provides a function called `ED` which will accomplish this. We can also ask for the doses corresponding to the 10% and 90% response levels in addition to 50%, along with the confidence intervals for each.

At the R prompt type

And the output is shown below.

Estimated effective doses

(Delta method-based confidence interval(s))

Estimate Std. Error Lower Upper

1:10 274.348 38.291 183.803 364.89

1:50 492.645 47.680 379.900 605.39

1:90 884.642 208.171 392.395 1376.89

Thus, the 95% confidence interval for the EC_{50} calculated from the four parameter logistic curve ran from a lower limit of 379.9 to an upper limit of 605.39 mM NaCl.
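The numbers in this table can be reproduced by hand, which helps show where they come from. The sketch below is base R only and rests on two assumptions inferred from the reported values rather than taken from the `drc` documentation: that the dose at response level *p* follows from the log-logistic form as *e*(p/(100 - p))^(1/*b*), and that the interval is the estimate plus or minus a *t* quantile (residual degrees of freedom) times the standard error. The standard errors are copied from the output above, not recomputed.

```r
# Estimates copied from the summary(model.dose1) output
b_hat <- 3.753415      # slope
e_hat <- 492.645128    # EC50
df_resid <- 7          # residual degrees of freedom

# Dose at response level p (percent), assuming the log-logistic form
p <- c(10, 50, 90)
ed <- e_hat * (p / (100 - p))^(1 / b_hat)

# Standard errors as reported in the ED output above
se <- c(38.291, 47.680, 208.171)

# 95% interval: estimate +/- t quantile * standard error
t_crit <- qt(0.975, df_resid)
ci <- data.frame(level = p,
                 estimate = ed,
                 lower = ed - t_crit * se,
                 upper = ed + t_crit * se)
print(round(ci, 2))
```

The very wide interval at the 90% level reflects its large standard error: that dose sits near the end of the tested range, where the curve is poorly constrained.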

#### EC_{50} from 3 parameter model

Looking at the summary output from the four parameter logistic function we see that the value for *c* was -0.085 and its *p*-value was 0.53, which suggests that the lower limit was not statistically different from zero. We would expect this given that the experiment had included a control of zero mM added salt. Thus, we can explore by how much the EC_{50} estimate changes when the additional parameter *c* is no longer estimated by calculating a three parameter model with `LL.3()`. At the R prompt type

summary(model.dose2)

And here is the output.

Model fitted: Log-logistic (ED50 as parameter) with lower limit at 0 (3 parms)

Parameter estimates:

Estimate Std. Error t-value p-value

b:(Intercept) 4.46194 0.76880 5.80378 4e-04

d:(Intercept) 1.00982 0.04866 20.75272 0e+00

e:(Intercept) 467.87842 25.24633 18.53253 0e+00

Residual standard error:

0.08267671 (8 degrees of freedom)

The EC_{50} is the value of e: 467.88 mM NaCl.

How do the four and three parameter models compare? We can rephrase this as a statistical test of fit: which model fits the data better, a three parameter or a four parameter model?

At the R prompt type

The output is below

1st model

fct: LL.3()

2nd model

fct: LL.4()

ANOVA table

ModelDf RSS Df F value p value

2nd model 8 0.054684

1st model 7 0.050012 1 0.6539 0.4453

Because the p-value is much greater than 5%, we may conclude that the fit of the four parameter model was not significantly better than the fit of the three parameter model. Thus, applying the criterion that a more complicated model is selected only if it demonstrates an improvement over a model with fewer parameters, we conclude that the three parameter model is the preferred model.
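The F value and p-value in the ANOVA table can be checked with the usual extra sum-of-squares comparison of nested models. The sketch below is base R only and uses the residual sums of squares and degrees of freedom copied from the table; the variable names are illustrative.

```r
# Residual sums of squares and degrees of freedom from the ANOVA table
rss3 <- 0.054684; df3 <- 8   # three parameter model, LL.3()
rss4 <- 0.050012; df4 <- 7   # four parameter model, LL.4()

# Extra sum-of-squares F statistic for the one additional parameter
f_stat <- ((rss3 - rss4) / (df3 - df4)) / (rss4 / df4)

# Upper-tail p-value from the F distribution
p_val <- pf(f_stat, df3 - df4, df4, lower.tail = FALSE)

print(round(c(F = f_stat, p = p_val), 4))
```

The drop in residual sum of squares from adding *c* is small relative to the residual variance of the larger model, which is why the test does not favor the four parameter model.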

The plot below (Figure 5) now includes the fit of the four parameter model (red line) and the three parameter model (green line) to the data.

The R command to make this addition to our active plot was

We continue with our analysis of the three parameter model and produce the confidence intervals for the EC_{50} (modify the `ED()` statement above, using `model.dose2` in place of `model.dose1`).

Estimated effective doses

(Delta method-based confidence interval(s))

Estimate Std. Error Lower Upper

1:10 285.937 33.154 209.483 362.39

1:50 467.878 25.246 409.660 526.10

1:90 765.589 63.026 620.251 910.93

Thus, the 95% confidence interval for the EC_{50} calculated from the three parameter logistic curve ran from a lower limit of 409.7 to an upper limit of 526.1 mM NaCl. The difference between upper and lower limits was 116.4 mM NaCl, narrower than the 95% confidence interval from the four parameter model (225.5 mM NaCl). This demonstrates the estimation trade-off: estimating more parameters reduces the confidence in any one parameter estimate.
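The two interval widths can be compared directly from the reported standard errors. This is a small base R sketch that assumes the intervals are *t*-based with the residual degrees of freedom of each fit, an assumption that is consistent with the reported limits; the variable names are illustrative.

```r
# EC50 standard errors and residual df from the two fits above
se4 <- 47.680; df4 <- 7   # four parameter model
se3 <- 25.246; df3 <- 8   # three parameter model

# Width of a t-based 95% interval is 2 * t quantile * standard error
width4 <- 2 * qt(0.975, df4) * se4
width3 <- 2 * qt(0.975, df3) * se3

print(round(c(four_par = width4, three_par = width3), 1))
```

Most of the narrowing comes from the smaller standard error of *e* in the three parameter fit; the extra residual degree of freedom contributes only slightly through the *t* quantile.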

#### Additional notes on EC_{50} calculations

Care must be taken that the model fits the data well. What if we did not have observations throughout the range of the sigmoidal shape? We can explore this by taking a subset of the data. At the R prompt type

Here, all values greater than dose 500 were dropped (see below for a more general approach to subset).

> dd

resp dose

1 1.0 0

2 1.0 100

3 1.0 200

4 0.9 300

5 0.7 400

6 0.3 500

The plot of the subset does not show an obvious sigmoidal shape (Figure 6).

We run the three parameter model again, this time on the subset of the data.

summary(model.dosedd)

The output is

Model fitted: Log-logistic (ED50 as parameter) with lower limit at 0 (3 parms)

Parameter estimates:

Estimate Std. Error t-value p-value

b:(Intercept) 6.989842 0.760112 9.195801 0.0027

d:(Intercept) 0.993391 0.014793 67.153883 0.0000

e:(Intercept) 446.882542 5.905728 75.669344 0.0000

Residual standard error:

0.02574154 (3 degrees of freedom)

Conclusion? The estimate is different, but only slightly: 447 vs. 468 mM NaCl. Thus, within reason, the `drc` package performs well for the calculation of EC_{50}, even when the lower part of the curve is not observed. Not all tools available to the student will do as well.
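The subset analysis can be sketched end to end as follows. This is one plausible way to write it, not necessarily the commands used to produce the output above: the six observations are copied from the `dd` listing, the column names `resp` and `dose` follow that listing, `new_dose` and `fitted_resp` are illustrative names, and the `drc` package must be installed.

```r
library(drc)  # provides drm(), LL.3(), ED()

# Observations for doses up to 500, copied from the `dd` listing above
dd <- data.frame(
  resp = c(1.0, 1.0, 1.0, 0.9, 0.7, 0.3),
  dose = c(0, 100, 200, 300, 400, 500)
)

# Three parameter log-logistic fit (lower limit fixed at 0)
model.dosedd <- drm(resp ~ dose, data = dd, fct = LL.3())
summary(model.dosedd)

# Doses at the 10%, 50%, and 90% levels with delta-method intervals
ED(model.dosedd, c(10, 50, 90), interval = "delta")

# Plot the observations and overlay the fitted curve
new_dose <- seq(0, 500, by = 5)
fitted_resp <- predict(model.dosedd, newdata = data.frame(dose = new_dose))
plot(dd$dose, dd$resp, pch = 19, col = "blue",
     xlab = "NaCl [mM]", ylab = "Survival frequency")
lines(new_dose, fitted_resp, col = "green")
```

Note that the 90% level lies beyond the largest dose in the subset, so that estimate is an extrapolation and should be read with caution.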

For a more general approach to taking the subset, use the `subset` function with a logical condition on the dose column.
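A minimal sketch of that approach is shown below, in base R. The data frame name `full` is hypothetical, and because the responses for doses above 500 are not listed in the text they are entered as `NA` placeholders rather than guessed.

```r
# Doses used in the experiment; responses above dose 500 are not listed
# in the text, so they are left as NA placeholders here.
full <- data.frame(
  resp = c(1.0, 1.0, 1.0, 0.9, 0.7, 0.3, rep(NA, 5)),
  dose = seq(0, 1000, 100)
)

# Keep only the rows with dose at or below 500
dd <- subset(full, dose <= 500)
print(dd)
```

The condition inside `subset` can be any logical expression on the columns, so the same call generalizes to other cut-offs or to combinations of conditions.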

#### References

Di Veroli G. Y., Fornari C., Goldlust I., Mills G., Koh S. B., Bramhall J. L., Richards F. M., Jodrell D. I. (2015) An automated fitting procedure and software for dose-response curves with multiphasic features. *Scientific Reports* 5: 14701. (doi: 10.1038/srep14701)

Ritz C., Baty F., Streibig J. C., Gerhard D. (2015) Dose-Response Analysis Using R. *PLOS ONE* 10(12): e0146021. (doi: 10.1371/journal.pone.0146021)