This document presents the statistical catch-at-age stock assessment model developed as part of the Assessment For All (a4a) initiative of the European Commission Joint Research Centre. The stock assessment model framework is a non-linear catch-at-age model implemented in R and FLR, and using ADMB that can be applied rapidly to a wide range of situations with low parametrization requirements. The model structure is defined by submodels, which are the different parts that require structural assumptions. There are 5 submodels in operation: a model for F-at-age, a model for the initial age structure, a model for recruitment, a (list) of model(s) for abundance indices catchability-at-age, and a list of models for the observation variance of catch-at-age and abundance indices. The submodels form use linear models. This opens the possibility of using the linear modelling tools available in R: see for example the mgcv (http://cran.r-project.org/web/packages/mgcv/index.html) gam formulas, or factorial design formulas using. Detailed model formulas, several diagnostic tools and a large set of models are presented in the document. Additionaly, advanced features like external weighting of the likelihood components and MCMC fits are also described. The target audience for this document are readers with some experience in R and some background on stock assessment. The document explains the approach being developed by a4a for fish stock assessment and scientific advice. It presents a mixture of text and code, where the first explains the concepts behind the methods, while the last shows how these can be run with the software provided.

To follow this tutorial you should have installed the following packages:

You can do so as follows,

```
install.packages(c("copula","triangle", "coda", "XML", "reshape2", "latticeExtra"))
# from FLR
install.packages(c("FLCore", "FLa4a"), repos="http://flr-project.org/R")
```

```
# This chunk loads all necessary packages
library(FLa4a)
library(latticeExtra)
# datasets
data(ple4)
data(ple4.indices)
data(ple4.index)
```

The stock assessment model framework is a non-linear catch-at-age model implemented in R/FLR/ADMB that can be applied rapidly to a wide range of situations with low parametrization requirements.

In the a4a assessment model, the model structure is defined by submodels, which are the different parts of a statistical catch at age model that require structural assumptions.

There are 5 submodels in operation: - a model for F-at-age, - a (list) of model(s) for abundance indices catchability-at-age, - a model for recruitment, - a list of models for the observation variance of catch-at-age and abundance indices, - a model for the initial age structure,

In practice, we fix the variance models and the initial age structure models, but in theory these can be changed.

The submodels form use linear models. This opens the possibility of using the linear modelling tools available in R: see for example the mgcv gam formulas, or factorial design formulas using `lm()`

. In Râ€™s linear modelling language, a constant model is coded as \(\sim 1\), while a slope over age would simply be \(\sim age\). For example, we can write a traditional year/age separable F model like \(\sim factor(age) + factor(year)\).

The â€˜languageâ€™ of linear models has been developing within the statistical community for many years, and constitutes an elegant way of defining models without going through the complexity of mathematical representations. This approach makes it also easier to communicate among scientists:

- 1965 J. A. Nelder, notation for randomized block design
- 1973 Wilkinson and Rodgers, symbolic description for factorial designs
- 1990 Hastie and Tibshirani, introduced notation for smoothers
- 1991 Chambers and Hastie, further developed for use in S

There are two basic types of assessments available in a4a: the management procedure fit and the full assessment fit. The management procedure fit does not compute estimates of covariances and is therefore quicker to execute, while the full assessment fit returns parameter estimates and their covariances at the expense of longer fitting time.

Modelled catches \(C\) are defined in terms of the three quantities, natural mortality \(M\), fishing mortality \(F\) and recruitment \(R\), using a modified form of the well known Baranov catch equation:

\[ C_{ay} = \frac{\bf{F}_{ay}}{\bf{F}_{ay}+M_{ay}}\left(1 - e^{-(\bf{F}_{ay}+M_{ay})}\right) \bf{R}_{y}e^{-\sum (\bf{F}_{ay} + M_{ay})} \]

where \(a\) and \(y\) denote age and year. Modelled survey indices \(I\) are defined in terms of the same three quantities with the addition of survey catchability \(Q\):

\[I_{ays} = \bf{Q}_{ays} \bf{R}_{y}e^{-\sum (\bf{F}_{ay} + M_{ay})}\]

where \(s\) denotes survey or abundance index and allows for multiple surveys to be considered. Observed catches \(C^{(obs)}\) and the observed survey indices \(I^{(obs)}\) are assumed to be log-normally distributed, or equivalently, normally distributed on the log-scale, with age, year and survey specific observation variance:

\[ \log C^{(obs)}_{ay} \sim \text{Normal} \Big( \log C_{ay}, \bf{\sigma}^2_{ay}\Big) \qquad \log I^{(obs)}_{ays} \sim \text{Normal} \Big( \log I_{ays}, \bf{\tau}^2_{ays} \Big) \]

The full log-likelihood for the a4a statistical catch at age model can now be defined as the sum of the log-likelihood of the observed catches (\(\ell_N\) is the log-likelihood of a normal distribution)

\[ \ell_C = \sum_{ay} w^{(c)}_{ay}\ \ell_N \Big( \log C_{ay}, \bf{\sigma}^2_{ay} ;\ \log C^{(obs)}_{ay} \Big) \]

and the log-likelihood of the observed survey indices

\[ \ell_I = \sum_s \sum_{ay} w^{(s)}_{ays}\ \ell_N \Big( \log I_{ays}, \bf{\tau}_{ays}^2 ;\ \log I^{(obs)}_{ays} \Big)\]

giving the total log-likelihood

\[\ell = \ell_C + \ell_I\]

which is defined in terms of the strictly positive quantites, \(M_{ay}\), \(F_{ay}\), \(Q_{ays}\) and \(R_{y}\), and the observation variances \(\sigma_{ay}\) and \(\tau_{ays}\). As such, the log-likelihood is over-parameterised as there are many more parameters than observations. In order to reduce the number of parameters, \(M_{ay}\) is assumed known (as is common), and the remaining parameters are written in terms of a linear combination of covariates \(x_{ayk}\), e.g.

\[\log F_{ay} = \sum_k \beta_k x_{ayk}\]

where \(k\) is the number of parameters to be estimated and is sufficiently small. Using this tecnique the quantities \(\log F\), \(\log Q\), \(\log \sigma\) and \(\log \tau\) %\(\log \text{initial\,age\,structure}\) % this is not present in the above (in bold in the equations above) can be described by a reduced number of parameters. The following section has more discussion on the use of linear models in a4a.

The a4a statistical catch at age model can addiionally allow for a functional relationship to be imposed that links predicted recruitment \(\tilde{R}\) based on spawning stock biomass and modelled recruitment \(R\), included as a fixed variance random effect. Options for the relationship are the hard coded models Ricker, Beverton Holt, smooth hockeystick or geometric mean. This is implemented by including a third component in the log-likelihood

\[\ell_{SR} = \sum_y \ell_N \Big( \log \tilde{R}_y(a, b), \phi_y^2 ;\ \log R_y \Big)\]

giving the total log-likelihood

\[\ell = \ell_C + \ell_I + \ell_{SR}\]

Using the (time varying) Ricker model as an example, predicted recruitment is

\[\tilde{R}_y(a_y,b_y) = a_y S_{y-1} e^{-b_y S_{y-1}}\]

where \(S\) is spawning stock biomass derived from the model parameters \(F\) and \(R\), and the fixed quantites \(M\) and mean weights by year and age. It is assumed that \(R\) is log-normally distributed, or equivalently, normally distributed on the log-scale about the (log) recruitment predicted by the SR model \(\tilde{R}\), with known variance \(\phi^2\), i.e.

\[\log R_y \sim \text{Normal} \Big( \log \tilde{R}_y, \phi_y^2 \Big)\]

which leads to the definition of \(\ell_{SR}\) given above. In all cases \(a\) and \(b\) are strictly positive, and with the quantities \(F\), \(R\), etc. linear models are used to parameterise \(\log a\) and/or \(\log b\), where relevant.

By default, recruitment \(R\) as apposed to the reruitment predicted from a stock recruitment model \(\tilde{R}\), is specified as a linear model with a parameter for each year, i.e.

\[\log R_y = \gamma_y\]

This is to allow modelled recruitment \(R_y\) to be shrunk towards the stock recruitment model. However, if it is considered appropriate that recruitment can be determined exactly by a relationship with covariates, it is possible, to instead define \(\log R\) in terms of a linear model in the same way as \(\log F\), \(\log Q\), \(\log \sigma\) and \(\log \tau\). %But this is pretty much the same as taking a geometric mean, with a model on log a, and making the variance very small.

Here we show a simple example of using the assessment model using plaice in the North Sea. The default settings of the stock assessment model work reasonably well. Itâ€™s an area of research that will improve with time. Note that because the survey index for plaice has missing values we get a warning saying that we assume these values are missing at random.

```
data(ple4)
data(ple4.indices)
fit <- sca(ple4, ple4.indices)
```

To inspect the stock assessment summary constituted of trends of fishing mortality (harvest), spawning stock biomass (SSB), catch and recruits, the user may add the `a4aFit`

object to the original `FLStock`

object using the `+`

method and plot the result (Figure 1).

```
stk <- ple4 + fit
plot(stk)
```