

Loss functions are one of the main components of deepdow. Please review Loss in Basics to understand the setup. Most importantly, by a loss function we mean any function that has the following two inputs

  • weights - torch.Tensor of shape (n_samples, n_assets)

  • y - torch.Tensor of shape (n_samples, n_channels, horizon, n_assets)

And a single output

  • loss - torch.Tensor of shape (n_samples,)


Similarly to layers (see Layers), all the deepdow losses assume that the input and output tensors have an extra dimension in the front: the sample dimension. It is used for batching when training the networks. For this reason, losses must be implemented so that each sample is processed independently of the others.

The above definition of a loss function is very general, and in many cases deepdow losses focus on a narrower family of functions in the background. To be more specific, one can select a single channel (returns_channel) from the y tensor representing the desired returns. After this, portfolio returns r over each of the horizon steps can be computed, which results in a tensor of shape (n_samples, horizon). By applying some summarization function S over the horizon dimension we arrive at the final output loss of shape (n_samples,).
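The recipe above can be sketched in a few lines of torch. This is only an illustration, not deepdow's implementation: the function name sketch_loss, the choice of S as the negative mean, and the assumptions that the selected channel holds simple returns and that the portfolio is bought once and then held are all made up for this example.

```python
import torch


def sketch_loss(weights, y, returns_channel=0):
    """Hypothetical loss: negative mean of buy and hold portfolio returns.

    weights: (n_samples, n_assets), y: (n_samples, n_channels, horizon, n_assets).
    Assumes the selected channel contains simple asset returns.
    """
    asset_returns = y[:, returns_channel]  # (n_samples, horizon, n_assets)

    # Growth of one unit of cash invested in each asset at the start
    growth = torch.cumprod(1 + asset_returns, dim=1)  # (n_samples, horizon, n_assets)

    # Portfolio value after each step, preceded by the initial value
    # (the sum of the weights, i.e. 1 for a fully invested portfolio)
    values = (growth * weights[:, None, :]).sum(dim=-1)  # (n_samples, horizon)
    start = weights.sum(dim=-1, keepdim=True)  # (n_samples, 1)
    values = torch.cat([start, values], dim=1)  # (n_samples, horizon + 1)

    # Simple portfolio returns r over each horizon step
    r = values[:, 1:] / values[:, :-1] - 1  # (n_samples, horizon)

    # Summarization S over the horizon dimension
    return -r.mean(dim=1)  # (n_samples,)


weights = torch.tensor([[0.4, 0.6], [1.0, 0.0]])
y = torch.rand(2, 3, 5, 2) * 0.1  # n_samples=2, n_channels=3, horizon=5, n_assets=2
loss = sketch_loss(weights, y)
print(loss.shape)
```

Every operation acts along the horizon or asset dimension only, so each sample's loss depends on that sample alone.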


Before we start discussing the losses themselves, let us first write down a few definitions. Let us assume that before investing we have initial holdings of \(V\) (in cash). Additionally, for each asset \(a\) we denote its price at time \(t\) as \(p^{a}_{t}\). Given some portfolio weights \(\textbf{w}\) over \(N\) assets, we define the portfolio value at time \(t\) as

\[p^{\textbf{w}}_t = \sum_{a=1}^{N} p_t^a \frac{w_a V}{p_0^a}\]

Before we continue, notice that the above definition assumes two things

  • We employ the buy and hold strategy

  • Assets are perfectly divisible (one can buy \(\frac{w_a V}{p_0^a}\) units of any asset)

Let us now define two types of asset returns: simple and logarithmic

\[ \begin{align}\begin{aligned}{}^{\text{S}}r^{a}_{t} = \frac{p^{a}_{t}}{p^{a}_{t-1}} - 1\\ {}^{\text{L}}r^{a}_{t} = \log \frac{p^{a}_{t}}{p^{a}_{t-1}}\end{aligned}\end{align} \]

Additionally, we also consider their portfolio counterparts

\[ \begin{align}\begin{aligned}{}^{\text{S}}r^{\textbf{w}}_{t} = \frac{p^{\textbf{w}}_{t}}{p^{\textbf{w}}_{t-1}} - 1\\ {}^{\text{L}}r^{\textbf{w}}_{t} = \log \frac{p^{\textbf{w}}_{t}}{p^{\textbf{w}}_{t-1}}\end{aligned}\end{align} \]

Note that in both of the cases the initial holding \(V\) cancels out and the portfolio returns are independent of it.
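The cancellation of \(V\) can be checked numerically. The snippet below is a small sketch in plain Python; the helper names and the prices are invented for the illustration and are not part of deepdow.

```python
def portfolio_values(prices, weights, V):
    """Buy and hold portfolio value p^w_t for t = 0, ..., T.

    prices[t][a] is the price of asset a at time t (row 0 is the purchase time).
    """
    units = [w * V / p0 for w, p0 in zip(weights, prices[0])]  # w_a V / p_0^a
    return [sum(u * p for u, p in zip(units, row)) for row in prices]


def simple_returns(values):
    """Simple returns between consecutive portfolio values."""
    return [cur / prev - 1 for prev, cur in zip(values[:-1], values[1:])]


prices = [[20.0, 5.0], [22.0, 4.5], [21.0, 5.4]]  # made-up prices, 2 assets
weights = [0.4, 0.6]

small = simple_returns(portfolio_values(prices, weights, V=100.0))
large = simple_returns(portfolio_values(prices, weights, V=1_000_000.0))
print(small)
print(large)
```

Both lists agree up to floating point error, because \(V\) multiplies the numerator and the denominator of every ratio \(p^{\textbf{w}}_{t} / p^{\textbf{w}}_{t-1}\).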

Portfolio returns

One can extract portfolio returns given asset returns via the function portfolio_returns. It takes as input a matrix of asset returns (the returns type is controlled via input_type)

\[\begin{split}\begin{bmatrix} r^{1}_1 & \dots & r^{N}_1 \\ \vdots & \ddots & \vdots \\ r^{1}_{\text{horizon}} & \dots & r^{N}_{\text{horizon}} \end{bmatrix}\end{split}\]

and outputs a vector of portfolio returns (the type is controlled via output_type)

\[\begin{split}\textbf{r}^{\textbf{w}} = \begin{bmatrix} r^{\textbf{w}}_{1} \\ \vdots \\ r^{\textbf{w}}_{\text{horizon}} \end{bmatrix}\end{split}\]

We rely on the below relation to perform the computations

import torch

from deepdow.losses import portfolio_returns

returns = torch.tensor([[[0.1, 0.2], [0.05, 0.02]]])  # (n_samples=1, horizon=2, n_assets=2)
weights = torch.tensor([[0.4, 0.6]])  # (n_samples=1, n_assets=2)

prets = portfolio_returns(weights, returns, input_type='simple', output_type='simple')

assert prets.shape == (1, 2)  # (n_samples, horizon)
assert torch.allclose(prets, torch.tensor([[0.1600, 0.0314]]), atol=1e-4)
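One way to arrive at the asserted numbers by hand is sketched below in plain Python. It assumes simple returns on both sides and a buy and hold portfolio, so the value held in each asset drifts with its returns instead of being reset to 0.4 / 0.6 after the first step. It mirrors the definitions above and is not a copy of what portfolio_returns does internally.

```python
asset_returns = [[0.1, 0.2], [0.05, 0.02]]  # (horizon=2, n_assets=2)
weights = [0.4, 0.6]

holdings = list(weights)  # value held in each asset, starting from V = 1
portfolio_rets = []
for step in asset_returns:
    value_before = sum(holdings)
    # Each holding grows by its own simple return, no rebalancing
    holdings = [h * (1 + r) for h, r in zip(holdings, step)]
    portfolio_rets.append(sum(holdings) / value_before - 1)

print([round(r, 4) for r in portfolio_rets])  # [0.16, 0.0314]
```

Keeping the weights fixed at 0.4 / 0.6 would instead give 0.4 * 0.05 + 0.6 * 0.02 = 0.032 for the second step, which is outside the tolerance of the assertion above.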

Available losses

To avoid confusion, all the available losses follow the “the lower the better” logic. If the class name suggests otherwise (e.g. MeanReturns), the negative is computed instead. For the exact usage see the deepdow.losses module.


Negative alpha with respect to a predefined portfolio of assets. If benchmark_weights=None, the equally weighted portfolio is used as the benchmark.


Negative simple cumulative return of the buy and hold portfolio at the end of the horizon steps.

\[\frac{p^{\textbf{w}}_{t + \text{horizon}}}{p^{\textbf{w}}_{t}} - 1\]


Loss function independent of y, only taking into account the weights.



The negative of the maximum drawdown.


The negative of mean portfolio returns over the horizon time steps.

\[{\mu}^{\textbf{w}} = \frac{\sum_{i=1}^{\text{horizon}} r^{\textbf{w}}_{i} }{\text{horizon}}\]


\[\sum_{i=1}^{N}\Big(\frac{\sigma}{N} - w_i \big(\frac{\Sigma\textbf{w}}{\sigma}\big)_i\Big) ^ 2\]

where \(\sigma=\sqrt{\textbf{w}^T\Sigma\textbf{w}}\) and \(\Sigma\) is the covariance matrix of asset returns.
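The terms \(w_i \big(\frac{\Sigma\textbf{w}}{\sigma}\big)_i\) sum to \(\sigma\), so the expression is zero exactly when every asset accounts for the same share \(\frac{\sigma}{N}\) of the portfolio volatility. The plain Python sketch below evaluates the formula directly; the function name and the covariance matrix are made up for the example.

```python
import math


def risk_parity_penalty(w, cov):
    """Evaluate sum_i (sigma / N - w_i * (cov @ w)_i / sigma) ** 2."""
    n = len(w)
    cov_w = [sum(cov[i][j] * w[j] for j in range(n)) for i in range(n)]
    sigma = math.sqrt(sum(w[i] * cov_w[i] for i in range(n)))
    return sum((sigma / n - w[i] * cov_w[i] / sigma) ** 2 for i in range(n))


identity = [[1.0, 0.0], [0.0, 1.0]]  # made-up covariance matrix
balanced = risk_parity_penalty([0.5, 0.5], identity)      # equal contributions
concentrated = risk_parity_penalty([1.0, 0.0], identity)  # all risk in one asset
print(balanced, concentrated)
```

With uncorrelated assets of equal variance, the equally weighted portfolio gives a value of zero (up to rounding), while the single asset portfolio gives \((0.5 - 1)^2 + (0.5 - 0)^2 = 0.5\).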

Quantile (Value at Risk)

The negative of the p-quantile of portfolio returns. Note that in the background it is computed via torch.kthvalue.
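A minimal sketch of how a quantile can be read off with torch.kthvalue is shown below. The function name and, in particular, the rule that turns p into the rank k are assumptions made for this illustration; deepdow may map p to k differently.

```python
import math

import torch


def negative_quantile(portfolio_rets, p=0.05):
    """Negative empirical p-quantile over the horizon dimension.

    portfolio_rets has shape (n_samples, horizon). The rank k is an assumed
    convention: the ceil(p * horizon)-th smallest return, clipped to [1, horizon].
    """
    horizon = portfolio_rets.shape[1]
    k = min(horizon, max(1, math.ceil(p * horizon)))
    # torch.kthvalue returns (values, indices) of the k-th smallest element
    values, _ = torch.kthvalue(portfolio_rets, k, dim=1)
    return -values  # (n_samples,)


rets = torch.tensor([[0.03, -0.02, 0.01, -0.05, 0.00]])
print(negative_quantile(rets, p=0.2))  # k = 1, so the smallest return is used
```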


The negative of the Sharpe ratio of portfolio returns.

\[\frac{{\mu}^{\textbf{w}} - r_{\text{rf}}}{{\sigma}^{\textbf{w}} + \epsilon}\]


The negative of the Sortino ratio of portfolio returns.

\[\frac{{\mu}^{\textbf{w}} - r_{\text{rf}}}{\sqrt{\frac{\sum_{i=1}^{\text{horizon}} \max({\mu}^{\textbf{w}} - r^{\textbf{w}}_{i} , 0)^{2}}{\text{horizon}}} + \epsilon}\]
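The two ratios differ only in the denominator, which the plain Python sketch below makes explicit for a single sample. It follows the formulas on this page, with \({\sigma}^{\textbf{w}}\) taken as the standard deviation that divides by horizon; the function names and the default values of rf and eps are assumptions for the example, not deepdow's defaults.

```python
import math


def sharpe_ratio(rets, rf=0.0, eps=1e-4):
    """(mu - rf) / (sigma + eps) for one sequence of portfolio returns."""
    mu = sum(rets) / len(rets)
    sigma = math.sqrt(sum((r - mu) ** 2 for r in rets) / len(rets))
    return (mu - rf) / (sigma + eps)


def sortino_ratio(rets, rf=0.0, eps=1e-4):
    """Same numerator, but only returns below the mean enter the denominator."""
    mu = sum(rets) / len(rets)
    downside = math.sqrt(sum(max(mu - r, 0.0) ** 2 for r in rets) / len(rets))
    return (mu - rf) / (downside + eps)


rets = [0.3, 0.1]  # made-up portfolio returns, horizon = 2
print(sharpe_ratio(rets, eps=0.0))
print(sortino_ratio(rets, eps=0.0))
```

A loss built on either ratio would return its negative. The Sortino denominator can never exceed the Sharpe denominator, because it drops the squared deviations of returns above the mean.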


Loss function independent of y, only taking into account the weights.

\[\sum_{i=1}^{N} w_i^2\]

The lower this loss is, the more diversified our portfolio is. If we focus on the two extremes, for the equally weighted portfolio it is \(\frac{1}{N}\), and for a single asset portfolio it is \(1\).
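The two extremes can be verified directly. The helper below is a plain Python illustration of the formula, not the deepdow class.

```python
def squared_weights(w):
    """Sum of squared portfolio weights."""
    return sum(w_i ** 2 for w_i in w)


n_assets = 4
equally_weighted = [1 / n_assets] * n_assets
single_asset = [1.0] + [0.0] * (n_assets - 1)

print(squared_weights(equally_weighted))  # 0.25, i.e. 1 / N
print(squared_weights(single_asset))      # 1.0
```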


\[{\sigma}^{\textbf{w}} = \sqrt{\frac{\sum_{i=1}^{\text{horizon}} (r^{\textbf{w}}_{i} - {\mu}^{\textbf{w}})^{2}}{\text{horizon}}}\]


The negative of the minimum return.


Arithmetic operations

deepdow supports arithmetic operations between loss instances. In other words, one can obtain new losses by performing unary and binary operations on existing losses.

Let us assume we have a loss instance. The available operations are


  • addition of a constant

  • multiplication by a constant

  • division by a constant

  • exponentiation


  • addition of another loss

  • multiplication by another loss

  • division by another loss


Currently, the __repr__ of a loss that is the result of an arithmetic operation is just a naive string concatenation of the __repr__ of the constituent losses. No symbolic mathematics or expression reduction is used.
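To make the idea concrete, here is a sketch of how such arithmetic can be built with Python operator overloading. The class SketchLoss and the names MaxWeight and SumSquares are hypothetical and exist only in this example. It works on plain lists rather than tensors, and it is not deepdow's Loss class.

```python
class SketchLoss:
    """Hypothetical stand-in for a loss: wraps fn(weights, y) -> list of floats."""

    def __init__(self, fn, name):
        self.fn = fn
        self.name = name

    def __call__(self, weights, y):
        return self.fn(weights, y)

    def _combine(self, other, op, symbol):
        if isinstance(other, SketchLoss):
            # Binary operation between two losses, applied sample by sample
            fn = lambda w, y: [op(a, b) for a, b in zip(self(w, y), other(w, y))]
        else:
            # Operation with a constant
            fn = lambda w, y: [op(a, other) for a in self(w, y)]
        # Naive string concatenation, no simplification of the expression
        return SketchLoss(fn, "{} {} {}".format(self, symbol, other))

    def __add__(self, other):
        return self._combine(other, lambda a, b: a + b, "+")

    def __mul__(self, other):
        return self._combine(other, lambda a, b: a * b, "*")

    def __truediv__(self, other):
        return self._combine(other, lambda a, b: a / b, "/")

    def __pow__(self, power):
        return self._combine(power, lambda a, b: a ** b, "**")

    def __repr__(self):
        return self.name


largest = SketchLoss(lambda w, y: [max(row) for row in w], "MaxWeight")
squared = SketchLoss(lambda w, y: [sum(x ** 2 for x in row) for row in w], "SumSquares")

combined = largest * 2 + squared
weights = [[0.5, 0.5], [1.0, 0.0]]
print(combined)                 # MaxWeight * 2 + SumSquares
print(combined(weights, None))  # [1.5, 3.0]
```

In this sketch the concatenated name carries no parentheses, so (largest + squared) * 2 would print as MaxWeight + SumSquares * 2 even though the sum is evaluated first.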