deepdow.utils module

Collection of utilities and helpers.

class ChangeWorkingDirectory(directory)

Bases: object

Context manager that changes current working directory.

Parameters

directory (str or pathlib.Path or None) – The new working directory. If None, the current working directory is kept.

_previous

The original working directory we want to return to after exiting the context manager.

Type

pathlib.Path
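
The behaviour described above can be sketched with the standard library alone. This is a minimal stand-in written for illustration, not the class's actual source; the name ChangeDirSketch is hypothetical and only the _previous attribute is taken from the description above.

```python
import os
import pathlib
import tempfile


class ChangeDirSketch:
    """Illustrative stand-in (hypothetical name), not the library's code."""

    def __init__(self, directory):
        # None means "stay in the current working directory".
        if directory is None:
            self.directory = pathlib.Path.cwd()
        else:
            self.directory = pathlib.Path(directory)
        self._previous = pathlib.Path.cwd()

    def __enter__(self):
        # Remember where we were, then switch.
        self._previous = pathlib.Path.cwd()
        os.chdir(str(self.directory))
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Always return to the original directory, even after an exception.
        os.chdir(str(self._previous))
        return False


start = pathlib.Path.cwd()
with tempfile.TemporaryDirectory() as tmp:
    expected = pathlib.Path(tmp).resolve()
    with ChangeDirSketch(tmp):
        inside = pathlib.Path.cwd().resolve()
    after = pathlib.Path.cwd()
```

After the inner block exits, after equals start again, while inside points at the temporary directory.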

class PandasChecks

Bases: object

General checks for pandas objects.

static check_indices_agree(*frames)

Check if inputs are pd.Series or pd.DataFrame with the same indices / columns.

Parameters

frames (list) – Elements are either pd.Series or pd.DataFrame.

Raises
  • TypeError – If elements are not pd.Series or pd.DataFrame.

  • IndexError – If indices/columns do not agree.
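
A check of this kind could look roughly as follows. This is a sketch under stated assumptions rather than the method's actual source: the function name indices_agree_sketch is hypothetical, and comparing columns only among the DataFrame inputs is a choice made for this example.

```python
import pandas as pd


def indices_agree_sketch(*frames):
    """Illustrative check (hypothetical name), not the library's code."""
    if not all(isinstance(f, (pd.Series, pd.DataFrame)) for f in frames):
        raise TypeError('All inputs must be pd.Series or pd.DataFrame.')

    # Every input must share the row index of the first one.
    reference_index = frames[0].index
    for i, frame in enumerate(frames):
        if not frame.index.equals(reference_index):
            raise IndexError('Input {} has a different index.'.format(i))

    # Assumption of this sketch: columns are compared among DataFrames only.
    dataframes = [f for f in frames if isinstance(f, pd.DataFrame)]
    for frame in dataframes[1:]:
        if not frame.columns.equals(dataframes[0].columns):
            raise IndexError('Columns do not agree.')


index = pd.date_range('2021-01-04', periods=3, freq='B')
prices = pd.DataFrame({'AAA': [1.0, 2.0, 3.0]}, index=index)
volumes = pd.DataFrame({'AAA': [10.0, 20.0, 30.0]}, index=index)
indices_agree_sketch(prices, volumes)  # same index and columns: no error

# Same shape, but the rows are one week later.
later_index = pd.date_range('2021-01-11', periods=3, freq='B')
shifted = pd.DataFrame({'AAA': [10.0, 20.0, 30.0]}, index=later_index)
try:
    indices_agree_sketch(prices, shifted)
    message = None
except IndexError as error:
    message = str(error)
```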

static check_no_gaps(index)

Check if a time index has no gaps.

Parameters

index (pd.DatetimeIndex) – Time index to be checked for gaps.

Raises
  • TypeError – If the type of index is not supported.

  • IndexError – If there is a gap.
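
One way to detect a gap is to rebuild the index as it would look on a regular grid and compare. The sketch below assumes the frequency is passed explicitly; the freq argument and the name no_gaps_sketch are hypothetical and not part of the documented signature.

```python
import pandas as pd


def no_gaps_sketch(index, freq):
    """Illustrative check; `freq` is a hypothetical extra argument."""
    if not isinstance(index, pd.DatetimeIndex):
        raise TypeError('Unsupported type: {}'.format(type(index)))

    # Rebuild the index as it would look without gaps and compare.
    expected = pd.date_range(index[0], periods=len(index), freq=freq)
    if not expected.equals(index):
        raise IndexError('Index has gaps.')


# Monday 2021-01-04 to Wednesday 2021-01-06: consecutive business days.
complete = pd.date_range('2021-01-04', periods=3, freq='B')
no_gaps_sketch(complete, freq='B')  # no error

# Drop Tuesday to create a gap.
with_gap = complete[[0, 2]]
try:
    no_gaps_sketch(with_gap, freq='B')
    gap_detected = False
except IndexError:
    gap_detected = True
```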

static check_valid_entries(table)

Check if input table has no nan or +-inf entries.

Parameters

table (pd.Series or pd.DataFrame) – Input table.

Raises
  • TypeError – If table is not a pd.Series or pd.DataFrame.

  • ValueError – If at least one entry is nan or +-inf.
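
The validity condition (no nan, no +-inf) coincides with what np.isfinite tests. A minimal sketch, assuming purely numeric data and using the hypothetical name valid_entries_sketch:

```python
import numpy as np
import pandas as pd


def valid_entries_sketch(table):
    """Illustrative check (hypothetical name), not the library's code."""
    if not isinstance(table, (pd.Series, pd.DataFrame)):
        raise TypeError('Unsupported type: {}'.format(type(table)))

    # np.isfinite is False for nan, +inf and -inf.
    values = table.to_numpy(dtype=float)
    if not np.isfinite(values).all():
        raise ValueError('There is an invalid entry.')


clean = pd.DataFrame({'AAA': [0.01, -0.02], 'BBB': [0.00, 0.03]})
valid_entries_sketch(clean)  # no error

dirty = clean.copy()
dirty.iloc[0, 1] = np.inf
try:
    valid_entries_sketch(dirty)
    invalid_detected = False
except ValueError:
    invalid_detected = True
```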

prices_to_returns(prices, use_log=True)

Convert prices to returns.

Parameters
  • prices (pd.DataFrame) – Rows represent different time points and the columns represent different assets. Note that the columns can also be a pd.MultiIndex.

  • use_log (bool) – If True, then logarithmic returns are used (natural logarithm). If False, then simple returns.

Returns

returns – Returns per asset per period. The first period is deleted.

Return type

pd.DataFrame
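
The two return definitions can be sketched as follows: the simple return is p_t / p_{t-1} - 1 and the logarithmic return is ln(p_t / p_{t-1}). The function name prices_to_returns_sketch and the sample prices are made up for illustration; this is not the function's actual source.

```python
import numpy as np
import pandas as pd


def prices_to_returns_sketch(prices, use_log=True):
    """Illustrative conversion (hypothetical name), not the library's code."""
    ratio = prices / prices.shift(1)  # p_t / p_{t-1}; first row is NaN
    returns = np.log(ratio) if use_log else ratio - 1
    # The first period has no predecessor, so it is dropped.
    return returns.iloc[1:]


index = pd.date_range('2021-01-04', periods=3, freq='B')
prices = pd.DataFrame(
    {'AAA': [100.0, 110.0, 121.0], 'BBB': [50.0, 50.0, 25.0]},
    index=index,
)

simple = prices_to_returns_sketch(prices, use_log=False)
log = prices_to_returns_sketch(prices, use_log=True)
```

Here simple holds 0.1 twice for AAA and 0.0 followed by -0.5 for BBB, and both results have one row fewer than prices.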

raw_to_Xy(raw_data, lookback=10, horizon=10, gap=0, freq='B', included_assets=None, included_indicators=None, use_log=True)

Convert raw data to features.

Parameters
  • raw_data (pd.DataFrame) – Rows represent different timestamps stored in the index. Note that there can be gaps. Columns are a pd.MultiIndex with the zero level being assets and the first level being indicators.

  • lookback (int) – Number of timesteps to include in the features.

  • horizon (int) – Number of timesteps to include in the label.

  • gap (int) – Number of time periods after observing the features during which one cannot act.

  • freq (str) – Periodicity of the data.

  • included_assets (None or list) – Assets to be included. If None then all available.

  • included_indicators (None or list) – Indicators to be included. If None then all available.

  • use_log (bool) – If True, then logarithmic returns are used (natural logarithm). If False, then simple returns.

Returns

  • X (np.ndarray) – Feature array of shape (n_samples, n_indicators, lookback, n_assets).

  • timestamps (pd.DatetimeIndex) – Per-sample timestamps, of length n_samples.

  • y (np.ndarray) – Target array of shape (n_samples, n_indicators, horizon, n_assets).

  • asset_names (list) – Names of assets.

  • indicators (list) – List of indicators.
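
The expected layout of raw_data can be illustrated with a small made-up table. The asset and indicator names below are hypothetical, and forward filling the missing day is only one possible way of handling a gap, chosen for this sketch; how the function itself treats gaps is not stated here.

```python
import numpy as np
import pandas as pd

assets = ['AAA', 'BBB']            # hypothetical asset names (level 0)
indicators = ['Close', 'Volume']   # hypothetical indicator names (level 1)
columns = pd.MultiIndex.from_product([assets, indicators])

# Monday to Thursday on a business-day grid, with Wednesday missing.
full_index = pd.date_range('2021-01-04', periods=4, freq='B')
observed_index = full_index[[0, 1, 3]]
values = np.arange(1.0, 13.0).reshape(3, 4)
raw_data = pd.DataFrame(values, index=observed_index, columns=columns)

# Put the rows on a regular grid; the missing day shows up as a NaN row.
regular_index = pd.date_range(raw_data.index[0], raw_data.index[-1], freq='B')
regular = raw_data.reindex(regular_index)

# Assumption of this sketch: fill the gap with the last known values.
filled = regular.ffill()

# Select one indicator across all assets (level 1 of the columns).
close = filled.xs('Close', axis=1, level=1)
```

The resulting close table has one column per asset and one row per business day, which is the shape of input that a per-indicator windowing step would need.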

returns_to_Xy(returns, lookback=10, horizon=10, gap=0)

Create a deep learning dataset (in memory).

Parameters
  • returns (pd.DataFrame) – Returns where columns represent assets and rows timestamps. The last row is the most recent.

  • lookback (int) – Number of timesteps to include in the features.

  • horizon (int) – Number of timesteps to include in the label.

  • gap (int) – Number of time periods after observing the features during which one cannot act.

Returns

  • X (np.ndarray) – Array of shape (N, 1, lookback, n_assets). Generated out of the entire dataset.

  • timestamps (pd.DatetimeIndex) – Index corresponding to the feature matrix X.

  • y (np.ndarray) – Array of shape (N, 1, horizon, n_assets). Generated out of the entire dataset.
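
The interplay of lookback, gap and horizon can be sketched as a rolling-window split. This is an illustration under stated assumptions, not the function's actual source: the name returns_to_xy_sketch is hypothetical, and stamping each sample with its last feature row is a convention chosen for this example.

```python
import numpy as np
import pandas as pd


def returns_to_xy_sketch(returns, lookback, horizon, gap):
    """Illustrative rolling-window split (hypothetical name)."""
    n_timesteps = len(returns)
    values = returns.to_numpy()
    X_list, y_list, stamps = [], [], []

    # i is the first row after the feature window.
    for i in range(lookback, n_timesteps - horizon - gap + 1):
        X_list.append(values[i - lookback:i])
        # The label starts `gap` rows after the features end.
        y_list.append(values[i + gap:i + gap + horizon])
        # Assumption: a sample is stamped with its last feature row.
        stamps.append(returns.index[i - 1])

    # Add the singleton channel axis: (N, 1, window length, n_assets).
    X = np.array(X_list)[:, np.newaxis, :, :]
    y = np.array(y_list)[:, np.newaxis, :, :]
    return X, pd.DatetimeIndex(stamps), y


index = pd.date_range('2021-01-04', periods=10, freq='B')
returns = pd.DataFrame(
    np.arange(30.0).reshape(10, 3), index=index, columns=['AAA', 'BBB', 'CCC']
)
X, timestamps, y = returns_to_xy_sketch(returns, lookback=3, horizon=2, gap=1)
```

With 10 rows, lookback=3, horizon=2 and gap=1, the sketch yields N = 10 - 3 - 2 - 1 + 1 = 5 samples, so X has shape (5, 1, 3, 3) and y has shape (5, 1, 2, 3).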