T4 - Optimization of Infection Duration

In this tutorial, we’ll introduce the concept of parameter optimization against an optimization function – in this case, maximizing the mean infection duration in naive infectious challenges by changing the antigenic switching rate parameter.

We’ll start by defining a function to perform multi-individual challenges (similar to the last tutorial)

[ ]:

import numpy as np
import pandas as pd
import xarray as xr
import matplotlib.pyplot as plt

from emodlib.malaria import IntrahostComponent, create_config


def multiple_challenges(config, n_people, duration):

    asexuals = np.zeros((n_people, duration))
    gametocytes = np.zeros((n_people, duration))
    pp = [IntrahostComponent.create(config) for _ in range(n_people)]
    _ = [p.challenge() for p in pp]

    for t in range(duration):
        for i, p in enumerate(pp):
            p.update(dt=1)
            asexuals[i, t] = p.parasite_density
            gametocytes[i, t] = p.gametocyte_density

    da = xr.DataArray(dims=('individual', 'time', 'channel'),
                      coords=(range(n_people), range(duration), ['parasite_density', 'gametocyte_density']))

    da.loc[dict(channel='parasite_density')] = asexuals
    da.loc[dict(channel='gametocyte_density')] = gametocytes

    return da

We’ll also define a helper function to determine the duration of challenge infections based on the time index of the last non-zero parasite density

[2]:

def get_last_nonzero_by_row(A):
    """ https://stackoverflow.com/a/39959511 """
    return np.arange(A.shape[0]), A.shape[1] - 1 - (A[:, ::-1]!=0).argmax(1)

Then we’ll define our objective function:

log-uniform sampling of the antigen switching rate within a defined range
creating a model config with that parameter value
running the multi-individual challenge time-series
returning the mean infection-duration value

[ ]:

def objective(trial):

    n_people = 50
    duration = 500

    antigen_switch_rate = trial.suggest_float("Antigen_Switch_Rate", 5e-10, 5e-8, log=True)
    config = create_config({'infection_params': {'Antigen_Switch_Rate': antigen_switch_rate}})

    da = multiple_challenges(config, duration=duration, n_people=n_people)
    infection_durations = get_last_nonzero_by_row(da.sel(channel='parasite_density').values)[1]

    return infection_durations.mean()

Finally, we’ll create an optuna study and run a number of optimization trials…

[4]:

import optuna

study = optuna.create_study(study_name='maximize_duration', direction='maximize')
study.optimize(objective, n_trials=25)

[I 2023-05-26 18:53:14,086] A new study created in memory with name: maximize_duration
[I 2023-05-26 18:53:14,124] Trial 0 finished with value: 249.32 and parameters: {'Antigen_Switch_Rate': 1.1915470318960563e-09}. Best is trial 0 with value: 249.32.
[I 2023-05-26 18:53:14,171] Trial 1 finished with value: 399.04 and parameters: {'Antigen_Switch_Rate': 2.750348541347495e-09}. Best is trial 1 with value: 399.04.
[I 2023-05-26 18:53:14,218] Trial 2 finished with value: 321.86 and parameters: {'Antigen_Switch_Rate': 1.2742403528792963e-08}. Best is trial 1 with value: 399.04.
[I 2023-05-26 18:53:14,257] Trial 3 finished with value: 291.14 and parameters: {'Antigen_Switch_Rate': 1.4323327599950657e-09}. Best is trial 1 with value: 399.04.
[I 2023-05-26 18:53:14,303] Trial 4 finished with value: 351.2 and parameters: {'Antigen_Switch_Rate': 7.727469068951856e-09}. Best is trial 1 with value: 399.04.
[I 2023-05-26 18:53:14,351] Trial 5 finished with value: 280.38 and parameters: {'Antigen_Switch_Rate': 4.6148422747495935e-08}. Best is trial 1 with value: 399.04.
[I 2023-05-26 18:53:14,383] Trial 6 finished with value: 170.26 and parameters: {'Antigen_Switch_Rate': 5.363872536158119e-10}. Best is trial 1 with value: 399.04.
[I 2023-05-26 18:53:14,429] Trial 7 finished with value: 359.2 and parameters: {'Antigen_Switch_Rate': 6.2266808194259075e-09}. Best is trial 1 with value: 399.04.
[I 2023-05-26 18:53:14,475] Trial 8 finished with value: 342.46 and parameters: {'Antigen_Switch_Rate': 8.715215269434397e-09}. Best is trial 1 with value: 399.04.
[I 2023-05-26 18:53:14,508] Trial 9 finished with value: 184.14 and parameters: {'Antigen_Switch_Rate': 6.224321379915857e-10}. Best is trial 1 with value: 399.04.
[I 2023-05-26 18:53:14,557] Trial 10 finished with value: 400.04 and parameters: {'Antigen_Switch_Rate': 2.7055736726897753e-09}. Best is trial 10 with value: 400.04.
[I 2023-05-26 18:53:14,603] Trial 11 finished with value: 386.52 and parameters: {'Antigen_Switch_Rate': 2.7040378193028815e-09}. Best is trial 10 with value: 400.04.
[I 2023-05-26 18:53:14,650] Trial 12 finished with value: 399.72 and parameters: {'Antigen_Switch_Rate': 3.0708647379292074e-09}. Best is trial 10 with value: 400.04.
[I 2023-05-26 18:53:14,697] Trial 13 finished with value: 393.78 and parameters: {'Antigen_Switch_Rate': 3.2067748660456205e-09}. Best is trial 10 with value: 400.04.
[I 2023-05-26 18:53:14,744] Trial 14 finished with value: 394.92 and parameters: {'Antigen_Switch_Rate': 4.169512643087303e-09}. Best is trial 10 with value: 400.04.
[I 2023-05-26 18:53:14,786] Trial 15 finished with value: 332.96 and parameters: {'Antigen_Switch_Rate': 1.6758842380026993e-09}. Best is trial 10 with value: 400.04.
[I 2023-05-26 18:53:14,832] Trial 16 finished with value: 385.66 and parameters: {'Antigen_Switch_Rate': 4.51732112640819e-09}. Best is trial 10 with value: 400.04.
[I 2023-05-26 18:53:14,876] Trial 17 finished with value: 363.84 and parameters: {'Antigen_Switch_Rate': 2.1226773056736917e-09}. Best is trial 10 with value: 400.04.
[I 2023-05-26 18:53:14,913] Trial 18 finished with value: 236.3 and parameters: {'Antigen_Switch_Rate': 9.94876719956596e-10}. Best is trial 10 with value: 400.04.
[I 2023-05-26 18:53:14,957] Trial 19 finished with value: 370.66 and parameters: {'Antigen_Switch_Rate': 1.814725113148747e-09}. Best is trial 10 with value: 400.04.
[I 2023-05-26 18:53:15,007] Trial 20 finished with value: 396.76 and parameters: {'Antigen_Switch_Rate': 4.011620108155682e-09}. Best is trial 10 with value: 400.04.
[I 2023-05-26 18:53:15,053] Trial 21 finished with value: 391.16 and parameters: {'Antigen_Switch_Rate': 2.86932123142975e-09}. Best is trial 10 with value: 400.04.
[I 2023-05-26 18:53:15,098] Trial 22 finished with value: 375.62 and parameters: {'Antigen_Switch_Rate': 2.2297814196138632e-09}. Best is trial 10 with value: 400.04.
[I 2023-05-26 18:53:15,145] Trial 23 finished with value: 409.2 and parameters: {'Antigen_Switch_Rate': 2.9439327747062316e-09}. Best is trial 23 with value: 409.2.
[I 2023-05-26 18:53:15,182] Trial 24 finished with value: 238.94 and parameters: {'Antigen_Switch_Rate': 9.439553321717197e-10}. Best is trial 23 with value: 409.2.

[5]:

study.best_params  # parameter for longest avg duration

[5]:

{'Antigen_Switch_Rate': 2.9439327747062316e-09}

[6]:

study.best_value  # longest avg duration

[6]:

409.2

Now let’s look at a few default visualizations of the optuna study:

the convergence towards maximizing the objective value over successive trials
the value of the objective as a function of our 1-d parameter range explored

[7]:

import plotly
plotly.offline.init_notebook_mode()  # to render in docs

fig = optuna.visualization.plot_optimization_history(study, target_name="Mean infection duration (d)")
fig.show()

[8]:

plotly.offline.init_notebook_mode()  # to render in docs
fig = optuna.visualization.plot_slice(study, target_name="Mean infection duration (d)")
fig.show()