Search Space

THIS IS THE MOST IMPORTANT CONCEPT OF TUNE, MUST READ

Tune defines its own concept of a search space and its own expressions for describing one. It inherits the Fugue philosophy: one expression for all frameworks. Tune unifies the behavior of the underlying optimizers (e.g. HyperOpt, Optuna). For example, Rand(1.0, 5.0, q=1.5) searches uniformly over [1.0, 2.5, 4.0] whether you use HyperOpt or Optuna as the underlying optimizer.

In Tune, spaces are defined before the search starts. This is the opposite of Optuna, where you obtain variables inside the objective at runtime. As a result, the space definition is fully separated from the objective definition, and your objectives can be simple Python functions that are independent of Tune.
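To illustrate this separation, here is a minimal sketch in plain Python with no Tune dependency. The `objective` function and the hand-written `configs` list are hypothetical: the list only stands in for the parameter dictionaries that iterating a space yields.

```python
# Hypothetical objective: an ordinary function that knows nothing about Tune.
def objective(a, b):
    return (a - 3) ** 2 + (b - 1) ** 2


# Hand-written stand-in for the dictionaries a predefined space would yield.
configs = [{"a": 1, "b": 1}, {"a": 3, "b": 1}, {"a": 5, "b": 0}]

# Evaluate every configuration and keep the one with the lowest score.
results = [(objective(**config), config) for config in configs]
best_score, best_config = min(results, key=lambda pair: pair[0])
print(best_score, best_config)
```

Because the objective takes only plain keyword arguments, it can be tested and reused without any tuning framework.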

[1]:
from tune import Space, Grid, Rand, RandInt, Choice
import pandas as pd

Simple Cases

The simplest spaces contain only static variables, so they always generate a single configuration.

[2]:
space = Space(a=1, b=1)
print(list(space))
[{'a': 1, 'b': 1}]

Random Expressions

Random search requires calling the .sample method on the space you defined, to specify how many random combinations to draw from its expressions.

Choice

Choice refers to a discrete, unordered set of values, so Choice(1, 2, 3) is equivalent to Choice(2, 1, 3). In random sampling, every value has an equal chance of being drawn. Advanced search methods such as Bayesian optimization also assume no relation between the values.

[4]:
space = Space(a=1, b=Choice("aa", "bb", "cc")).sample(2, seed=1)
print(list(space))
[{'a': 1, 'b': 'bb'}, {'a': 1, 'b': 'aa'}]
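The equal-chance behavior described above can be sketched with the standard library alone. The `sample_choice` helper is hypothetical and does not reproduce Tune's random draws for a given seed. It only shows that every value receives roughly the same share of samples.

```python
import random
from collections import Counter


def sample_choice(values, n, seed):
    """Draw n values, each candidate being equally likely (illustrative helper)."""
    rng = random.Random(seed)
    return [rng.choice(values) for _ in range(n)]


n_samples = 30000
counts = Counter(sample_choice(["aa", "bb", "cc"], n_samples, seed=0))
# Share of the draws that landed on each value; each should be close to 1/3.
shares = {value: count / n_samples for value, count in counts.items()}
print(shares)
```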

Rand

Rand is the most common expression for a variable. It samples from a range of values.

Rand(low, high)

Searches uniformly in [low, high).

[5]:
samples = Rand(10.1, 20.2).generate_many(10000, seed=0)
pd.DataFrame(samples).hist();
../_images/notebooks_space_9_0.png

Rand(low, high, log=True)

Searches in log space, still within [low, high), so smaller values have a higher chance of being selected.

For log space search, low must be greater than or equal to 1.

The algorithm: exp(uniform(log(low), log(high)))

[6]:
samples = Rand(10.1, 1000, log=True).generate_many(10000, seed=0)
pd.DataFrame(samples).hist();
../_images/notebooks_space_11_0.png
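The formula exp(uniform(log(low), log(high))) can be tried out with the standard library. The sketch below is not Tune's code, and the helper names `rand_uniform` and `rand_log` are hypothetical. It compares the medians of plain uniform sampling and log space sampling over the same range to show how log sampling favors smaller values.

```python
import math
import random
import statistics


def rand_uniform(low, high, rng):
    """Plain uniform draw between low and high."""
    return rng.uniform(low, high)


def rand_log(low, high, rng):
    """Log space draw: exp(uniform(log(low), log(high)))."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(0)
uniform_samples = [rand_uniform(10.1, 1000, rng) for _ in range(10000)]
log_samples = [rand_log(10.1, 1000, rng) for _ in range(10000)]

uniform_median = statistics.median(uniform_samples)
log_median = statistics.median(log_samples)
print(round(uniform_median, 1), round(log_median, 1))
```

The uniform median lands near the midpoint of the range, while the log space median lands near the geometric mean sqrt(low * high), which is roughly 100 here.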

Rand(low, high, q, include_high)

Searches uniformly between low and high with step q. include_high (default True) indicates whether the high value can be a candidate.

[7]:
print(Rand(-1.0, 4.0, q=2.5).generate_many(10, seed=0))
print(Rand(-1.0, 4.0, q=2.5, include_high=False).generate_many(10, seed=0))

samples = Rand(1.0, 2.0, q=0.3).generate_many(10000, seed=0)
pd.DataFrame(samples).hist();
[1.5, 4.0, 1.5, 1.5, 1.5, 1.5, 1.5, 4.0, 4.0, 1.5]
[1.5, 1.5, 1.5, 1.5, -1.0, 1.5, -1.0, 1.5, 1.5, -1.0]
../_images/notebooks_space_13_1.png
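To make the role of q and include_high concrete, the sketch below enumerates the candidate values and then draws from them with equal probability. It is an illustration only. The helpers `stepped_candidates` and `sample_stepped` are hypothetical, and the equal weighting is an assumption based on the word "uniformly" in the description, not on Tune's source.

```python
import math
import random


def stepped_candidates(low, high, q, include_high=True):
    """List low, low + q, low + 2q, ... without going past high."""
    # The small tolerance guards against floating-point error in the division.
    steps = math.floor((high - low) / q + 1e-9)
    values = [low + i * q for i in range(steps + 1)]
    # Drop the last point when it coincides with high and high is excluded.
    if not include_high and math.isclose(values[-1], high):
        values.pop()
    return values


def sample_stepped(low, high, q, n, seed, include_high=True):
    """Assumption: every candidate is equally likely."""
    rng = random.Random(seed)
    candidates = stepped_candidates(low, high, q, include_high)
    return [rng.choice(candidates) for _ in range(n)]


print(stepped_candidates(-1.0, 4.0, 2.5))
print(stepped_candidates(-1.0, 4.0, 2.5, include_high=False))
print(stepped_candidates(1.0, 5.0, 1.5))
```

The first two calls mirror the cell above: 4.0 is a candidate only when include_high is True. The third call mirrors the introductory example Rand(1.0, 5.0, q=1.5), where high never falls on the grid, so include_high makes no difference.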

Rand(low, high, q, include_high, log=True)

Searches between low and high with step q in log space. include_high (default True) indicates whether the high value can be a candidate.

[8]:
samples = Rand(1.0, 16.0, q=5, log=True).generate_many(10000, seed=0)
pd.DataFrame(samples).hist()

samples = Rand(1.0, 16.0, q=5, log=True, include_high=False).generate_many(10000, seed=0)
pd.DataFrame(samples).hist();
../_images/notebooks_space_15_0.png
../_images/notebooks_space_15_1.png

RandInt

RandInt can be considered a special case of Rand where low, high and q are all integers.

RandInt(low, high, include_high)

[9]:
samples = RandInt(-2, 2).generate_many(10000, seed=0)
pd.DataFrame(samples).hist()

samples = RandInt(-2, 2, include_high=False).generate_many(10000, seed=0)
pd.DataFrame(samples).hist();
../_images/notebooks_space_17_0.png
../_images/notebooks_space_17_1.png

RandInt(low, high, include_high, q)

Searches from low to high with step q.

[10]:
samples = RandInt(-2, 4, q=2).generate_many(10000, seed=0)
pd.DataFrame(samples).hist()

samples = RandInt(-2, 4, include_high=False, q=2).generate_many(10000, seed=0)
pd.DataFrame(samples).hist();
../_images/notebooks_space_19_0.png
../_images/notebooks_space_19_1.png
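For integers, the candidate set implied by low, high, q and include_high maps directly onto Python's built-in `range`. The `randint_candidates` helper below is a hypothetical illustration of which values can appear in the histograms above. It is not part of Tune.

```python
def randint_candidates(low, high, q=1, include_high=True):
    """Integer candidates from low toward high in steps of q."""
    # range() excludes its stop value, so extend it by one to admit high.
    stop = high + 1 if include_high else high
    return list(range(low, stop, q))


print(randint_candidates(-2, 2))
print(randint_candidates(-2, 2, include_high=False))
print(randint_candidates(-2, 4, q=2))
print(randint_candidates(-2, 4, q=2, include_high=False))
```

With q=2 the candidates are -2, 0, 2 and 4, and the value 4 disappears when include_high is False.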

RandInt(low, high, include_high, q, log)

Searches from low to high with step q. The difference is that the search is in log space, so lower values have a higher chance of being selected.

As with Rand, low must be greater than or equal to 1 for log space search.

[11]:
samples = RandInt(1, 7, q=2, log=True).generate_many(10000, seed=0)
pd.DataFrame(samples).hist()

samples = RandInt(1, 7, include_high=False, q=2, log=True).generate_many(10000, seed=0)
pd.DataFrame(samples).hist();
../_images/notebooks_space_21_0.png
../_images/notebooks_space_21_1.png
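One way to see why lower values are favored is to compute the share of log space that each candidate would receive. The sketch below rests on an assumption made purely for illustration: a log-uniform draw is snapped down to the nearest candidate, so candidate c owns the interval [c, c + q). This bucket scheme and the `log_bucket_weights` helper are hypothetical and are not taken from Tune's source, so the exact proportions in the histograms above may differ.

```python
import math


def log_bucket_weights(low, high, q, include_high=True):
    """Probability of each integer candidate under the assumed bucket scheme."""
    stop = high + 1 if include_high else high
    candidates = list(range(low, stop, q))
    # Assumption: candidate c owns [c, c + q), so the sampled range ends at
    # the last candidate plus q.
    upper = candidates[-1] + q
    total = math.log(upper) - math.log(low)
    return {c: (math.log(c + q) - math.log(c)) / total for c in candidates}


weights = log_bucket_weights(1, 7, q=2)
weights_excl = log_bucket_weights(1, 7, q=2, include_high=False)
for candidate, weight in weights.items():
    print(candidate, round(weight, 3))
```

Under this assumption the weights fall steadily from the lowest candidate to the highest, because equal-width buckets cover less and less of the log axis as the values grow.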