Search Space
THIS IS THE MOST IMPORTANT CONCEPT OF TUNE, MUST READ
Tune defines its own search space concept and expressions. It inherits the Fugue philosophy: one expression for all frameworks. For the underlying optimizers (e.g. HyperOpt, Optuna), Tune unifies their behaviors. For example, Rand(1.0, 5.0, q=1.5)
will uniformly search over [1.0, 2.5, 4.0]
no matter whether you use HyperOpt or Optuna as the underlying optimizer.
In Tune, spaces are predefined before the search starts. This is the opposite of Optuna, where you request variables inside the objective at runtime. This way, your space definition is fully separated from your objective definition, and your objectives can be plain Python functions independent of Tune.
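Because configurations are materialized as plain dictionaries before the search runs, an objective can be an ordinary Python function with no Tune dependency. A minimal sketch (the objective and configuration values here are made up for illustration):

```python
# Configurations arrive as plain dicts, so the objective is just a function.
def objective(a, b):
    # A toy objective; any ordinary Python function works.
    return a * a + b

# Stand-ins for what list(space) would yield.
configs = [{"a": 1, "b": 2}, {"a": 3, "b": 4}]
scores = [objective(**conf) for conf in configs]
print(scores)  # [3, 13]
```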
[1]:
from tune import Space, Grid, Rand, RandInt, Choice
import pandas as pd
Simple Cases
The simplest spaces contain only static variables, so they always generate a single configuration.
[2]:
space = Space(a=1, b=1)
print(list(space))
[{'a': 1, 'b': 1}]
Grid Search
You can replace the static variables with Grid
expressions. All grid expressions in a space are cross producted, so the second example below generates 6 configurations.
[3]:
print(list(Space(a=1, b=Grid("a","b"))))
print(list(Space(a=Grid(1,2), b=Grid("x","y","z"))))
[{'a': 1, 'b': 'a'}, {'a': 1, 'b': 'b'}]
[{'a': 1, 'b': 'x'}, {'a': 1, 'b': 'y'}, {'a': 1, 'b': 'z'}, {'a': 2, 'b': 'x'}, {'a': 2, 'b': 'y'}, {'a': 2, 'b': 'z'}]
Random Expressions
Random search requires calling the .sample
method after defining the original space, to specify how many random combinations to draw from the expressions.
Choice
Choice refers to a discrete, unordered set of values, so Choice(1, 2, 3) is equivalent to Choice(2, 1, 3). When you sample randomly from a Choice, every value has an equal chance. Advanced search methods such as Bayesian Optimization also assume no ordering relation between the values.
[4]:
space = Space(a=1, b=Choice("aa", "bb", "cc")).sample(2, seed=1)
print(list(space))
[{'a': 1, 'b': 'bb'}, {'a': 1, 'b': 'aa'}]
Rand
Rand is the most common expression for a variable. It samples from a range of values.
Rand(low, high)
searches uniformly in [low, high)
[5]:
samples = Rand(10.1, 20.2).generate_many(10000, seed=0)
pd.DataFrame(samples).hist();
Rand(low, high, log=True)
searches in log space, but still within [low, high), so smaller values have a higher chance of being selected.
For log space search, low must be greater than or equal to 1.
The algorithm: exp(uniform(log(low), log(high)))
[6]:
samples = Rand(10.1, 1000, log=True).generate_many(10000, seed=0)
pd.DataFrame(samples).hist();
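The formula above can be sketched in plain Python to see why smaller values dominate (a simplified illustration, not Tune's actual implementation):

```python
import math
import random

def log_uniform(low, high, rng):
    # exp(uniform(log(low), log(high))): uniform in log space, which
    # concentrates probability mass on smaller values in the original space.
    return math.exp(rng.uniform(math.log(low), math.log(high)))

rng = random.Random(0)
samples = [log_uniform(10.1, 1000, rng) for _ in range(10000)]
# Well over half of the samples fall in the lower fifth of the range.
lower_fifth = 10.1 + (1000 - 10.1) / 5
print(sum(s < lower_fifth for s in samples) / len(samples))
```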
Rand(low, high, q, include_high)
searches uniformly between low and high with step q. include_high (default True) indicates whether the high value can be a candidate.
[7]:
print(Rand(-1.0,4.0,q=2.5).generate_many(10, seed=0))
print(Rand(-1.0,4.0,q=2.5,include_high=False).generate_many(10, seed=0))
samples = Rand(1.0,2.0,q=0.3).generate_many(10000, seed=0)
pd.DataFrame(samples).hist();
[1.5, 4.0, 1.5, 1.5, 1.5, 1.5, 1.5, 4.0, 4.0, 1.5]
[1.5, 1.5, 1.5, 1.5, -1.0, 1.5, -1.0, 1.5, 1.5, -1.0]
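The candidate values above can be derived by hand. The sketch below steps from low by q; it assumes (consistent with the output above, but not confirmed by it) that high itself becomes a candidate only when it lands exactly on the step grid and include_high is set. This is an illustration, not Tune's internal code:

```python
def q_candidates(low, high, q, include_high=True):
    # Step from low by q, keeping values strictly below high.
    values, v = [], low
    while v < high:
        values.append(v)
        v += q
    # Assumption: high joins only when it falls exactly on the grid.
    if include_high and abs(v - high) < 1e-9:
        values.append(high)
    return values

print(q_candidates(-1.0, 4.0, 2.5))                      # [-1.0, 1.5, 4.0]
print(q_candidates(-1.0, 4.0, 2.5, include_high=False))  # [-1.0, 1.5]
```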
Rand(low, high, q, include_high, log=True)
searches between low and high with step q in log space. include_high (default True) indicates whether the high value can be a candidate.
[8]:
samples = Rand(1.0,16.0,q=5, log=True).generate_many(10000, seed=0)
pd.DataFrame(samples).hist()
samples = Rand(1.0,16.0,q=5, log=True, include_high=False).generate_many(10000, seed=0)
pd.DataFrame(samples).hist();
RandInt
RandInt can be considered a special case of Rand where low, high, and q are all integers.
RandInt(low, high, include_high)
[9]:
samples = RandInt(-2,2).generate_many(10000, seed=0)
pd.DataFrame(samples).hist()
samples = RandInt(-2,2,include_high=False).generate_many(10000, seed=0)
pd.DataFrame(samples).hist();
RandInt(low, high, include_high, q)
Searches from low to high with step q
[10]:
samples = RandInt(-2,4,q=2).generate_many(10000, seed=0)
pd.DataFrame(samples).hist()
samples = RandInt(-2,4,include_high=False,q=2).generate_many(10000, seed=0)
pd.DataFrame(samples).hist();
RandInt(low, high, include_high, q, log)
Searches from low to high with step q, but in log space, so lower values have a higher chance of being selected.
As with Rand, for log space search low must be >= 1
[11]:
samples = RandInt(1,7,q=2,log=True).generate_many(10000, seed=0)
pd.DataFrame(samples).hist()
samples = RandInt(1,7,include_high=False,q=2,log=True).generate_many(10000, seed=0)
pd.DataFrame(samples).hist();
Random Search
In Tune, you have two options to search over random expressions
As Level 1 Search
Level 1 search happens before execution: given a combination of random expressions, we draw a certain number of parameter combinations up front, so the system only deals with static parameters at runtime.
Grid search is also a Level 1 search, and Level 1 search determines the maximum parallelism. To treat random expressions as Level 1 as well, we must use .sample
[12]:
space = Space(a=Rand(0,1), b=Choice("x", "y")).sample(10, seed=0)
list(space)
[12]:
[{'a': 0.5488135039273248, 'b': 'x'},
{'a': 0.7151893663724195, 'b': 'y'},
{'a': 0.6027633760716439, 'b': 'y'},
{'a': 0.5448831829968969, 'b': 'x'},
{'a': 0.4236547993389047, 'b': 'x'},
{'a': 0.6458941130666561, 'b': 'y'},
{'a': 0.4375872112626925, 'b': 'y'},
{'a': 0.8917730007820798, 'b': 'y'},
{'a': 0.9636627605010293, 'b': 'y'},
{'a': 0.3834415188257777, 'b': 'x'}]
If a space contains both grid and random expressions, .sample
only applies to the random expressions; the drawn samples are then cross producted with all grid combinations
[13]:
space = Space(a=Grid(0,1), b=Rand(0,1), c=Grid("a", "b"), d=Rand(0,1)).sample(3, seed=1)
list(space) # 2*2 *3 configs
[13]:
[{'a': 0, 'b': 0.417022004702574, 'c': 'a', 'd': 0.30233257263183977},
{'a': 0, 'b': 0.417022004702574, 'c': 'b', 'd': 0.30233257263183977},
{'a': 1, 'b': 0.417022004702574, 'c': 'a', 'd': 0.30233257263183977},
{'a': 1, 'b': 0.417022004702574, 'c': 'b', 'd': 0.30233257263183977},
{'a': 0, 'b': 0.7203244934421581, 'c': 'a', 'd': 0.14675589081711304},
{'a': 0, 'b': 0.7203244934421581, 'c': 'b', 'd': 0.14675589081711304},
{'a': 1, 'b': 0.7203244934421581, 'c': 'a', 'd': 0.14675589081711304},
{'a': 1, 'b': 0.7203244934421581, 'c': 'b', 'd': 0.14675589081711304},
{'a': 0, 'b': 0.00011437481734488664, 'c': 'a', 'd': 0.0923385947687978},
{'a': 0, 'b': 0.00011437481734488664, 'c': 'b', 'd': 0.0923385947687978},
{'a': 1, 'b': 0.00011437481734488664, 'c': 'a', 'd': 0.0923385947687978},
{'a': 1, 'b': 0.00011437481734488664, 'c': 'b', 'd': 0.0923385947687978}]
As Level 2 Search
Level 2 search happens at runtime and is based on each Level 1 search candidate. A common scenario is grid search on one parameter combined with Bayesian Optimization on another: we can parallelize over the choices of the first parameter and run sequential Bayesian Optimization on the second.
We use third-party solutions such as HyperOpt and Optuna for Level 2 search. To pass a random expression to Level 2, we simply don't call .sample
[14]:
space = Space(a=Grid(0,1), b=Rand(0,1), c=Grid("a", "b"), d=Rand(0,1))
list(space) # 2*2 configs, each of the config still contains the Rand expression
[14]:
[{'a': 0, 'b': Rand(low=0, high=1, q=None, log=False, include_high=True), 'c': 'a', 'd': Rand(low=0, high=1, q=None, log=False, include_high=True)},
{'a': 0, 'b': Rand(low=0, high=1, q=None, log=False, include_high=True), 'c': 'b', 'd': Rand(low=0, high=1, q=None, log=False, include_high=True)},
{'a': 1, 'b': Rand(low=0, high=1, q=None, log=False, include_high=True), 'c': 'a', 'd': Rand(low=0, high=1, q=None, log=False, include_high=True)},
{'a': 1, 'b': Rand(low=0, high=1, q=None, log=False, include_high=True), 'c': 'b', 'd': Rand(low=0, high=1, q=None, log=False, include_high=True)}]
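To illustrate what the runtime side of Level 2 looks like, here is a toy sketch: RandExpr is a hypothetical stand-in for Tune's Rand expression, and plain random sampling stands in for a real sequential optimizer such as HyperOpt or Optuna:

```python
import random

class RandExpr:
    # Hypothetical stand-in for a Rand expression left inside a config.
    def __init__(self, low, high):
        self.low, self.high = low, high

rng = random.Random(0)
level1 = [{"a": 0, "b": RandExpr(0, 1)}, {"a": 1, "b": RandExpr(0, 1)}]

# At runtime each Level 1 candidate has its remaining expressions resolved;
# a real Level 2 search would let the optimizer pick values iteratively.
resolved = [
    {k: rng.uniform(v.low, v.high) if isinstance(v, RandExpr) else v
     for k, v in conf.items()}
    for conf in level1
]
print(resolved)
```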
Space Operations, Conditional Search and Hybrid Search
Almost all popular tuning frameworks support conditional search. Tune
approaches conditional search in a totally different way.
Instead of using if-else at runtime or nested dictionaries to represent conditions, we introduce space operations:
[15]:
space1 = Space(a=1, b=Grid(2,3))
space2 = Space(c=Grid("a","b"))
union_space = space1 + space2
print(list(union_space))
product_space = space1 * space2
print(list(product_space))
[{'a': 1, 'b': 2}, {'a': 1, 'b': 3}, {'c': 'a'}, {'c': 'b'}]
[{'a': 1, 'b': 2, 'c': 'a'}, {'a': 1, 'b': 2, 'c': 'b'}, {'a': 1, 'b': 3, 'c': 'a'}, {'a': 1, 'b': 3, 'c': 'b'}]
Operator +
unions the configurations of two spaces; it can solve most conditional search problems
Operator *
cross products the configurations of two spaces; it can solve most hybrid search problems
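Because each space materializes into a list of dicts, the two operators can be mimicked with plain Python (an illustrative sketch, not Tune's implementation):

```python
from itertools import product

# Stand-ins for list(space1) and list(space2) from the cell above.
space1 = [{"a": 1, "b": 2}, {"a": 1, "b": 3}]
space2 = [{"c": "a"}, {"c": "b"}]

# "+" unions (concatenates) the configuration lists.
union = space1 + space2

# "*" cross products the lists, merging each pair of dicts.
prod = [{**x, **y} for x, y in product(space1, space2)]

print(union)
print(prod)
```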
Conditional Search
[16]:
space1 = Space(model="LogisticRegression")
space2 = Space(model="RandomForestClassifier", max_depth=Grid(3,4))
space3 = Space(model="XGBClassifier", n_estimators=Grid(10,100,1000))
sweep = sum([space1, space2, space3]) # sum is another way to union
list(sweep)
[16]:
[{'model': 'LogisticRegression'},
{'model': 'RandomForestClassifier', 'max_depth': 3},
{'model': 'RandomForestClassifier', 'max_depth': 4},
{'model': 'XGBClassifier', 'n_estimators': 10},
{'model': 'XGBClassifier', 'n_estimators': 100},
{'model': 'XGBClassifier', 'n_estimators': 1000}]
All 3 models have a random_state
parameter, and we also want to grid search on it for every model. We just use *
[17]:
sweep_with_random_state = sweep * Space(random_state=Grid(0,1))
list(sweep_with_random_state)
[17]:
[{'model': 'LogisticRegression', 'random_state': 0},
{'model': 'LogisticRegression', 'random_state': 1},
{'model': 'RandomForestClassifier', 'max_depth': 3, 'random_state': 0},
{'model': 'RandomForestClassifier', 'max_depth': 3, 'random_state': 1},
{'model': 'RandomForestClassifier', 'max_depth': 4, 'random_state': 0},
{'model': 'RandomForestClassifier', 'max_depth': 4, 'random_state': 1},
{'model': 'XGBClassifier', 'n_estimators': 10, 'random_state': 0},
{'model': 'XGBClassifier', 'n_estimators': 10, 'random_state': 1},
{'model': 'XGBClassifier', 'n_estimators': 100, 'random_state': 0},
{'model': 'XGBClassifier', 'n_estimators': 100, 'random_state': 1},
{'model': 'XGBClassifier', 'n_estimators': 1000, 'random_state': 0},
{'model': 'XGBClassifier', 'n_estimators': 1000, 'random_state': 1}]
Hybrid Search (Grid + Random + Bayesian Optimization)
For XGBClassifier
, we want a hybrid search: grid search on random_state
, random search on n_estimators
, and Level 2 (Bayesian Optimization) search on learning_rate
[18]:
xgb = Space(model="XGBClassifier", learning_rate=Rand(0,1), random_state=Grid(0,1)) * Space(n_estimators=RandInt(10,1000)).sample(3, seed=0)
list(xgb)
[18]:
[{'model': 'XGBClassifier', 'learning_rate': Rand(low=0, high=1, q=None, log=False, include_high=True), 'random_state': 0, 'n_estimators': 553},
{'model': 'XGBClassifier', 'learning_rate': Rand(low=0, high=1, q=None, log=False, include_high=True), 'random_state': 0, 'n_estimators': 718},
{'model': 'XGBClassifier', 'learning_rate': Rand(low=0, high=1, q=None, log=False, include_high=True), 'random_state': 0, 'n_estimators': 607},
{'model': 'XGBClassifier', 'learning_rate': Rand(low=0, high=1, q=None, log=False, include_high=True), 'random_state': 1, 'n_estimators': 553},
{'model': 'XGBClassifier', 'learning_rate': Rand(low=0, high=1, q=None, log=False, include_high=True), 'random_state': 1, 'n_estimators': 718},
{'model': 'XGBClassifier', 'learning_rate': Rand(low=0, high=1, q=None, log=False, include_high=True), 'random_state': 1, 'n_estimators': 607}]
Hybrid search and conditional search can also be used together
[19]:
list(Space(model="LogisticRegression")+xgb)
[19]:
[{'model': 'LogisticRegression'},
{'model': 'XGBClassifier', 'learning_rate': Rand(low=0, high=1, q=None, log=False, include_high=True), 'random_state': 0, 'n_estimators': 553},
{'model': 'XGBClassifier', 'learning_rate': Rand(low=0, high=1, q=None, log=False, include_high=True), 'random_state': 0, 'n_estimators': 718},
{'model': 'XGBClassifier', 'learning_rate': Rand(low=0, high=1, q=None, log=False, include_high=True), 'random_state': 0, 'n_estimators': 607},
{'model': 'XGBClassifier', 'learning_rate': Rand(low=0, high=1, q=None, log=False, include_high=True), 'random_state': 1, 'n_estimators': 553},
{'model': 'XGBClassifier', 'learning_rate': Rand(low=0, high=1, q=None, log=False, include_high=True), 'random_state': 1, 'n_estimators': 718},
{'model': 'XGBClassifier', 'learning_rate': Rand(low=0, high=1, q=None, log=False, include_high=True), 'random_state': 1, 'n_estimators': 607}]