tune.concepts
tune.concepts.checkpoint
- class Checkpoint(fs)[source]
Bases:
object
An abstraction for tuning checkpoints
- Parameters
fs (fs.base.FS) – the file system
Attention
Normally you don’t need to create a checkpoint by yourself; please read Checkpoint Tutorial if you want to understand how it works.
- property latest: fs.base.FS
latest checkpoint folder
- Raises
AssertionError – if there was no checkpoint
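The behavior of latest can be modeled with a minimal sketch. The helper below is hypothetical and not part of tune; it assumes checkpoint folders are named with increasing integers so the highest number is the most recent:

```python
def latest_checkpoint(folders):
    """Return the most recent checkpoint folder name.

    Hypothetical sketch: assumes folders are named "1", "2", "3", ...
    """
    # mirror the documented behavior: raise AssertionError when empty
    assert len(folders) > 0, "there was no checkpoint"
    return max(folders, key=int)

print(latest_checkpoint(["1", "2", "10"]))  # "10"
```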
- class NewCheckpoint(checkpoint)[source]
Bases:
object
A helper class for adding new checkpoints
- Parameters
checkpoint (tune.concepts.checkpoint.Checkpoint) – the parent checkpoint
Attention
Do not construct this class directly; please read Checkpoint Tutorial for details
tune.concepts.dataset
- class StudyResult(dataset, result)[source]
Bases:
object
A collection of the input TuneDataset and the tuning result
- Parameters
dataset (tune.concepts.dataset.TuneDataset) – input dataset for tuning
result (fugue.workflow.workflow.WorkflowDataFrame) – tuning result as a dataframe
Attention
Do not construct this class directly.
- next_tune_dataset(best_n=0)[source]
Convert the result back to a new TuneDataset to be used by the next steps.
- Parameters
best_n (int) – top n results to extract, defaults to 0 (the entire result)
- Returns
a new dataset for tuning
- Return type
tune.concepts.dataset.TuneDataset
- result(best_n=0)[source]
Get the top n results sorted by
tune.concepts.flow.report.TrialReport.sort_metric()
- Parameters
best_n (int) – number of results to get, defaults to 0; if <=0, the entire result is returned
- Returns
result subset
- Return type
fugue.workflow.workflow.WorkflowDataFrame
- union_with(other)[source]
Union with another result set and update itself
- Parameters
other (tune.concepts.dataset.StudyResult) – the other result dataset
- Return type
None
Note
This method also removes duplicated reports based on tune.concepts.flow.trial.Trial.trial_id(). Each trial will have only the best report in the updated result.
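The deduplication described in the note can be illustrated with a small sketch. The helper and dict-shaped reports below are hypothetical, and a smaller sort metric is assumed to be better:

```python
def union_keep_best(current, other):
    """Merge two report lists, keeping only the best report per trial_id.

    Hypothetical sketch; assumes a smaller sort_metric is better.
    """
    merged = {}
    for report in current + other:
        tid = report["trial_id"]
        # keep the report with the better (smaller) metric for each trial
        if tid not in merged or report["sort_metric"] < merged[tid]["sort_metric"]:
            merged[tid] = report
    return list(merged.values())

current = [{"trial_id": "t1", "sort_metric": 0.5}]
other = [{"trial_id": "t1", "sort_metric": 0.2},
         {"trial_id": "t2", "sort_metric": 0.9}]
merged = union_keep_best(current, other)
# t1 keeps only its best report (0.2), and t2 is added
```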
- class TuneDataset(data, dfs, keys)[source]
Bases:
object
A Fugue WorkflowDataFrame with metadata representing all dataframes required for a tuning task.
- Parameters
data (fugue.workflow.workflow.WorkflowDataFrame) – the Fugue WorkflowDataFrame containing all required dataframes
dfs (List[str]) – the names of the dataframes
keys (List[str]) – the common partition keys of all dataframes
Attention
Do not construct this class directly, please read TuneDataset Tutorial to find the right way
- property data: fugue.workflow.workflow.WorkflowDataFrame
the Fugue WorkflowDataFrame containing all required dataframes
- property dfs: List[str]
All dataframe names (they also appear as part of the column names of data())
- split(weights, seed)[source]
Split the dataset randomly into small partitions. This is useful for algorithms such as Hyperband, which need different subsets to run successive halving with different parameters.
- Parameters
weights (List[float]) – a list of numeric values. The length determines the number of split partitions, and the values represent the proportion of each partition
seed (Any) – random seed for the split
- Returns
a list of sub-datasets
- Return type
List[tune.concepts.dataset.TuneDataset]

# randomly split the data to two partitions 25% and 75%
dataset.split([1, 3], seed=0)
# same because weights will be normalized
dataset.split([10, 30], seed=0)
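The weight normalization mentioned above can be shown with a small sketch (a hypothetical helper, not part of tune — only the ratios between weights matter):

```python
def normalize_weights(weights):
    # convert raw weights into proportions that sum to 1
    total = sum(weights)
    return [w / total for w in weights]

print(normalize_weights([1, 3]))    # [0.25, 0.75]
print(normalize_weights([10, 30]))  # [0.25, 0.75]
```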
- class TuneDatasetBuilder(space, path='')[source]
Bases:
object
Builder of TuneDataset, for details please read TuneDataset Tutorial
- Parameters
space (tune.concepts.space.spaces.Space) – searching space, see Space Tutorial
path (str) – temp path to store serialized dataframe partitions, defaults to “”
- add_df(name, df, how='')[source]
Add a dataframe to the dataset
- Parameters
name (str) – name of the dataframe; it will also create a __tune_df__<name> column in the dataset dataframe
df (fugue.workflow.workflow.WorkflowDataFrame) – the dataframe to add.
how (str) – join type, can accept semi, left_semi, anti, left_anti, inner, left_outer, right_outer, full_outer, cross
- Returns
the builder itself
- Return type
tune.concepts.dataset.TuneDatasetBuilder
Note
For the first dataframe you add, how should be empty. From the second dataframe you add, how must be set.
Note
If df is prepartitioned, the partition key will be used to join with the added dataframes. Read TuneDataset Tutorial for more details
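The rule about how being empty for the first dataframe and required afterwards can be modeled with a toy builder. This is purely illustrative; the real TuneDatasetBuilder performs the joins through Fugue:

```python
class ToyBuilder:
    """Hypothetical sketch of the add_df chaining and join-type rule."""

    def __init__(self):
        self._dfs = []

    def add_df(self, name, df, how=""):
        if not self._dfs:
            # first dataframe: no join type allowed
            assert how == "", "for the first dataframe, how should be empty"
        else:
            # subsequent dataframes: join type is required
            assert how != "", "from the second dataframe, how must be set"
        self._dfs.append((name, df, how))
        return self  # return the builder itself for chaining

builder = ToyBuilder().add_df("train", "df1").add_df("test", "df2", how="cross")
print([name for name, _, _ in builder._dfs])  # ['train', 'test']
```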
- add_dfs(dfs, how='')[source]
Add multiple dataframes with the same join type
- Parameters
dfs (fugue.workflow.workflow.WorkflowDataFrames) – a dictionary-like dataframe collection; the keys will be used as the dataframe names
how (str) – join type, can accept semi, left_semi, anti, left_anti, inner, left_outer, right_outer, full_outer, cross
- Returns
the builder itself
- Return type
tune.concepts.dataset.TuneDatasetBuilder
- build(wf, batch_size=1, shuffle=True, trial_metadata=None)[source]
Build TuneDataset, for details please read TuneDataset Tutorial
- Parameters
wf (fugue.workflow.workflow.FugueWorkflow) – the workflow associated with the dataset
batch_size (int) – how many configurations as a batch, defaults to 1
shuffle (bool) – whether to shuffle the entire dataset, defaults to True. This makes the tuning process more even and the progress look better; it should have a slight benefit on speed and no effect on the result.
trial_metadata (Optional[Dict[str, Any]]) – metadata to pass to each Trial, defaults to None
- Returns
the dataset for tuning
- Return type
tune.concepts.dataset.TuneDataset
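The batch_size parameter groups configurations into batches before they are distributed. Conceptually (a hypothetical helper, not the tune implementation):

```python
def to_batches(configs, batch_size=1):
    # group configurations into fixed-size batches; the last may be smaller
    return [configs[i:i + batch_size]
            for i in range(0, len(configs), batch_size)]

print(to_batches(["c1", "c2", "c3", "c4", "c5"], batch_size=2))
# [['c1', 'c2'], ['c3', 'c4'], ['c5']]
```

With the default batch_size=1, every configuration becomes its own unit of work; larger batches trade scheduling overhead for coarser parallelism.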