tune.concepts
tune.concepts.checkpoint
- class Checkpoint(fs)[source]
Bases:
object
An abstraction for tuning checkpoints
- Parameters
fs (fs.base.FS) – the file system
Attention
Normally you don’t need to create a checkpoint by yourself; please read Checkpoint Tutorial if you want to understand how it works.
- property latest: fs.base.FS
latest checkpoint folder
- Raises
AssertionError – if there was no checkpoint
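The behavior of latest can be modeled with a minimal sketch. The helper below is hypothetical and not part of tune; it assumes checkpoint folders are named with increasing integers so the highest number is the most recent:

```python
def latest_checkpoint(folders):
    """Return the most recent checkpoint folder name.

    Hypothetical sketch: assumes folders are named "1", "2", "3", ...
    """
    # mirror the documented behavior: raise AssertionError when empty
    assert len(folders) > 0, "there was no checkpoint"
    return max(folders, key=int)

print(latest_checkpoint(["1", "2", "10"]))  # "10"
```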
- class NewCheckpoint(checkpoint)[source]
Bases:
object
A helper class for adding new checkpoints
- Parameters
checkpoint (tune.concepts.checkpoint.Checkpoint) – the parent checkpoint
Attention
Do not construct this class directly; please read Checkpoint Tutorial for details
tune.concepts.dataset
- class StudyResult(dataset, result)[source]
Bases:
object
A collection of the input TuneDataset and the tuning result
- Parameters
dataset (tune.concepts.dataset.TuneDataset) – input dataset for tuning
result (fugue.workflow.workflow.WorkflowDataFrame) – tuning result as a dataframe
Attention
Do not construct this class directly.
- next_tune_dataset(best_n=0)[source]
Convert the result back to a new TuneDataset to be used by the next steps.
- Parameters
best_n (int) – top n results to extract, defaults to 0 (the entire result)
- Returns
a new dataset for tuning
- Return type
tune.concepts.dataset.TuneDataset
- result(best_n=0)[source]
Get the top n results sorted by
tune.concepts.flow.report.TrialReport.sort_metric()
- Parameters
best_n (int) – number of results to get, defaults to 0; if <=0, the entire result is returned
- Returns
result subset
- Return type
fugue.workflow.workflow.WorkflowDataFrame
- union_with(other)[source]
Union with another result set and update itself
- Parameters
other (tune.concepts.dataset.StudyResult) – the other result dataset
- Return type
None
Note
This method also removes duplicated reports based on tune.concepts.flow.trial.Trial.trial_id(). Each trial will have only the best report in the updated result.
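The deduplication described in the note can be illustrated with a small sketch. The helper and dict-shaped reports below are hypothetical, and a smaller sort metric is assumed to be better:

```python
def union_keep_best(current, other):
    """Merge two report lists, keeping only the best report per trial_id.

    Hypothetical sketch; assumes a smaller sort_metric is better.
    """
    merged = {}
    for report in current + other:
        tid = report["trial_id"]
        # keep the report with the better (smaller) metric for each trial
        if tid not in merged or report["sort_metric"] < merged[tid]["sort_metric"]:
            merged[tid] = report
    return list(merged.values())

current = [{"trial_id": "t1", "sort_metric": 0.5}]
other = [{"trial_id": "t1", "sort_metric": 0.2},
         {"trial_id": "t2", "sort_metric": 0.9}]
merged = union_keep_best(current, other)
# t1 keeps only its best report (0.2), and t2 is added
```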
- class TuneDataset(data, dfs, keys)[source]
Bases:
object
A Fugue WorkflowDataFrame with metadata representing all dataframes required for a tuning task.
- Parameters
data (fugue.workflow.workflow.WorkflowDataFrame) – the Fugue WorkflowDataFrame containing all required dataframes
dfs (List[str]) – the names of the dataframes
keys (List[str]) – the common partition keys of all dataframes
Attention
Do not construct this class directly, please read TuneDataset Tutorial to find the right way
- property data: fugue.workflow.workflow.WorkflowDataFrame
the Fugue WorkflowDataFrame containing all required dataframes
- property dfs: List[str]
All dataframe names (they also appear as part of the column names of data())
- split(weights, seed)[source]
Split the dataset randomly into small partitions. This is useful for algorithms such as Hyperband, which need different subsets to run successive halving with different parameters.
- Parameters
weights (List[float]) – a list of numeric values. The length determines the number of split partitions, and the values represent the proportion of each partition
seed (Any) – random seed for the split
- Returns
a list of sub-datasets
- Return type
List[tune.concepts.dataset.TuneDataset]

# randomly split the data to two partitions 25% and 75%
dataset.split([1, 3], seed=0)
# same because weights will be normalized
dataset.split([10, 30], seed=0)
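The weight normalization mentioned above can be shown with a small sketch (a hypothetical helper, not part of tune — only the ratios between weights matter):

```python
def normalize_weights(weights):
    # convert raw weights into proportions that sum to 1
    total = sum(weights)
    return [w / total for w in weights]

print(normalize_weights([1, 3]))    # [0.25, 0.75]
print(normalize_weights([10, 30]))  # [0.25, 0.75]
```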
- class TuneDatasetBuilder(space, path='')[source]
Bases:
object
Builder of TuneDataset, for details please read TuneDataset Tutorial
- Parameters
space (tune.concepts.space.spaces.Space) – searching space, see Space Tutorial
path (str) – temp path to store serialized dataframe partitions, defaults to “”
- add_df(name, df, how='')[source]
Add a dataframe to the dataset
- Parameters
name (str) – name of the dataframe; it will also create a __tune_df__<name> column in the dataset dataframe
df (fugue.workflow.workflow.WorkflowDataFrame) – the dataframe to add.
how (str) – join type, can accept semi, left_semi, anti, left_anti, inner, left_outer, right_outer, full_outer, cross
- Returns
the builder itself
- Return type
tune.concepts.dataset.TuneDatasetBuilder
Note
For the first dataframe you add, how should be empty. From the second dataframe you add, how must be set.
Note
If df is prepartitioned, the partition key will be used to join with the added dataframes. Read TuneDataset Tutorial for more details
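The rule about how being empty for the first dataframe and required afterwards can be modeled with a toy builder. This is purely illustrative; the real TuneDatasetBuilder performs the joins through Fugue:

```python
class ToyBuilder:
    """Hypothetical sketch of the add_df chaining and join-type rule."""

    def __init__(self):
        self._dfs = []

    def add_df(self, name, df, how=""):
        if not self._dfs:
            # first dataframe: no join type allowed
            assert how == "", "for the first dataframe, how should be empty"
        else:
            # subsequent dataframes: join type is required
            assert how != "", "from the second dataframe, how must be set"
        self._dfs.append((name, df, how))
        return self  # return the builder itself for chaining

builder = ToyBuilder().add_df("train", "df1").add_df("test", "df2", how="cross")
print([name for name, _, _ in builder._dfs])  # ['train', 'test']
```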
- add_dfs(dfs, how='')[source]
Add multiple dataframes with the same join type
- Parameters
dfs (fugue.workflow.workflow.WorkflowDataFrames) – a dictionary-like dataframe collection; the keys will be used as the dataframe names
how (str) – join type, can accept semi, left_semi, anti, left_anti, inner, left_outer, right_outer, full_outer, cross
- Returns
the builder itself
- Return type
tune.concepts.dataset.TuneDatasetBuilder
- build(wf, batch_size=1, shuffle=True, trial_metadata=None)[source]
Build TuneDataset, for details please read TuneDataset Tutorial
- Parameters
wf (fugue.workflow.workflow.FugueWorkflow) – the workflow associated with the dataset
batch_size (int) – how many configurations as a batch, defaults to 1
shuffle (bool) – whether to shuffle the entire dataset, defaults to True. This makes the tuning process more even and the progress look better; it should have a slight benefit on speed and no effect on the result.
trial_metadata (Optional[Dict[str, Any]]) – metadata to pass to each Trial, defaults to None
- Returns
the dataset for tuning
- Return type
tune.concepts.dataset.TuneDataset
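The batch_size parameter groups configurations into batches before they are distributed. Conceptually (a hypothetical helper, not the tune implementation):

```python
def to_batches(configs, batch_size=1):
    # group configurations into fixed-size batches; the last may be smaller
    return [configs[i:i + batch_size]
            for i in range(0, len(configs), batch_size)]

print(to_batches(["c1", "c2", "c3", "c4", "c5"], batch_size=2))
# [['c1', 'c2'], ['c3', 'c4'], ['c5']]
```

With the default batch_size=1, every configuration becomes its own unit of work; larger batches trade scheduling overhead for coarser parallelism.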