core

config

class flatiron.core.config.BaseConfig(**data)[source]

Bases: BaseModel

Base class for flatiron config models. Extra fields are forbidden.

model_config: ClassVar[ConfigDict] = {'extra': 'forbid'}

class flatiron.core.config.CallbacksConfig(**data)[source]

Bases: BaseConfig

Configuration for callbacks.

See: https://thenewflesh.github.io/flatiron/core.html#module-flatiron.core.tools
See: https://www.tensorflow.org/api_docs/python/tf/keras/callbacks/ModelCheckpoint

project

Name of project.

Type:

str

root

Tensorboard parent directory. Default: /mnt/storage.

Type:

str or Path

monitor

Metric to monitor. Default: ‘val_loss’.

Type:

str, optional

verbose

Log callback actions. Default: 0.

Type:

int, optional

save_best_only

Save only best model. Default: False.

Type:

bool, optional

mode

Overwrite best model via mode(old metric, new metric). Options: [auto, min, max]. Default: ‘auto’.

Type:

str, optional

save_weights_only

Only save model weights. Default: False.

Type:

bool, optional

save_freq

Save after each epoch or N batches. Options: ‘epoch’ or int. Default: ‘epoch’.

Type:

str or int, optional

initial_value_threshold

Initial best value of metric. Default: None.

Type:

float, optional

initial_value_threshold: Optional[float]
mode: Annotated[str]
model_config: ClassVar[ConfigDict] = {'extra': 'forbid'}

monitor: str
project: str
root: str
save_best_only: bool
save_freq: Union[str, int]
save_weights_only: bool
verbose: int
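
Example (a hedged sketch; the project name is hypothetical and fields with documented defaults are assumed omissible):

from flatiron.core.config import CallbacksConfig

config = CallbacksConfig(
    project='my-project',    # name of project
    root='/mnt/storage',     # tensorboard parent directory
    save_best_only=True,     # keep only the best checkpoint
)
print(config.model_dump())
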
class flatiron.core.config.DatasetConfig(**data)[source]

Bases: BaseConfig

Configuration for Dataset.

See: https://thenewflesh.github.io/flatiron/core.html#module-flatiron.core.dataset

source

Dataset directory or CSV filepath.

Type:

str

ext_regex

File extension pattern. Default: ‘npy|exr|png|jpeg|jpg|tiff’.

Type:

str, optional

labels

Label channels. Default: None.

Type:

object, optional

label_axis

Label axis. Default: -1.

Type:

int, optional

test_size

Test set size as a proportion. Default: 0.2.

Type:

float, optional

limit

Limit data by number of samples. Default: None.

Type:

str or int, optional

reshape

Reshape concatenated data to incorporate frames as the first dimension: (FRAME, …). Analogous to the first dimension being batch. Default: True.

Type:

bool, optional

shuffle

Randomize data before splitting. Default: True.

Type:

bool, optional

seed

Shuffle seed number. Default: None.

Type:

int, optional

ext_regex: str
label_axis: int
labels: Union[int, str, list[int], list[str], None]
limit: Optional[Annotated[int]]
model_config: ClassVar[ConfigDict] = {'extra': 'forbid'}

reshape: bool
seed: Optional[int]
shuffle: bool
source: str
test_size: Optional[Annotated[float]]
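
Example (a hedged sketch; the source path is hypothetical and fields with documented defaults are assumed omissible):

from flatiron.core.config import DatasetConfig

config = DatasetConfig(
    source='/mnt/data/dataset',  # dataset directory or CSV filepath
    test_size=0.2,               # proportion held out for testing
    labels=[-1],                 # label channels
)
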
class flatiron.core.config.FrameworkConfig(**data)[source]

Bases: BaseModel

Configuration for deep learning framework.

device: str
model_config: ClassVar[ConfigDict] = {}

name: Annotated[str]
class flatiron.core.config.LoggerConfig(**data)[source]

Bases: BaseConfig

Configuration for logger.

See: https://thenewflesh.github.io/flatiron/core.html#module-flatiron.core.logging

slack_channel

Slack channel name. Default: None.

Type:

str, optional

slack_url

Slack URL. Default: None.

Type:

str, optional

slack_methods

Pipeline methods to be logged to Slack. Default: [load, compile, train].

Type:

list[str], optional

timezone

Timezone. Default: UTC.

Type:

str, optional

level

Log level. Default: warn.

Type:

str or int, optional

classmethod _validate_slack_methods(value)[source]
level: str
model_config: ClassVar[ConfigDict] = {'extra': 'forbid'}

slack_channel: Optional[str]
slack_methods: list[str]
slack_url: Optional[str]
timezone: str
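
Example (a hedged sketch; the channel and URL are hypothetical placeholders, and all fields have documented defaults):

from flatiron.core.logging import LoggerConfig  # noqa: placeholder import path
from flatiron.core.config import LoggerConfig

# Slack logging activates only when both slack_channel and slack_url
# are provided
config = LoggerConfig(
    slack_channel='training',
    slack_url='https://hooks.slack.com/services/XXX',
    level='warn',
)
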
class flatiron.core.config.LossConfig(**data)[source]

Bases: BaseModel

Configuration for loss.

name

Name of loss. Default: 'MeanSquaredError'.

Type:

str, optional

model_config: ClassVar[ConfigDict] = {}

name: str
class flatiron.core.config.OptimizerConfig(**data)[source]

Bases: BaseModel

Configuration for optimizer.

See: https://www.tensorflow.org/api_docs/python/tf/keras/optimizers/Optimizer

name

Name of optimizer. Default: 'SGD'.

Type:

str, optional

model_config: ClassVar[ConfigDict] = {}

name: str
class flatiron.core.config.PipelineConfig(**data)[source]

Bases: BaseConfig

Configuration for PipelineBase classes.

See: https://thenewflesh.github.io/flatiron/core.html#module-flatiron.core.pipeline

framework

Deep learning framework config.

Type:

dict

dataset

Dataset configuration.

Type:

dict

optimizer

Optimizer configuration.

Type:

dict

loss

Loss configuration.

Type:

dict

metrics

Metric dicts. Default: [dict(name='Mean')].

Type:

list[dict], optional

compile

Compile configuration.

Type:

dict

callbacks

Callbacks configuration.

Type:

dict

logger

Logger configuration.

Type:

dict

train

Train configuration.

Type:

dict

classmethod _validate_metrics(items)[source]
callbacks: CallbacksConfig
dataset: DatasetConfig
framework: FrameworkConfig
logger: LoggerConfig
loss: LossConfig
metrics: list[dict[str, Any]]
model_config: ClassVar[ConfigDict] = {'extra': 'forbid'}

optimizer: OptimizerConfig
train: TrainConfig
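
Example (a hedged sketch of the nested structure; values mirror the generate_config defaults and the listed fields are assumed sufficient):

from flatiron.core.config import PipelineConfig

config = PipelineConfig.model_validate(dict(
    framework=dict(name='torch', device='cuda'),
    dataset=dict(source='/mnt/data/dataset'),
    optimizer=dict(name='SGD'),
    loss=dict(name='CrossEntropyLoss'),
    metrics=[dict(name='MeanMetric')],
    callbacks=dict(project='project-name', root='/tensorboard/parent/dir'),
    logger=dict(),
    train=dict(batch_size=32, epochs=30),
))
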
class flatiron.core.config.TrainConfig(**data)[source]

Bases: BaseConfig

Configuration for calls to model train function.

See: https://www.tensorflow.org/api_docs/python/tf/keras/Model#fit

batch_size

Number of samples per update. Default: 32.

Type:

int, optional

epochs

Number of epochs to train model. Default: 30.

Type:

int, optional

verbose

Verbosity of model logging. Options: ‘auto’, 0, 1, 2. 0 is silent. 1 is progress bar. 2 is one line per epoch. Auto is usually 1. Default: auto.

Type:

str or int, optional

validation_split

Fraction of training data to use for validation. Default: 0.

Type:

float, optional

seed

Seed value. Default: 42.

Type:

int, optional

shuffle

Shuffle training data per epoch. Default: True.

Type:

bool, optional

initial_epoch

Epoch at which to start training (useful for resuming a previous training run). Default: 1.

Type:

int, optional

validation_freq

Number of training epochs before new validation. Default: 1.

Type:

int, optional

batch_size: int
epochs: int
initial_epoch: int
model_config: ClassVar[ConfigDict] = {'extra': 'forbid'}

seed: int
shuffle: bool
validation_freq: int
validation_split: float
verbose: Union[str, int]
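
Example (a hedged sketch; every listed field has a documented default, so any subset is assumed valid):

from flatiron.core.config import TrainConfig

config = TrainConfig(
    batch_size=64,         # samples per update
    epochs=10,             # passes over the training data
    validation_split=0.1,  # fraction of training data used for validation
)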

dataset

class flatiron.core.dataset.Dataset(info, ext_regex='npy|exr|png|jpeg|jpg|tiff', calc_file_size=True, labels=None, label_axis=-1)[source]

Bases: object

__getitem(frame)

Get data by frame. This is needed to avoid recursion errors when overloading __getitem__.

Raises:

IndexError – If frame is missing or multiple frames were found.

Returns:

Data of given frame.

Return type:

object

__init__(info, ext_regex='npy|exr|png|jpeg|jpg|tiff', calc_file_size=True, labels=None, label_axis=-1)[source]

Construct a Dataset instance. If labels is an integer, it is assumed to be the axis upon which the data will be split.

Parameters:
  • info (pd.DataFrame) – Info DataFrame.

  • ext_regex (str, optional) – File extension pattern. Default: ‘npy|exr|png|jpeg|jpg|tiff’.

  • calc_file_size (bool, optional) – Calculate file size in GB. Default: True.

  • labels (object, optional) – Label channels. Default: None.

  • label_axis (int, optional) – Label axis. Default: -1.

Raises:
  • EnforceError – If info is not an instance of DataFrame.

  • EnforceError – If required columns not found in info.

static _get_stats(info)[source]

Creates table of statistics from given info DataFrame.

Parameters:

info (pd.DataFrame) – Info DataFrame.

Returns:

Stats DataFrame.

Return type:

pd.DataFrame

_read_file(filepath)[source]

Read given file.

Parameters:

filepath (str) – Filepath.

Raises:

IOError – If extension is not supported.

Returns:

File content.

Return type:

object

_read_file_as_array(filepath)[source]

Read file as numpy array.

Parameters:

filepath (str) – Filepath.

Returns:

Array.

Return type:

np.ndarray

static _resolve_limit(limit)[source]

Resolves a given limit into a number of samples and limit type.

Parameters:

limit (str, int, None) – Limit descriptor.

Returns:

Number of samples and limit type.

Return type:

tuple[int, str]

property asset_name: str

Returns: str: Asset name of Dataset.

property asset_path: str

Returns: str: Asset path of Dataset.

property filepaths: list[str]

Returns: list[str]: Filepaths sorted by frame.

get_arrays(frame)[source]

Get data and convert into numpy arrays according to labels.

Parameters:

frame (int) – Frame.

Raises:

IndexError – If frame is missing or multiple frames were found.

Returns:

List of arrays from the given frame.

Return type:

list[np.ndarray]

get_filepath(frame)[source]

Get filepath of given frame.

Raises:

IndexError – If frame is missing or multiple frames were found.

Returns:

Filepath of given frame.

Return type:

str

property info: DataFrame

Returns: DataFrame: Copy of info DataFrame.

load(limit=None, shuffle=False, reshape=True)[source]

Load data from files.

Parameters:
  • limit (str or int, optional) – Limit data by number of samples or memory size. Default: None.

  • shuffle (bool, optional) – Shuffle frames before loading. Default: False.

  • reshape (bool, optional) – Reshape concatenated data to incorporate frames as the first dimension: (FRAME, …). Analogous to the first dimension being batch. Default: True.

Returns:

self.

Return type:

Dataset

classmethod read_csv(filepath, **kwargs)[source]

Construct Dataset instance from given csv filepath.

Parameters:

filepath (str or Path) – Info CSV filepath.

Raises:

EnforceError – If filepath does not exist or is not a CSV.

Returns:

Dataset instance.

Return type:

Dataset

classmethod read_directory(directory, **kwargs)[source]

Construct dataset from directory.

Parameters:

directory (str or Path) – Dataset directory.

Raises:
  • EnforceError – If directory does not exist.

  • EnforceError – If more or less than 1 CSV file found in directory.

Returns:

Dataset instance.

Return type:

Dataset

property stats: DataFrame

Generates a table of statistics of info data.

Metrics include:

  • min

  • max

  • mean

  • std

  • loaded

  • total

Units include:

  • gb

  • frame

  • sample

Returns:

Table of statistics.

Return type:

DataFrame

train_test_split(test_size=0.2, limit=None, shuffle=True, seed=None)[source]

Split into train and test Datasets.

Parameters:
  • test_size (float, optional) – Test set size as a proportion. Default: 0.2.

  • limit (int, optional) – Limit the total length of train and test. Default: None.

  • shuffle (bool, optional) – Randomize data before splitting. Default: True.

  • seed (int, optional) – Seed number. Default: None.

Returns:

Train Dataset, Test Dataset.

Return type:

tuple[Dataset, Dataset]

unload()[source]

Delete self.data and reset self.info.

Returns:

self.

Return type:

Dataset

xy_split()[source]

Split data into x and y arrays, according to self.labels as the split index and self.label_axis as the split axis.

Raises:
  • EnforceError – If data has not been loaded.

  • EnforceError – If self.labels is not a list of a single integer.

Returns:

x and y arrays.

Return type:

tuple[np.ndarray, np.ndarray]
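
Example (a hedged sketch of the documented workflow; the directory path is hypothetical and read_directory is assumed to forward keyword arguments to the constructor):

from flatiron.core.dataset import Dataset

dataset = Dataset.read_directory('/mnt/data/dataset', labels=[-1])
train, test = dataset.train_test_split(test_size=0.2, shuffle=True)
train.load(reshape=True)             # read file contents into memory
x_train, y_train = train.xy_split()  # split on labels / label_axis
train.unload()                       # free memory when done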

logging

class flatiron.core.logging.SlackLogger(message, config, slack_channel=None, slack_url=None, timezone='UTC', level='warn', **kwargs)[source]

Bases: LogRuntime

SlackLogger is a class for logging information to stdout and Slack.

__init__(message, config, slack_channel=None, slack_url=None, timezone='UTC', level='warn', **kwargs)[source]

SlackLogger is a class for logging information to stdout and Slack.

If slack_url and slack_channel are specified, SlackLogger will attempt to log custom formatted output to Slack.

Parameters:
  • message (str) – Log message or Slack title.

  • config (dict) – Config dict.

  • slack_channel (str, optional) – Slack channel name. Default: None.

  • slack_url (str, optional) – Slack URL. Default: None.

  • timezone (str, optional) – Timezone. Default: UTC.

  • level (str or int, optional) – Log level. Default: warn.

  • **kwargs (optional) – LogRuntime kwargs.
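
Example (a hedged sketch; with slack_channel and slack_url left unset, output is assumed to go to stdout only):

from flatiron.core.logging import SlackLogger

logger = SlackLogger(
    'train',                         # log message or Slack title
    dict(epochs=30, batch_size=32),  # config dict
    timezone='UTC',
    level='warn',
)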

multidataset

class flatiron.core.multidataset.MultiDataset(datasets)[source]

Bases: object

This class combines a dictionary of Dataset instances into a single dataset. Datasets are merged by frame.

__init__(datasets)[source]

Constructs a MultiDataset instance.

Parameters:

datasets (dict[str, Dataset]) – Dictionary of Dataset instances.

get_arrays(frame)[source]

For each dataset, get data and convert into numpy arrays according to labels.

Parameters:

frame (int) – Frame.

Raises:

IndexError – If frame is missing or multiple frames were found.

Returns:

Dict where values are lists of arrays from the given frame.

Return type:

dict

get_filepaths(frame)[source]

For each dataset, get filepath of given frame.

Returns:

Dict where values are filepaths of the given frame.

Return type:

dict

property info: DataFrame

Returns: DataFrame: Copy of info DataFrame.

load(limit=None, reshape=True)[source]

For each dataset, load data from files.

Parameters:
  • limit (str or int, optional) – Limit data by number of samples or memory size. Default: None.

  • reshape (bool, optional) – Reshape concatenated data to incorporate frames as the first dimension: (FRAME, …). Analogous to the first dimension being batch. Default: True.

Returns:

self.

Return type:

MultiDataset

train_test_split(test_size=0.2, limit=None, shuffle=True, seed=None)[source]

Split into train and test MultiDatasets.

Parameters:
  • test_size (float, optional) – Test set size as a proportion. Default: 0.2.

  • limit (int, optional) – Limit the total length of train and test. Default: None.

  • shuffle (bool, optional) – Randomize data before splitting. Default: True.

  • seed (int, optional) – Seed number. Default: None.

Returns:

Train MultiDataset, Test MultiDataset.

Return type:

tuple[MultiDataset, MultiDataset]

unload()[source]

For each dataset, delete self.data and reset self.info.

Returns:

self.

Return type:

MultiDataset

xy_split()[source]

For each dataset, split data into x and y arrays, according to self.labels as the split index and self.label_axis as the split axis.

Raises:
  • EnforceError – If data has not been loaded.

  • EnforceError – If self.labels is not a list of a single integer.

Returns:

Dict where values are x and y arrays.

Return type:

dict
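
Example (a hedged sketch; the dataset keys and paths are hypothetical):

from flatiron.core.dataset import Dataset
from flatiron.core.multidataset import MultiDataset

multi = MultiDataset(dict(
    rgb=Dataset.read_directory('/mnt/data/rgb'),
    depth=Dataset.read_directory('/mnt/data/depth'),
))
train, test = multi.train_test_split(test_size=0.2)
train.load()
arrays = train.xy_split()  # dict where values are x and y arrays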

pipeline

class flatiron.core.pipeline.PipelineBase(config)[source]

Bases: ABC

__init__(config)[source]

PipelineBase is a base class for machine learning pipelines.

Parameters:

config (dict) – PipelineBase config.

property _engine: Any

Uses config to retrieve flatiron engine subpackage.

Returns:

flatiron.tf or flatiron.torch

Return type:

Any

_logger(method, message, config)[source]

Retrieves a logger given a method name, message, and config.

Parameters:
  • method (str) – Name of method calling logger.

  • message (str) – Log message or Slack title.

  • config (dict) – Config dict.

Returns:

Configured logger instance.

Return type:

ficl.SlackLogger

build()[source]

Build machine learning model and assign it to self.model. Calls self.model_func with model params.

Returns:

Self.

Return type:

PipelineBase

compile()[source]

Sets self._compiled to a dictionary of compiled objects.

Returns:

Self.

Return type:

PipelineBase

classmethod from_string(text)[source]

Construct PipelineBase instance from given YAML text.

Parameters:

text (str) – YAML text.

Returns:

PipelineBase instance.

Return type:

PipelineBase

classmethod generate_config(framework='torch', project='project-name', callback_root='/tensorboard/parent/dir', dataset='/mnt/data/dataset', optimizer='SGD', loss='CrossEntropyLoss', metrics=['MeanMetric'])[source]

Prints a generated pipeline config based on given parameters.

Parameters:
  • framework (str) – Framework name. Default: torch.

  • project (str) – Project name. Default: project-name.

  • callback_root (str) – Callback root path. Default: /tensorboard/parent/dir.

  • dataset (str) – Dataset path. Default: /mnt/data/dataset.

  • optimizer (str) – Optimizer name. Default: SGD.

  • loss (str) – Loss name. Default: CrossEntropyLoss.

  • metrics (list[str]) – Metric names. Default: [‘MeanMetric’].

Return type:

None

load()[source]

Loads train and test datasets into memory. Calls load on self._train_data and self._test_data.

Raises:

RuntimeError – If train and test data are not datasets.

Returns:

Self.

Return type:

PipelineBase

abstract model_config()[source]

Subclasses of PipelineBase will need to define a config class for models created in the build method.

Returns:

Pydantic BaseModel config class.

Return type:

BaseModel

abstract model_func()[source]

Subclasses of PipelineBase need to define a function that builds and returns a machine learning model.

Returns:

Machine learning model.

Return type:

object

classmethod read_yaml(filepath)[source]

Construct PipelineBase instance from given yaml file.

Parameters:

filepath (str or Path) – YAML file.

Returns:

PipelineBase instance.

Return type:

PipelineBase

run()[source]

Run the following pipeline operations:

  • build

  • compile

  • train_test_split

  • load (for tensorflow only)

  • train

Returns:

Self.

Return type:

PipelineBase

train()[source]

Call model train function with params.

Returns:

Self.

Return type:

PipelineBase

train_test_split()[source]

Split dataset into train and test sets.

Assigns the following instance members:

  • _train_data

  • _test_data

Returns:

Self.

Return type:

PipelineBase

unload()[source]

Unload train and test datasets from memory. Calls unload on self._train_data and self._test_data.

Raises:
  • RuntimeError – If train and test data are not datasets.

  • RuntimeError – If train and test data are not loaded.

Returns:

Self.

Return type:

PipelineBase
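
Example (a hedged sketch of the subclassing contract; MyModelConfig, get_my_model, and the YAML path are hypothetical, and whether model_func returns the model or its builder function is an assumption here):

from pydantic import BaseModel
import flatiron.core.pipeline as ficp

class MyModelConfig(BaseModel):
    hidden_units: int = 128  # hypothetical model parameter

def get_my_model(hidden_units=128):
    raise NotImplementedError  # hypothetical model builder

class MyPipeline(ficp.PipelineBase):
    def model_config(self):
        return MyModelConfig

    def model_func(self):
        return get_my_model

pipeline = MyPipeline.read_yaml('/path/to/pipeline.yaml')
pipeline.run()  # build, compile, train_test_split, load, train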

resolve

flatiron.core.resolve._generate_config(framework='torch', project='project-name', callback_root='/tensorboard/parent/dir', dataset='/mnt/data/dataset', optimizer='SGD', loss='CrossEntropyLoss', metrics=['MeanMetric'])[source]

Generate a pipeline config based on given parameters.

Parameters:
  • framework (str) – Framework name. Default: torch.

  • project (str) – Project name. Default: project-name.

  • callback_root (str) – Callback root path. Default: /tensorboard/parent/dir.

  • dataset (str) – Dataset path. Default: /mnt/data/dataset.

  • optimizer (str) – Optimizer name. Default: SGD.

  • loss (str) – Loss name. Default: CrossEntropyLoss.

  • metrics (list[str]) – Metric names. Default: [‘MeanMetric’].

Returns:

Generated config.

Return type:

dict

flatiron.core.resolve._resolve_field(config, field)[source]

Resolve and validate given pipeline config field.

Parameters:
  • config (dict) – Pipeline config.

  • field (str) – Config field name.

Returns:

Updated pipeline config.

Return type:

dict

flatiron.core.resolve._resolve_model(config, model)[source]

Resolve and validate given model config.

Parameters:
  • config (dict) – Model config.

  • model (BaseModel) – Model config class.

Returns:

Validated model config.

Return type:

dict

flatiron.core.resolve._resolve_pipeline(config)[source]

Resolve and validate given pipeline config.

Parameters:

config (dict) – Pipeline config.

Returns:

Validated pipeline config.

Return type:

dict

flatiron.core.resolve._resolve_subconfig(subconfig, class_prefix, prepend, config_module, other_module)[source]

For use in _resolve_field. Resolves and validates the given subconfig. If the class is not a custom definition found in the config module or the other module, a standard definition will be resolved from the config module. class_prefix and prepend are used to modify the config name field in order to make it a valid class name.

Parameters:
  • subconfig (dict) – Subconfig.

  • class_prefix (str) – Class prefix.

  • prepend (bool) – Prepend class prefix.

  • config_module (str) – Module name.

  • other_module (str) – Module name.

Returns:

Validated subconfig.

Return type:

dict

flatiron.core.resolve.resolve_config(config, model)[source]

Resolves given Pipeline config. Config fields include:

  • framework

  • model

  • dataset

  • optimizer

  • loss

  • metrics

  • callbacks

  • train

  • logger

Parameters:
  • config (dict) – Config dict.

  • model (BaseModel) – Model config class.

Returns:

Resolved config.

Return type:

dict

tools

flatiron.core.tools.enforce_callbacks(log_directory, checkpoint_pattern)[source]

Enforces callback parameters.

Parameters:
  • log_directory (str or Path) – Tensorboard project log directory.

  • checkpoint_pattern (str) – Filepath pattern for checkpoint callback.

Raises:
  • EnforceError – If log directory does not exist.

  • EnforceError – If checkpoint pattern does not contain ‘{epoch}’.

Return type:

None

flatiron.core.tools.enforce_getter(value)[source]

Enforces value is a dict with a name key.

Parameters:

value (dict) – Dict.

Raises:

EnforceError – If value is not a dict with a name key.

Return type:

None

flatiron.core.tools.get_module(name)[source]

Get a module from a given name.

Parameters:

name (str) – Module name.

Raises:

NotImplementedError – If module is not found.

Returns:

Module.

Return type:

object

flatiron.core.tools.get_module_class(name, module)[source]

Get a class from a given module.

Parameters:
  • name (str) – Class name.

  • module (str) – Module name.

Raises:

NotImplementedError – If class is not found in module.

Returns:

Module class.

Return type:

class

flatiron.core.tools.get_module_function(name, module)[source]

Get a function from a given module.

Parameters:
  • name (str) – Function name.

  • module (str) – Module name.

Raises:

NotImplementedError – If function is not found in module.

Returns:

Module function.

Return type:

function

flatiron.core.tools.get_tensorboard_project(project, root='/mnt/storage', timezone='UTC', extension='keras')[source]

Creates directory structure for Tensorboard project.

Parameters:
  • project (str) – Name of project.

  • root (str or Path) – Tensorboard parent directory. Default: /mnt/storage.

  • timezone (str, optional) – Timezone. Default: UTC.

  • extension (str, optional) – File extension. Options: [keras, safetensors]. Default: keras.

Raises:

EnforceError – If extension is not keras, pth or safetensors.

Returns:

Project details.

Return type:

dict

flatiron.core.tools.is_custom_definition(config, module)[source]

Determines whether config refers to custom-defined code.

Parameters:
  • config (dict) – Instance config.

  • module (str) – Always __name__.

Raises:

EnforceError – If config is not a dict with a name key.

Returns:

True if config refers to custom-defined code.

Return type:

bool

flatiron.core.tools.pad_layer_name(name, length=18)[source]

Pads underscores in a given layer name to make the string achieve a given length.

Parameters:
  • name (str) – Layer name to be padded.

  • length (int) – Length of output string. Default: 18.

Returns:

Padded layer name.

Return type:

str
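
Example (a hedged sketch; where the underscores are inserted is an assumption, the documented contract is only the output length):

import flatiron.core.tools as fict

name = fict.pad_layer_name('conv_1', length=18)
assert len(name) == 18  # documented contract: output length is 18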

flatiron.core.tools.resolve_kwargs(kwargs, engine, optimizer, return_type='both')[source]

Filters keyword arguments based on prefix and returns them with the prefix removed.

Parameters:
  • kwargs (dict) – Kwargs dict.

  • engine (str) – Deep learning framework.

  • optimizer (str) – Optimizer name.

  • return_type (str, optional) – Which kind of keys to return. Options: [prefixed, unprefixed, both]. Default: both.

Returns:

Resolved kwargs.

Return type:

dict

flatiron.core.tools.resolve_module_config(config, module)[source]

Given a config and a module, returns a validated dict.

Parameters:
  • config (dict) – Instance config.

  • module (str) – Always __name__.

Raises:

EnforceError – If config is not a dict with a name key.

Returns:

Resolved config dict.

Return type:

dict

flatiron.core.tools.slack_it(title, channel, url, config=None, stopwatch=None, timezone='UTC', suppress=False)[source]

Compose a message from given arguments and post it to slack.

Parameters:
  • title (str) – Post title.

  • channel (str) – Slack channel.

  • url (str) – Slack URL.

  • config (dict, optional) – Parameter dict. Default: None.

  • stopwatch (StopWatch, optional) – StopWatch instance. Default: None.

  • timezone (str, optional) – Timezone. Default: UTC.

  • suppress (bool, optional) – Return message, rather than post it to Slack. Default: False.

Returns:

Slack response.

Return type:

HTTPResponse

flatiron.core.tools.train_test_split(data, test_size=0.2, shuffle=True, seed=None, limit=None)[source]

Split DataFrame into train and test DataFrames.

Parameters:
  • data (pd.DataFrame) – DataFrame.

  • test_size (float, optional) – Test set size as a proportion. Default: 0.2.

  • shuffle (bool, optional) – Randomize data before splitting. Default: True.

  • seed (int, optional) – Seed number. Default: None.

  • limit (int, optional) – Limit the total length of train and test. Default: None.

Raises:
  • EnforceError – If data is not a DataFrame.

  • EnforceError – If test_size is not between 0 and 1.

Returns:

Train and test DataFrames.

Return type:

tuple[pd.DataFrame, pd.DataFrame]
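
Example (a hedged sketch using a toy DataFrame):

import pandas as pd
import flatiron.core.tools as fict

data = pd.DataFrame(dict(frame=range(100)))
train, test = fict.train_test_split(data, test_size=0.2, seed=42)
print(len(train), len(test))  # expected proportions: 80 and 20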

flatiron.core.tools.unindent(text, spaces=4)[source]

Unindents given block of text according to given number of spaces.

Parameters:
  • text (str) – Text block to unindent.

  • spaces (int, optional) – Number of spaces to remove. Default: 4.

Returns:

Unindented text.

Return type:

str
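
Example (a hedged sketch; the exact output assumes each line simply loses the given number of leading spaces):

import flatiron.core.tools as fict

text = '    a\n        b'
print(fict.unindent(text, spaces=4))  # assumed result: 'a\n    b'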

validators

flatiron.core.validators.is_base_two(number)[source]

Validates that number is a power of two.

Parameters:

number (int) – Number.

Raises:

ValueError – If number is not a power of two.

Returns:

Input number.

Return type:

int
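
Example (a hedged sketch; assumes the power-of-two reading of "base two", and that validators return their input on success):

import flatiron.core.validators as ficv

assert ficv.is_base_two(8) == 8  # valid input is returned
try:
    ficv.is_base_two(6)
except ValueError:
    pass  # invalid input raises ValueError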

flatiron.core.validators.is_callback_mode(mode)[source]

Validates that mode is a legal callback mode.

Parameters:

mode (str) – Callback mode.

Raises:

ValueError – If mode type is not legal.

Returns:

Input callback mode.

Return type:

str

flatiron.core.validators.is_engine(engine)[source]

Validates that engine is a legal deep learning framework.

Parameters:

engine (str) – Deep learning framework.

Raises:

ValueError – If engine is not legal.

Returns:

Input engine.

Return type:

str

flatiron.core.validators.is_even(number)[source]

Validates that number is even.

Parameters:

number (int) – Number.

Raises:

ValueError – If number is not even.

Returns:

Input number.

Return type:

int

flatiron.core.validators.is_odd(number)[source]

Validates that number is odd.

Parameters:

number (int) – Number.

Raises:

ValueError – If number is not odd.

Returns:

Input number.

Return type:

int

flatiron.core.validators.is_padding(pad_type)[source]

Validates that pad_type is a legal padding type.

Parameters:

pad_type (str) – Padding type.

Raises:

ValueError – If padding type is not legal.

Returns:

Input padding type.

Return type:

str

flatiron.core.validators.is_pipeline_method(method)[source]

Validates that method is a legal pipeline method.

Parameters:

method (str) – Pipeline method.

Raises:

ValueError – If method is not legal.

Returns:

Input pipeline method.

Return type:

str