Input function package

input_fn.any_record_exists(split, record_dir, record_pattern)[source]

Check whether any record matching the given pattern exists

Parameters:
  • split (str) – Split descriptor (“train”, “eval”, “test”)
  • record_dir (str) – Directory where records are contained
  • record_pattern (str) – Pattern for record files
Returns:

True, if at least one record with record_pattern exists in record_dir.

Return type:

bool
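
The check described above can be sketched with the standard library. This is a minimal illustration, not the package's implementation: the name any_record_exists_sketch and the directory layout record_dir/<split>/<record_pattern> are assumptions, since the documentation does not say how split is combined with the other two arguments.

```python
import glob
import os
import tempfile


def any_record_exists_sketch(split, record_dir, record_pattern):
    """Return True if at least one file matches the pattern for the given split."""
    # ASSUMPTION: records for a split are stored in record_dir/<split>/.
    pattern = os.path.join(record_dir, split, record_pattern)
    return len(glob.glob(pattern)) > 0


# Toy usage on a temporary directory holding a single "train" record.
with tempfile.TemporaryDirectory() as tmp_dir:
    os.makedirs(os.path.join(tmp_dir, "train"))
    open(os.path.join(tmp_dir, "train", "part-0.tfrecord"), "w").close()
    found_train = any_record_exists_sketch("train", tmp_dir, "*.tfrecord")
    found_eval = any_record_exists_sketch("eval", tmp_dir, "*.tfrecord")
```

With this layout, only the "train" split reports an existing record.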

input_fn.dataset_from_records(split, record_dir, record_pattern, random_seed)[source]

Load all records as a file list

Parameters:
  • split (str) – Split descriptor (“train”, “eval”, “test”)
  • record_dir (str) – Directory where records are contained
  • record_pattern (str) – Pattern for record files
  • random_seed – Seed that controls randomness, for reproducibility
Returns:

Dataset whose elements are record file name strings

Return type:

dataset (tf.data.Dataset)
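
One way such a file-list dataset could be built is with tf.data.Dataset.list_files, sketched below. The function name dataset_from_records_sketch and the layout record_dir/<split>/<record_pattern> are assumptions made for illustration; the package may assemble the file pattern differently.

```python
import os
import tempfile

import tensorflow as tf


def dataset_from_records_sketch(split, record_dir, record_pattern, random_seed=None):
    """Return a dataset of file names matching the pattern for the given split."""
    # ASSUMPTION: records for a split are stored in record_dir/<split>/.
    file_pattern = os.path.join(record_dir, split, record_pattern)
    # list_files yields the matching file names; the seed fixes the shuffle order.
    return tf.data.Dataset.list_files(file_pattern, shuffle=True, seed=random_seed)


# Toy usage: two empty record files in a temporary "train" directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    os.makedirs(os.path.join(tmp_dir, "train"))
    for name in ("a.tfrecord", "b.tfrecord"):
        open(os.path.join(tmp_dir, "train", name), "w").close()
    file_dataset = dataset_from_records_sketch("train", tmp_dir, "*.tfrecord", random_seed=0)
    listed_names = sorted(
        os.path.basename(path.decode("utf-8"))
        for path in file_dataset.as_numpy_iterator()
    )
```

The resulting dataset holds file names only; reading the records themselves would be a later step.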

input_fn.input_fn(split, batch_size, buffer_size, num_parallel, compression, random_seed)[source]

Generic input function for use with tf.estimator.Estimator

Returns an input_fn as consumed by, e.g., Estimator.train(input_fn())

Parameters:
  • split (str) – Split descriptor (“train”, “eval”, “test”)
  • batch_size (int) – Batch size
  • buffer_size (int) – Number of objects that are buffered and prefetched
  • num_parallel (int) – Number of parallel threads for map and the RecordReader
  • compression (str) – Which compression was used during serialization (“NONE”, “GZIP”, “ZLIB”)
  • random_seed (int) – Seed that controls randomness, for reproducibility
Ingredient functions:
  • map_fn() – from input_fn_ingred, defines parsing of features
Returns:

Input function as consumed by Estimator.train/evaluate/predict

Return type:

input_fn (function)

input_fn.map_fn(feature_keys, label_keys, keys_to_parsers, keys_to_handlers)[source]

Wrapper for batch parsing

Parameters:
  • feature_keys (list of str) – Keys which should be included in feature_dict
  • label_keys (list of str) – Keys which should be included as label_dict
  • keys_to_parsers (dict) – Mapping of keys to parser instances
  • keys_to_handlers (dict) – Mapping of keys to handler instances
Returns:

Function which maps a serial batch to (features, labels)

Return type:

fn (function)
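
The wrapper pattern described here, a function that returns another function mapping a batch to (features, labels), can be sketched in plain Python. The interfaces of the parser and handler instances are not documented above, so this sketch assumes both are simple callables; map_fn_sketch and the toy data are hypothetical.

```python
def map_fn_sketch(feature_keys, label_keys, keys_to_parsers, keys_to_handlers):
    """Return a function that maps a serialized batch to (features, labels)."""

    def fn(serial_batch):
        # ASSUMPTION: each parser is a callable extracting its key's values from the batch.
        parsed = {key: parser(serial_batch) for key, parser in keys_to_parsers.items()}
        # ASSUMPTION: each handler is a callable that post-processes the parsed values;
        # keys without a handler are passed through unchanged.
        handled = {
            key: keys_to_handlers[key](value) if key in keys_to_handlers else value
            for key, value in parsed.items()
        }
        features = {key: handled[key] for key in feature_keys}
        labels = {key: handled[key] for key in label_keys}
        return features, labels

    return fn


# Toy usage: the "batch" is a list of dicts instead of serialized records.
toy_batch = [{"x": 1, "y": 0}, {"x": 2, "y": 1}]
toy_fn = map_fn_sketch(
    feature_keys=["x"],
    label_keys=["y"],
    keys_to_parsers={
        "x": lambda batch: [record["x"] for record in batch],
        "y": lambda batch: [record["y"] for record in batch],
    },
    keys_to_handlers={"x": lambda values: [float(v) for v in values]},
)
features, labels = toy_fn(toy_batch)
```

Returning a closure keeps the key lists and parser mappings fixed, so the result can be handed to a dataset's map step as a single-argument function.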

input_fn.shuffle_repeat_prefetch(dataset, buffer_size, n_epochs, random_seed)[source]

Wrapper for shuffle+repeat+prefetch

Shuffle, repeat, and prefetch a dataset

Parameters:
  • dataset (tf.data.Dataset) – Dataset to be processed
  • buffer_size (int) – Number of objects to be buffered and prefetched
  • n_epochs (int) – Number of dataset repetitions
  • random_seed – Seed that controls randomness, for reproducibility
Returns:

Processed dataset

Return type:

dataset (tf.data.Dataset)
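
A minimal sketch of such a wrapper, chaining the three standard tf.data transformations, is given below. It assumes the operations are applied in the order the name suggests and that buffer_size is used for both the shuffle buffer and the prefetch buffer; the actual implementation may differ, and shuffle_repeat_prefetch_sketch is a hypothetical name.

```python
import tensorflow as tf


def shuffle_repeat_prefetch_sketch(dataset, buffer_size, n_epochs, random_seed=None):
    """Shuffle, repeat, and prefetch a dataset (order of steps is an assumption)."""
    # Shuffle with a fixed seed so the element order is reproducible.
    dataset = dataset.shuffle(buffer_size, seed=random_seed)
    # Repeat the shuffled dataset for the requested number of epochs.
    dataset = dataset.repeat(n_epochs)
    # Prefetch so the next elements are prepared while the current ones are consumed.
    dataset = dataset.prefetch(buffer_size)
    return dataset


# Toy usage: five elements repeated for two epochs.
toy_dataset = shuffle_repeat_prefetch_sketch(
    tf.data.Dataset.range(5), buffer_size=5, n_epochs=2, random_seed=0
)
toy_values = [int(value) for value in toy_dataset.as_numpy_iterator()]
```

Shuffling before repeating keeps epoch boundaries intact: every element appears once per epoch before the next epoch begins.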