Utils package

class utils.PrintOnce(message)[source]

Bases: object

Utility to print a message only once

Parameters:message – Formattable string
print(*args, **kwargs)[source]
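
The following is a minimal sketch of how such a print-once helper might behave, not the package's actual implementation. It assumes that print() fills the message via str.format and that later calls are silently ignored. The file argument is a hypothetical addition, used here only so the output can be captured.

```python
import io
import sys


class PrintOnceSketch:
    """Print a formattable message the first time print() is called."""

    def __init__(self, message, file=None):
        self.message = message
        self.file = file  # hypothetical parameter, not in the documented signature
        self._printed = False

    def print(self, *args, **kwargs):
        # Every call after the first one is ignored.
        if self._printed:
            return
        out = self.file if self.file is not None else sys.stdout
        print(self.message.format(*args, **kwargs), file=out)
        self._printed = True


buffer = io.StringIO()
warn = PrintOnceSketch("Skipping {} of {} records", file=buffer)
warn.print(3, 10)
warn.print(5, 10)  # ignored: the message was already printed
```

After both calls the buffer holds the formatted message exactly once.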
utils.aggregate(df, class_to_filter, group_key=None, builder=None)[source]

Aggregate class samples from dataframe

Group rows of df by group_key, and use class filters to annotate each row. If builder is provided, group annotated rows again by class and group_key, and apply builder.

Parameters:
  • df – Dataframe
  • class_to_filter (dict) – Mapping of class names to filter functions. Each filter function takes a Dataframe as input and returns a bool that indicates whether the sample belongs to the class.
  • group_key (str) – Which columns to use for grouping.
  • builder – Function that accepts Dataframe, and returns a processed Dataframe.
Returns:

Aggregated Dataframe, where each row represents one sample. The column “class” indicates the class of the sample.

Return type:

Dataframe

utils.approx_square(x)[source]

Calculate even grid height and width

Assumes that x is a power of 2!

Parameters:x (int) – Assumed to be a power of 2
Returns:
[h, w] such that x = h * w, with h and w as close to sqrt(x) as
possible
Return type:list

Examples

>>> approx_square(32)
[4, 8]
>>> approx_square(64)
[8, 8]
>>> approx_square(128)
[8, 16]
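
One way to obtain the grid shown in the examples is to split the exponent of x between height and width. This is a sketch under that assumption; the name approx_square_sketch is hypothetical and the real function may be implemented differently.

```python
def approx_square_sketch(x):
    """Return [h, w] with h * w == x, both powers of 2 and h <= w."""
    if x < 1 or (x & (x - 1)) != 0:
        raise ValueError("x must be a power of 2")
    exponent = x.bit_length() - 1  # x == 2**exponent
    # Give the smaller half of the exponent to h, the rest to w.
    h = 2 ** (exponent // 2)
    w = 2 ** (exponent - exponent // 2)
    return [h, w]


grid_32 = approx_square_sketch(32)
grid_64 = approx_square_sketch(64)
grid_128 = approx_square_sketch(128)
```

For odd exponents the width receives the extra factor of 2, which matches the documented results for 32 and 128.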
utils.build_pairs(df, cond=<function <lambda>>, no_duplicate=[])[source]

Cartesian Product of dataframe

Produce all combinations of df x df where cond is true.

Parameters:
  • df – Dataframe
  • cond – function with two arguments, returning bool
  • no_duplicate (str) – Keys which should not be duplicated.
Returns:

All pairs of df x df for which cond was true.

Return type:

DataFrame

Examples

>>> df = pd.DataFrame({"age": [1,2,3], "gender": ["m", "f", "m"]})
>>> cond = lambda x,y: y["age"].values > x["age"].values
>>> build_pairs(df, cond)
  age_0 gender_0 age_1 gender_1
0     1        m     2        f
1     1        m     3        m
2     2        f     3        m
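
The pairing logic can be illustrated without pandas. The sketch below models rows as plain dicts (an assumption made for brevity), so cond receives two dicts rather than two Dataframes. The no_duplicate argument is left out because its exact behavior is not described here.

```python
from itertools import product


def build_pairs_sketch(rows, cond=lambda x, y: True):
    """Return all ordered row pairs (x, y) for which cond(x, y) is true."""
    pairs = []
    for left, right in product(rows, repeat=2):
        if cond(left, right):
            # Suffix the keys so both rows fit into one flat record.
            pair = {f"{key}_0": value for key, value in left.items()}
            pair.update({f"{key}_1": value for key, value in right.items()})
            pairs.append(pair)
    return pairs


rows = [
    {"age": 1, "gender": "m"},
    {"age": 2, "gender": "f"},
    {"age": 3, "gender": "m"},
]
pairs = build_pairs_sketch(rows, cond=lambda x, y: y["age"] > x["age"])
```

With this condition only pairs whose second row is older survive, which reproduces the three rows of the documented example.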
utils.bytes_feature(value)[source]

Convert value to TF bytes feature

Used during serialization of features

Parameters:value – instance of bytes, bytes list, or np.array
Returns:tf.train.Feature
utils.filter_groups(df, class_to_filter, group_key=None)[source]

Separate dataframe into classes defined by filter

Add a class column to df, indicating class defined by filter functions.

Parameters:
  • df – Dataframe containing samples
  • class_to_filter (dict) – Mapping of class names to filter functions. Each filter function takes a Dataframe as input and returns a bool that indicates whether the sample belongs to the class.
  • group_key (str) – Group df by group_key, if not None.
Returns:

Input df extended by a column “class”.

Return type:

Dataframe

Examples

>>> df = pd.DataFrame({"subject": [1,1,2,2,3,3], "gender": [1,1,0,0,0,1]})
>>> class_to_filter = {"male": lambda x: x["gender"].all(),
...                    "female": lambda x: not x["gender"].any(),
...                    "trans": lambda x: not x["gender"].all() and x["gender"].any()}
>>> filter_groups(df, class_to_filter, "subject")
   subject  gender   class
0        1       1    male
1        1       1    male
2        2       0  female
3        2       0  female
4        3       0   trans
5        3       1   trans
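
The grouping and labelling step can be sketched in plain Python. The assumptions here: rows are dicts, each filter receives the list of rows of one group, the first filter that accepts a group decides its class, and groups that match no filter get None. The column name "ad" and the class names are hypothetical.

```python
from collections import defaultdict


def filter_groups_sketch(rows, class_to_filter, group_key):
    """Return copies of rows extended by a "class" entry."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row)

    labelled = []
    for row in rows:
        group = groups[row[group_key]]
        # The first filter that accepts the whole group decides the class.
        label = next(
            (name for name, accepts in class_to_filter.items() if accepts(group)),
            None,
        )
        labelled.append({**row, "class": label})
    return labelled


rows = [
    {"subject": s, "ad": flag}
    for s, flag in [(1, 1), (1, 1), (2, 0), (2, 0), (3, 0), (3, 1)]
]
class_to_filter = {
    "stable_ad": lambda group: all(r["ad"] for r in group),
    "healthy": lambda group: not any(r["ad"] for r in group),
    "converter": lambda group: any(r["ad"] for r in group)
    and not all(r["ad"] for r in group),
}
labelled = filter_groups_sketch(rows, class_to_filter, "subject")
```

Every row of a subject receives the same label because the filters see the whole group rather than a single row.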
utils.flip_idx(series)[source]

Return index of first flip from “1” to “0”

The series is assumed to start with a block of “1”. If that block is followed by a “0”, the index of the block's last element is returned; otherwise -1 is returned.

Parameters:series – list, np.array, or pd.Series

Examples

>>> flip_idx([1,1,1,0,0])
2
>>> flip_idx([1,1,0,1,1])
1
>>> flip_idx([1,1,1,1,1]) # No flip!
-1
>>> flip_idx([0,0,1,1,1]) # No flip "1" to "0"!
-1
Returns:
Index of the last “1” before flipping to “0”,
-1 if no flip from “1” to “0”.
Return type:int
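
The rule above can be sketched as a single pass over the series. This is an illustration that assumes binary values; flip_idx_sketch is a hypothetical name, not the package's implementation.

```python
def flip_idx_sketch(series):
    """Index of the last leading 1 before the first 0, or -1 if none."""
    for i, value in enumerate(series):
        if value != 1:
            # The leading block of 1s ends at i - 1. If i == 0 the series
            # did not start with 1, and i - 1 is already -1.
            return i - 1
    return -1  # only 1s (or an empty series): no flip


idx_simple = flip_idx_sketch([1, 1, 1, 0, 0])
idx_flip_back = flip_idx_sketch([1, 1, 0, 1, 1])
idx_no_flip = flip_idx_sketch([1, 1, 1, 1, 1])
idx_wrong_start = flip_idx_sketch([0, 0, 1, 1, 1])
```

Only the first flip matters, so a series that later returns to “1” still reports the end of its leading block.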
utils.float32_feature(value)[source]

Convert value to TF float32 feature

Used during serialization of features

Parameters:value – instance of float, float list, or np.float
Returns:tf.train.Feature
utils.grid_size_from(tensor, axis=0)[source]

Calculate approximate grid size from axis

Assumes that length of axis is a power of 2!

Parameters:
  • tensor – Typically a batch of images
  • axis (int) – Which axis to use for approximate grid size. Default: 0
Returns:

Approximate grid size [h, w]

Return type:

list

utils.hasflip(df, sort_key='age', from_key='mci', to_key='ad', within=None)[source]

Detect binary flip between two Series

Check if from_key flips from 1 to 0 at some point, and if to_key flips from 0 to 1 at the same time.

Parameters:df – Dataframe
Returns:True, if flip occurs.
Return type:bool

Examples

>>> w = pd.DataFrame({"age": [1,2,3,4], "mci": [1,1,0,0], "ad": [0,0,1,1]})
>>> hasflip(w)
True
>>> x = pd.DataFrame({"age": [1,2,3,4], "mci": [1,1,1,1], "ad": [0,0,0,0]})
>>> hasflip(x)
False
>>> y = pd.DataFrame({"age": [1,2,3,4], "mci": [1,1,0,1], "ad": [0,0,1,0]})
>>> hasflip(y)
False
>>> z = pd.DataFrame({"age": [1,2,3,4], "mci": [1,1,1,0], "ad": [0,0,0,1]})
>>> hasflip(z, within=2)
False
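
The check can be sketched on plain lists of dicts. This is one reading of the description, not the package's code: it assumes a flip only counts if from_key stays “0” and to_key stays “1” afterwards, which agrees with the first three examples above. The within argument is left out because its semantics are not documented here.

```python
def hasflip_sketch(rows, sort_key="age", from_key="mci", to_key="ad"):
    """True if from_key flips 1 -> 0 exactly where to_key flips 0 -> 1."""
    ordered = sorted(rows, key=lambda row: row[sort_key])
    from_values = [row[from_key] for row in ordered]
    to_values = [row[to_key] for row in ordered]

    # Index of the last leading 1 in from_key, or -1 if there is no 1 -> 0 flip.
    idx = next((i - 1 for i, v in enumerate(from_values) if v != 1), -1)
    if idx < 0:
        return False

    return (
        all(v == 0 for v in from_values[idx + 1:])
        and all(v == 0 for v in to_values[:idx + 1])
        and all(v == 1 for v in to_values[idx + 1:])
    )


def make_rows(ages, mci, ad):
    """Hypothetical helper that turns three columns into row dicts."""
    return [{"age": a, "mci": m, "ad": d} for a, m, d in zip(ages, mci, ad)]


w = make_rows([1, 2, 3, 4], [1, 1, 0, 0], [0, 0, 1, 1])
x = make_rows([1, 2, 3, 4], [1, 1, 1, 1], [0, 0, 0, 0])
y = make_rows([1, 2, 3, 4], [1, 1, 0, 1], [0, 0, 1, 0])
results = [hasflip_sketch(w), hasflip_sketch(x), hasflip_sketch(y)]
```

Series y is rejected because both columns flip back after the first change.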
utils.img_grid_summary(name, tensor)[source]

Create an image grid summary

Parameters:
  • name (str) – Name for image grid summary
  • tensor – Image tensor with shape [batch_size, img_height, img_width, channels].

Returns: None

utils.int64_feature(value)[source]

Convert value to TF int64 feature

Used during serialization of features

Parameters:value – instance of int, int list, or np.int
Returns:tf.train.Feature
utils.is_one_after(series, idx)[source]

Check if all elements are “1” after idx.

Parameters:
  • series – list, np.array, or pd.Series
  • idx (int) – Last index before check (not included)
Returns:

True, if all elements after idx are “1”.

Return type:

bool

Examples

>>> is_one_after([0,0,1,1,1], 1)
True
>>> is_one_after([0,0,1,1,0], 1)
False
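
A minimal sketch of this check, assuming the series supports slicing after conversion to a list. Note that in this sketch an empty tail (idx pointing at the last element) counts as True, which may differ from the real function.

```python
def is_one_after_sketch(series, idx):
    """True if every element after position idx equals 1."""
    tail = list(series)[idx + 1:]  # idx itself is not part of the check
    return all(value == 1 for value in tail)


all_ones_after = is_one_after_sketch([0, 0, 1, 1, 1], 1)
zero_after = is_one_after_sketch([0, 0, 1, 1, 0], 1)
```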
utils.ispowerof2(x)[source]

Check if x is a power of 2

Examples

>>> ispowerof2(64)
True
>>> ispowerof2(63)
False
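
A common way to implement this test uses the fact that a power of 2 has exactly one bit set. The sketch below shows that technique for integers; whether the package uses the same approach is not stated here.

```python
def ispowerof2_sketch(x):
    """True if the integer x is a positive power of 2."""
    # x & (x - 1) clears the lowest set bit; the result is 0 only
    # when x had a single bit set.
    return x > 0 and (x & (x - 1)) == 0


checks = {n: ispowerof2_sketch(n) for n in (0, 1, 63, 64)}
```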
utils.mkdir_and_join(record_dir, record_pattern)[source]
Parameters:
  • record_dir (str) – Path to record directory
  • record_pattern (str) – Record file pattern
Returns:

Full record file pattern, including directory

Return type:

str
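
No description is given for this function. Judging only by its name and parameters, it presumably creates record_dir if needed and joins it with record_pattern. The sketch below implements that guess with the standard library; the file pattern used in the example is hypothetical.

```python
import os
import tempfile


def mkdir_and_join_sketch(record_dir, record_pattern):
    """Create record_dir if missing and return the joined file pattern."""
    os.makedirs(record_dir, exist_ok=True)  # no error if it already exists
    return os.path.join(record_dir, record_pattern)


base = tempfile.mkdtemp()
target_dir = os.path.join(base, "records")
pattern = mkdir_and_join_sketch(target_dir, "train-*.tfrecord")
```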

utils.scale(x, newmin=-1, newmax=1)[source]

Scale all entries in x between newmin and newmax

Parameters:x (array) – Array to be scaled
Returns:Scaled array with same shape as x
Return type:array

Examples

>>> scale([-2, 0, 2])
array([-1.,  0.,  1.])
>>> scale([0, 1, 4])
array([-1. , -0.5,  1. ])
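
The examples are consistent with ordinary min-max scaling. The sketch below applies that formula to a plain list instead of an array (an assumption made to keep it dependency-free) and rejects constant input, where the formula would divide by zero.

```python
def scale_sketch(x, newmin=-1, newmax=1):
    """Linearly map the values of x onto the range [newmin, newmax]."""
    lo, hi = min(x), max(x)
    if hi == lo:
        raise ValueError("cannot scale a constant sequence")
    factor = (newmax - newmin) / (hi - lo)
    return [newmin + (value - lo) * factor for value in x]


symmetric = scale_sketch([-2, 0, 2])
skewed = scale_sketch([0, 1, 4])
unit = scale_sketch([0, 1, 4], newmin=0, newmax=1)
```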
utils.slice_from(axis, position, n_dims=3)[source]

Convert axis and position into slice

Parameters:
  • axis (int) – Slicing axis
  • position (int) – Position of slice along axis
Returns:

Slice tuple for indexing into an array

Return type:

tuple

Examples

>>> slice_from(0, 2)
(2, slice(None, None, None), slice(None, None, None))
>>> slice_from(0, 2, n_dims=2)
(2, slice(None, None, None))
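
Such an index tuple can be built by starting from a full slice on every axis and replacing one entry. This is a sketch of that idea; slice_from_sketch is a hypothetical name.

```python
def slice_from_sketch(axis, position, n_dims=3):
    """Index tuple that selects `position` along `axis`, everything else."""
    index = [slice(None)] * n_dims  # equivalent to ":" on every axis
    index[axis] = position
    return tuple(index)


first_axis = slice_from_sketch(0, 2)
middle_axis = slice_from_sketch(1, 0)
two_dims = slice_from_sketch(0, 2, n_dims=2)
```

The resulting tuple can be used directly for indexing, so volume[slice_from_sketch(1, 0)] would correspond to volume[:, 0, :] for a three-dimensional array.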
utils.spans(df, key, mode, span)[source]

Check if a numeric column spans a certain range.

Parameters:
  • df (Dataframe) – Dataframe containing samples
  • key (str) – Numeric column in df
  • mode (str) – one of “at_least”, “at_most”, “more_than”, “less_than”.
  • span (int, float) – Numeric range that is tested.
Returns:

True, if numeric span of key complies with mode.

Return type:

bool

Examples

>>> x = pd.DataFrame({"age": [1,2,3,4]})
>>> spans(x, "age", "at_least", 2)
True
>>> spans(x, "age", "at_most", 2)
False
>>> spans(x, "age", "more_than", 2)
True
>>> spans(x, "age", "less_than", 2)
False
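
The four modes map naturally onto comparison operators applied to max - min of the column. The sketch below assumes that definition of the span, which agrees with the examples above, and takes a plain list of values instead of a Dataframe and a key.

```python
import operator

# Mode name -> comparison applied to (observed span, requested span).
_MODES = {
    "at_least": operator.ge,
    "at_most": operator.le,
    "more_than": operator.gt,
    "less_than": operator.lt,
}


def spans_sketch(values, mode, span):
    """Compare max(values) - min(values) with span according to mode."""
    if mode not in _MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    observed = max(values) - min(values)
    return _MODES[mode](observed, span)


ages = [1, 2, 3, 4]  # observed span is 3
outcome = {mode: spans_sketch(ages, mode, 2) for mode in _MODES}
```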
utils.to_int_size(split_to_size, n_samples)[source]