Utils package
-
class utils.PrintOnce(message)
Bases: object
Utility to print a message only once
Parameters: message – Formattable string
-
utils.aggregate(df, class_to_filter, group_key=None, builder=None)
Aggregate class samples from a dataframe
Group the rows of df by group_key and use the class filters to annotate each row. If builder is provided, group the annotated rows again by class and group_key, and apply builder.
Parameters: - df – Dataframe
- class_to_filter (dict) – Mapping of class names to filter functions. Each filter function takes a Dataframe as input and returns a bool indicating whether the sample belongs to the class.
- group_key (str) – Which columns to use for grouping.
- builder – Function that accepts Dataframe, and returns a processed Dataframe.
Returns: Aggregated Dataframe, where each row represents one sample. The column “class” indicates the class of the sample.
Return type: Dataframe
-
utils.approx_square(x)
Calculate even grid height and width
Assumes that x is a power of 2!
Parameters: x (int) – Assumed to be a power of 2
Returns: [h, w] such that x = h * w, with h and w as close to sqrt(x) as possible
Return type: list
Examples
>>> approx_square(32)
[4, 8]
>>> approx_square(64)
[8, 8]
>>> approx_square(128)
[8, 16]
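The documented examples are consistent with splitting the exponent of x as evenly as possible between height and width. The following is a minimal sketch under that assumption; `approx_square_sketch` is a hypothetical name and not the package's implementation.

```python
def approx_square_sketch(x):
    """Split a power of 2 into [h, w] with h * w == x and h <= w.

    Hypothetical helper, written only to match the documented examples.
    """
    if x < 1 or x & (x - 1) != 0:
        raise ValueError("x must be a positive power of 2")
    exponent = x.bit_length() - 1          # x == 2 ** exponent
    h = 2 ** (exponent // 2)               # smaller half of the exponent
    w = 2 ** (exponent - exponent // 2)    # remaining half
    return [h, w]


print(approx_square_sketch(32))   # [4, 8]
print(approx_square_sketch(128))  # [8, 16]
```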
-
utils.build_pairs(df, cond=<function <lambda>>, no_duplicate=[])
Cartesian product of dataframe
Produce all combinations of df x df where cond is true.
Parameters: - df – Dataframe
- cond – function with two arguments, returning bool
- no_duplicate (str) – Keys which should not be duplicated.
Returns: DataFrame containing all pairs of df x df where cond is true.
Return type: DataFrame
Examples
>>> df = pd.DataFrame({"age": [1,2,3], "gender": ["m", "f", "m"]})
>>> cond = lambda x,y: y["age"].values > x["age"].values
>>> build_pairs(df, cond)
   age_0 gender_0  age_1 gender_1
0      1        m      2        f
1      1        m      3        m
2      2        f      3        m
-
utils.bytes_feature(value)
Convert value to TF bytes feature
Used during serialization of features
Parameters: value – instance of bytes, bytes list, or np.array
Returns: tf.train.Feature
-
utils.filter_groups(df, class_to_filter, group_key=None)
Separate dataframe into classes defined by filter
Add a “class” column to df, indicating the class assigned by the filter functions.
Parameters:
Returns: Input df extended by a column “class”.
Return type: Dataframe
Examples
>>> df = pd.DataFrame({"subject": [1,1,2,2,3,3], "gender": [1,1,0,0,0,1]})
>>> class_to_filter = {"male": lambda x: x["gender"].all(),
...                    "female": lambda x: not x["gender"].any(),
...                    "trans": lambda x: not x["gender"].all() and x["gender"].any()}
>>> filter_groups(df, class_to_filter, "subject")
   subject  gender   class
0        1       1    male
1        1       1    male
2        2       0  female
3        2       0  female
4        3       0   trans
5        3       1   trans
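One way the per-group annotation could work is sketched below with plain pandas. This is an illustration under assumptions, not the package's code: `filter_groups_sketch` is a hypothetical name, group_key is required here, and each group takes the first class whose filter returns true.

```python
import pandas as pd


def filter_groups_sketch(df, class_to_filter, group_key):
    """Annotate every row with the class of its group (hypothetical sketch)."""
    out = df.copy()
    out["class"] = None
    for _, group in df.groupby(group_key):
        # Assumption: the first filter that accepts the group wins.
        for name, accepts in class_to_filter.items():
            if accepts(group):
                out.loc[group.index, "class"] = name
                break
    return out


df = pd.DataFrame({"subject": [1, 1, 2, 2, 3, 3], "gender": [1, 1, 0, 0, 0, 1]})
class_to_filter = {
    "male": lambda g: g["gender"].all(),
    "female": lambda g: not g["gender"].any(),
    "trans": lambda g: not g["gender"].all() and g["gender"].any(),
}
result = filter_groups_sketch(df, class_to_filter, "subject")
print(result)
```

Rows whose group matches no filter keep the class None in this sketch; how the package treats such groups is not stated in the documentation.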
-
utils.flip_idx(series)
Return index of first flip from “1” to “0”
The series is assumed to start with a block of “1”. If that block is followed by a “0”, the index of its last element is returned; otherwise -1 is returned.
Parameters: series – list, np.array, or pd.Series
Returns: Index of the last “1” before flipping to “0”, or -1 if there is no flip from “1” to “0”.
Return type: int
Examples
>>> flip_idx([1,1,1,0,0])
2
>>> flip_idx([1,1,0,1,1])
1
>>> flip_idx([1,1,1,1,1])  # No flip!
-1
>>> flip_idx([0,0,1,1,1])  # No flip "1" to "0"!
-1
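The behaviour described above can be reproduced with a single pass over the series. The sketch below is a hypothetical re-implementation (`flip_idx_sketch` is not part of the package) and assumes the series contains only 0 and 1.

```python
def flip_idx_sketch(series):
    """Index of the last leading 1 before the first 0, or -1 if there is none."""
    values = list(series)
    # A flip from 1 to 0 requires the series to start with a 1.
    if not values or values[0] != 1:
        return -1
    for i, value in enumerate(values):
        if value != 1:
            return i - 1   # last element of the leading block of 1s
    return -1              # all ones: no flip


print(flip_idx_sketch([1, 1, 1, 0, 0]))  # 2
print(flip_idx_sketch([0, 0, 1, 1, 1]))  # -1
```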
-
utils.float32_feature(value)
Convert value to TF float32 feature
Used during serialization of features
Parameters: value – instance of float, float list, or np.float
Returns: tf.train.Feature
-
utils.grid_size_from(tensor, axis=0)
Calculate approximate grid size from axis
Assumes that length of axis is a power of 2!
Parameters: - tensor – Typically a batch of images
- axis (int) – Which axis to use for approximate grid size. Default: 0
Returns: Approximate grid size [h, w]
Return type:
-
utils.hasflip(df, sort_key='age', from_key='mci', to_key='ad', within=None)
Detect binary flip between two Series
Check if from_key flips from 1 to 0 at some point, and if to_key flips from 0 to 1 at the same time.
Parameters: df – Dataframe
Returns: True, if flip occurs.
Return type: bool
Examples
>>> w = pd.DataFrame({"age": [1,2,3,4], "mci": [1,1,0,0], "ad": [0,0,1,1]})
>>> hasflip(w)
True
>>> x = pd.DataFrame({"age": [1,2,3,4], "mci": [1,1,1,1], "ad": [0,0,0,0]})
>>> hasflip(x)
False
>>> y = pd.DataFrame({"age": [1,2,3,4], "mci": [1,1,0,1], "ad": [0,0,1,0]})
>>> hasflip(y)
False
>>> z = pd.DataFrame({"age": [1,2,3,4], "mci": [1,1,1,0], "ad": [0,0,0,1]})
>>> hasflip(z, within=2)
False
-
utils.img_grid_summary(name, tensor)
Create an image grid summary
Parameters: - name (str) – Name for image grid summary
- tensor – Image tensor with shape [batch_size, img_height, img_width, channels].
Returns: None
-
utils.int64_feature(value)
Convert value to TF int64 feature
Used during serialization of features
Parameters: value – instance of int, int list, or np.int
Returns: tf.train.Feature
-
utils.is_one_after(series, idx)
Check if all elements are “1” after idx.
Parameters: - series – list, np.array, or pd.Series
- idx (int) – Last index before check (not included)
Returns: True, if all elements after idx are “1”.
Return type:
Examples
>>> is_one_after([0,0,1,1,1], 1)
True
>>> is_one_after([0,0,1,1,0], 1)
False
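Since idx itself is excluded, the check amounts to testing the tail that starts at idx + 1. A minimal sketch follows; `is_one_after_sketch` is a hypothetical name, and returning True for an empty tail is an assumption the documentation does not confirm.

```python
def is_one_after_sketch(series, idx):
    """True if every element strictly after position idx equals 1."""
    tail = list(series)[idx + 1:]
    # all() of an empty tail is True (assumed behaviour for idx at the end).
    return all(value == 1 for value in tail)


print(is_one_after_sketch([0, 0, 1, 1, 1], 1))  # True
print(is_one_after_sketch([0, 0, 1, 1, 0], 1))  # False
```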
-
utils.ispowerof2(x)
Check if x is a power of 2
Examples
>>> ispowerof2(64)
True
>>> ispowerof2(63)
False
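A common way to perform this check on integers is the bit trick below: a power of 2 has exactly one bit set, so clearing the lowest set bit leaves zero. This is a sketch with a hypothetical name, and the package may implement the check differently.

```python
def ispowerof2_sketch(x):
    """True if the integer x is a positive power of 2 (1 counts as 2 ** 0)."""
    # x & (x - 1) clears the lowest set bit; the result is 0 only
    # when x had a single bit set.
    return x > 0 and (x & (x - 1)) == 0


print(ispowerof2_sketch(64))  # True
print(ispowerof2_sketch(63))  # False
```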
-
utils.mkdir_and_join(record_dir, record_pattern)
Parameters:
Returns: Full record file pattern, including directory
Return type: (str)
-
utils.scale(x, newmin=-1, newmax=1)
Scale all entries in x between newmin and newmax
Parameters: x (array) – Array to be scaled
Returns: Scaled array with same shape as x
Return type: array
Examples
>>> scale([-2, 0, 2])
array([-1.,  0.,  1.])
>>> scale([0, 1, 4])
array([-1. , -0.5,  1. ])
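The examples match ordinary min-max scaling: shift and stretch x so that its minimum maps to newmin and its maximum to newmax. The sketch below assumes that reading and uses NumPy; `scale_sketch` is a hypothetical name, and constant input (where max equals min) is not handled.

```python
import numpy as np


def scale_sketch(x, newmin=-1, newmax=1):
    """Min-max scale x into [newmin, newmax] (assumed behaviour)."""
    x = np.asarray(x, dtype=float)
    oldmin, oldmax = x.min(), x.max()
    # Map to [0, 1] first, then stretch and shift to the target range.
    unit = (x - oldmin) / (oldmax - oldmin)
    return unit * (newmax - newmin) + newmin


print(scale_sketch([0, 1, 4]))
```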
-
utils.slice_from(axis, position, n_dims=3)
Convert axis and position into slice
Parameters:
Returns: Slice tuple for indexing into an array
Return type:
Examples
>>> slice_from(0, 2)
(2, slice(None, None, None), slice(None, None, None))
>>> slice_from(0, 2, n_dims=2)
(2, slice(None, None, None))
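The returned tuple selects everything along all axes except the chosen one, where it selects a single position. A minimal sketch that reproduces the documented outputs is shown below; `slice_from_sketch` is a hypothetical name.

```python
def slice_from_sketch(axis, position, n_dims=3):
    """Build an index tuple that fixes `axis` at `position`."""
    index = [slice(None)] * n_dims   # equivalent to ":" on every axis
    index[axis] = position           # pin the requested axis
    return tuple(index)


print(slice_from_sketch(0, 2))
# (2, slice(None, None, None), slice(None, None, None))
```

With a 3-dimensional array `a`, indexing `a[slice_from_sketch(1, 0)]` would be equivalent to `a[:, 0, :]`.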
-
utils.spans(df, key, mode, span)
Check if a numeric column spans a certain range.
Parameters:
Returns: True, if numeric span of key complies with mode.
Return type:
Examples
>>> x = pd.DataFrame({"age": [1,2,3,4]})
>>> spans(x, "age", "at_least", 2)
True
>>> spans(x, "age", "at_most", 2)
False
>>> spans(x, "age", "more_than", 2)
True
>>> spans(x, "age", "less_than", 2)
False
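The examples are consistent with taking the span to be the maximum of the column minus its minimum, and comparing it with the span argument according to mode. The sketch below assumes that definition; `spans_sketch` and `_MODES` are hypothetical names. It accepts any mapping from column name to values, so it works on a plain dict as well as on a DataFrame.

```python
import operator

# Assumed meaning of each mode, inferred from the documented examples.
_MODES = {
    "at_least": operator.ge,
    "at_most": operator.le,
    "more_than": operator.gt,
    "less_than": operator.lt,
}


def spans_sketch(df, key, mode, span):
    """Compare the observed range of df[key] (max - min) against span."""
    values = list(df[key])
    observed = max(values) - min(values)
    return bool(_MODES[mode](observed, span))


x = {"age": [1, 2, 3, 4]}                    # observed span is 3
print(spans_sketch(x, "age", "at_least", 2))  # True
print(spans_sketch(x, "age", "at_most", 2))   # False
```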