tools

rolling_pin.tools.LOGGER = <Logger rolling_pin.tools (WARNING)>

Contains basic functions for more complex ETL functions and classes.

rolling_pin.tools.dot_to_html(dot, layout='dot', as_png=False)[source]

Converts a given pydot graph into an IPython.display.HTML object. Used for inline display of graph data in Jupyter Lab.

Parameters
  • dot (pydot.Dot) – Pydot Graph instance.

  • layout (str, optional) – Graph layout style. Options include: circo, dot, fdp, neato, sfdp, twopi. Default: dot.

  • as_png (bool, optional) – Display graph as a PNG image instead of SVG. Useful for display on GitHub. Default: False.

Raises

ValueError – If invalid layout given.

Returns

HTML instance.

Return type

IPython.display.HTML

rolling_pin.tools.flatten(item, separator='/', embed_types=True)[source]

Flattens an iterable object into a flat dictionary.

Parameters
  • item (object) – Iterable object.

  • separator (str, optional) – Field separator in keys. Default: ‘/’.

  • embed_types (bool, optional) – Whether to embed types in keys. Default: True.

Returns

Dictionary representation of given object.

Return type

dict
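The flattening idea can be sketched as follows. This is a stand-in written for illustration, not rolling_pin's implementation: the helper name flatten_sketch is hypothetical, and list indices are written as plain integers instead of embedded types.

```python
def flatten_sketch(item, separator='/'):
    """Flatten nested dicts and lists into a single-level dict (illustrative only)."""
    output = {}

    def recurse(value, prefix):
        # Dicts contribute their keys, lists and tuples contribute their indices.
        if isinstance(value, dict):
            children = value.items()
        elif isinstance(value, (list, tuple)):
            children = enumerate(value)
        else:
            # Leaf value: store it under the accumulated key.
            output[prefix] = value
            return
        for key, child in children:
            new_prefix = f'{prefix}{separator}{key}' if prefix else str(key)
            recurse(child, new_prefix)

    recurse(item, '')
    return output


flat = flatten_sketch({'a': {'b': 1, 'c': [10, 20]}})
print(flat)  # {'a/b': 1, 'a/c/0': 10, 'a/c/1': 20}
```

Note that empty containers produce no keys in this sketch.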

rolling_pin.tools.get_ordered_unique(items)[source]

Generates a unique list of items in same order they were received in.

Parameters

items (list) – List of items.

Returns

Unique ordered list.

Return type

list
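A minimal sketch of order-preserving deduplication, assuming the items are hashable. The helper name ordered_unique_sketch is hypothetical and the library may implement this differently.

```python
def ordered_unique_sketch(items):
    """Return each item once, in the order it first appeared."""
    # dict keys keep insertion order (Python 3.7+), so duplicates
    # collapse onto their first occurrence.
    return list(dict.fromkeys(items))


result = ordered_unique_sketch(['b', 'a', 'b', 'c', 'a'])
print(result)  # ['b', 'a', 'c']
```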

rolling_pin.tools.get_parent_fields(key, separator='/')[source]

Get all the parent fields of a given key, split by given separator.

Parameters
  • key (str) – Key.

  • separator (str, optional) – String that splits key into fields. Default: ‘/’.

Returns

List of absolute parent fields.

Return type

list(str)
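One plausible reading of "absolute parent fields" is that each parent is spelled out as a full path from the root. The sketch below assumes that reading; the helper name parent_fields_sketch and the exact output shape are assumptions, not confirmed library behavior.

```python
def parent_fields_sketch(key, separator='/'):
    """List every ancestor path of a key, from the root downwards."""
    fields = key.split(separator)
    # Skip the final field (the key's own name) and rebuild each ancestor.
    return [separator.join(fields[:i]) for i in range(1, len(fields))]


parents = parent_fields_sketch('a/b/c')
print(parents)  # ['a', 'a/b']
```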

rolling_pin.tools.is_dictlike(item)[source]

Determines if given item is dict-like.

Parameters

item (object) – Object to be tested.

Returns

Whether given item is dict-like.

Return type

bool

rolling_pin.tools.is_iterable(item)[source]

Determines if given item is iterable.

Parameters

item (object) – Object to be tested.

Returns

Whether given item is iterable.

Return type

bool

rolling_pin.tools.is_listlike(item)[source]

Determines if given item is list-like.

Parameters

item (object) – Object to be tested.

Returns

Whether given item is list-like.

Return type

bool

rolling_pin.tools.list_all_files(directory)[source]

Recursively lists all files within a given directory.

Parameters

directory (str or Path) – Directory to be recursed.

Returns

List of filepaths.

Return type

list[Path]
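A recursive file listing can be approximated with pathlib alone. The sketch below is illustrative; list_all_files_sketch is a hypothetical name, and the sorting is added here only to make the output stable.

```python
import tempfile
from pathlib import Path


def list_all_files_sketch(directory):
    """Recursively collect every file (not directory) under a directory."""
    return sorted(p for p in Path(directory).rglob('*') if p.is_file())


# Build a small throwaway tree to demonstrate the listing.
with tempfile.TemporaryDirectory() as root:
    (Path(root) / 'pkg').mkdir()
    (Path(root) / 'pkg' / 'module.py').write_text('x = 1\n')
    (Path(root) / 'readme.txt').write_text('hello\n')
    names = [p.relative_to(root).as_posix() for p in list_all_files_sketch(root)]

print(names)
```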

rolling_pin.tools.nest(flat_dict, separator='/')[source]

Converts a flat dictionary into a nested dictionary by splitting keys by a given separator.

Parameters
  • flat_dict (dict) – Flat dictionary.

  • separator (str, optional) – Field separator within given dictionary’s keys. Default: ‘/’.

Returns

Nested dictionary.

Return type

dict
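The key-splitting step can be sketched as below. This is an illustrative stand-in (nest_sketch is a hypothetical name), and it ignores embedded types.

```python
def nest_sketch(flat_dict, separator='/'):
    """Rebuild a nested dict from keys such as 'a/b/c'."""
    output = {}
    for key, value in flat_dict.items():
        fields = key.split(separator)
        cursor = output
        # Walk (and create) an intermediate dict for every field except the last.
        for field in fields[:-1]:
            cursor = cursor.setdefault(field, {})
        cursor[fields[-1]] = value
    return output


nested = nest_sketch({'a/b': 1, 'a/c/d': 2, 'e': 3})
print(nested)  # {'a': {'b': 1, 'c': {'d': 2}}, 'e': 3}
```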

rolling_pin.tools.try_(function, item, exception_value='item')[source]

Try applying a given function to a given item. If that fails return the item or exception value.

Parameters
  • function (function) – Function that expects an item.

  • item (object) – Item to be processed by function.

  • exception_value (object, optional) – If left to ‘item’, returns item, else returns given value. Default: ‘item’.

Returns

Output of function(item) or exception_value.

Return type

object
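The fallback behavior described above can be sketched as follows. The helper name try_sketch is hypothetical, and catching every Exception subclass is an assumption about how broadly failures are handled.

```python
def try_sketch(function, item, exception_value='item'):
    """Apply function to item, falling back to item or a given value on error."""
    try:
        return function(item)
    except Exception:
        # The string 'item' acts as a sentinel meaning "return the input".
        if isinstance(exception_value, str) and exception_value == 'item':
            return item
        return exception_value


converted = try_sketch(int, '42')
unchanged = try_sketch(int, 'abc')
replaced = try_sketch(int, 'abc', exception_value=None)
print(converted, unchanged, replaced)  # 42 abc None
```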

rolling_pin.tools.unembed(item)[source]

Converts embedded types in dictionary keys into Python types.

Parameters

item (object) – Dictionary with embedded types.

Returns

Converted object.

Return type

object

rolling_pin.tools.write_dot_graph(dot, fullpath, layout='dot')[source]

Writes a pydot.Dot object to a given filepath. Formats supported: svg, dot, png.

Parameters
  • dot (pydot.Dot) – Pydot Dot instance.

  • fullpath (str or Path) – File to be written to.

  • layout (str, optional) – Graph layout style. Options include: circo, dot, fdp, neato, sfdp, twopi. Default: dot.

Raises

ValueError – If invalid file extension given.

utils

rolling_pin.utils.LOGGER = <Logger rolling_pin.utils (WARNING)>

Imports only Python builtins and contains only functions for use within or outside of a Docker environment (i.e. no dependencies or specific Python versions).

rolling_pin.utils.api_function(wrapped=None, **kwargs)[source]

A decorator that enforces keyword argument only function signatures and required keyword argument values when called.

Parameters
  • wrapped (function) – For dev use. Default: None.

  • **kwargs (dict) – Keyword arguments.

Raises
  • TypeError – If non-keyword argument found in function signature.

  • ValueError – If keyword arg with value of ‘<required>’ is found.

Returns

api function.
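A simplified sketch of the two checks described above. It is not the library's decorator: api_function_sketch is a hypothetical name, it takes only the function to wrap, and performing the signature check at decoration time is an assumption.

```python
import functools
import inspect


def api_function_sketch(function):
    """Enforce keyword-only calls and reject unfilled '<required>' values."""
    params = inspect.signature(function).parameters
    # Every parameter must declare a default, i.e. be a keyword argument.
    for name, param in params.items():
        if param.default is inspect.Parameter.empty:
            raise TypeError(f'Non-keyword argument found in signature: {name}.')

    @functools.wraps(function)
    def wrapper(**kwargs):
        # Merge call-time values over the declared defaults.
        merged = {name: param.default for name, param in params.items()}
        merged.update(kwargs)
        for key, value in merged.items():
            if isinstance(value, str) and value == '<required>':
                raise ValueError(f'Missing required keyword argument: {key}.')
        return function(**merged)

    return wrapper


@api_function_sketch
def greet(name='<required>', greeting='hello'):
    return f'{greeting} {name}'


message = greet(name='world')
print(message)  # hello world
```

Calling greet() without name raises ValueError in this sketch, because the '<required>' placeholder is never replaced.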

rolling_pin.utils.get_function_signature(function)[source]

Inspect a given function and return its arguments as a list and its keyword arguments as a dict.

Parameters

function (function) – Function to be inspected.

Returns

args and kwargs.

Return type

dict
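Signature inspection of this kind can be sketched with the standard inspect module. The helper name signature_sketch and the 'args' and 'kwargs' dictionary keys are assumptions based on the description above.

```python
import inspect


def signature_sketch(function):
    """Split a function's parameters into positional names and keyword defaults."""
    args, kwargs = [], {}
    for name, param in inspect.signature(function).parameters.items():
        # Parameters without a default are treated as plain arguments.
        # In this sketch that also includes *args and **kwargs entries.
        if param.default is inspect.Parameter.empty:
            args.append(name)
        else:
            kwargs[name] = param.default
    return dict(args=args, kwargs=kwargs)


def example(a, b, c=1, d='x'):
    return a, b, c, d


sig = signature_sketch(example)
print(sig)  # {'args': ['a', 'b'], 'kwargs': {'c': 1, 'd': 'x'}}
```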

rolling_pin.utils.is_standard_module(name)[source]

Determines if given module name is a python builtin.

Parameters

name (str) – Python module name.

Returns

Whether string names a standard Python module.

Return type

bool

rolling_pin.utils.relative_path(module, path)[source]

Resolve path given current module’s file path and given suffix.

Parameters
  • module (str) – Always __file__ of current module.

  • path (str) – Path relative to __file__.

Returns

Resolved Path object.

Return type

Path
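A minimal sketch of resolving a path relative to a module file, using pathlib. The helper name relative_path_sketch and the example paths are hypothetical.

```python
from pathlib import Path


def relative_path_sketch(module, path):
    """Resolve path relative to the directory containing the module file."""
    # Start from the module's parent directory, apply the relative suffix,
    # then normalize any '..' segments.
    return Path(module).parent.joinpath(path).resolve()


resolved = relative_path_sketch(
    '/repo/python/pkg/module.py', '../resources/config.json'
)
print(resolved.as_posix())
```

In real use the first argument would be __file__, as the parameter description states.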

repo_etl

class rolling_pin.repo_etl.RepoETL(root, include_regex='.*\.py$', exclude_regex='(__init__|_test)\.py$')[source]

Bases: object

RepoETL is a class for extracting 1st order dependencies of modules within a given repository. This information is stored internally as a DataFrame and can be rendered as networkx, pydot or SVG graphs.

__init__(root, include_regex='.*\\.py$', exclude_regex='(__init__|_test)\\.py$')[source]

Construct RepoETL instance.

Parameters
  • root (str or Path) – Full path to repository root directory.

  • include_regex (str, optional) – Files to be included in recursive directory search. Default: ‘.*\.py$’.

  • exclude_regex (str, optional) – Files to be excluded in recursive directory search. Default: ‘(__init__|_test)\.py$’.

Raises

ValueError – If include or exclude regex does not end in ‘.py$’.
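How the include and exclude patterns might combine can be sketched on a plain list of filenames. The helper filter_files_sketch is hypothetical, and applying re.search to each name is an assumption about the matching rule.

```python
import re


def filter_files_sketch(
    filenames,
    include_regex=r'.*\.py$',
    exclude_regex=r'(__init__|_test)\.py$',
):
    """Keep filenames that match include_regex and do not match exclude_regex."""
    for regex in (include_regex, exclude_regex):
        # Mirrors the documented constraint that both patterns end in '.py$'.
        if not regex.endswith('.py$'):
            raise ValueError(f'Regex does not end in ".py$": {regex}')
    return [
        name for name in filenames
        if re.search(include_regex, name) and not re.search(exclude_regex, name)
    ]


kept = filter_files_sketch(
    ['pkg/tools.py', 'pkg/__init__.py', 'pkg/tools_test.py', 'README.md']
)
print(kept)  # ['pkg/tools.py']
```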

static _anneal_coordinate(data, anneal_axis='x', pin_axis='y', iterations=10)[source]

Iteratively align nodes in the anneal axis according to the mean position of their connected nodes. Node anneal coordinates are rectified at the end of each iteration according to a pin axis, so that they do not overlap. This means that they are sorted at each level of the pin axis.

Parameters
  • data (pandas.DataFrame) – DataFrame with x column.

  • anneal_axis (str, optional) – Coordinate column to be annealed. Default: ‘x’.

  • pin_axis (str, optional) – Coordinate column to be held constant. Default: ‘y’.

  • iterations (int, optional) – Number of times to update x coordinates. Default: 10.

Returns

DataFrame with annealed anneal axis coordinates.

Return type

pandas.DataFrame

static _calculate_coordinates(data)[source]

Calculate initial x, y coordinates for each node in given DataFrame. Nodes are stratified by type along the y axis.

Parameters

data (pandas.DataFrame) – DataFrame of nodes.

Returns

DataFrame with x and y coordinate columns.

Return type

pandas.DataFrame

static _center_coordinate(data, center_axis='x', pin_axis='y')[source]

Sorts center_axis coordinates at each level of the pin axis.

Parameters
  • data (pandas.DataFrame) – DataFrame with x column.

  • center_axis (str, optional) – Coordinate column to be centered. Default: ‘x’.

  • pin_axis (str, optional) – Coordinate column to be held constant. Default: ‘y’.

Returns

DataFrame with centered center axis coordinates.

Return type

pandas.DataFrame

static _get_data(root, include_regex='.*\\.py$', exclude_regex='(__init__|_test)\\.py$')[source]

Recursively aggregates and filters all the files found within a given directory into a DataFrame. Data is used to create directed graphs.

DataFrame has these columns:

  • node_name - name of node

  • node_type - type of node, can be [module, subpackage, library]

  • x - node’s x coordinate

  • y - node’s y coordinate

  • dependencies - parent nodes

  • subpackages - parent nodes of type subpackage

  • fullpath - fullpath to the module a node represents

Parameters
  • root (str or Path) – Root directory to be searched.

  • include_regex (str, optional) – Files to be included in recursive directory search. Default: ‘.*\.py$’.

  • exclude_regex (str, optional) – Files to be excluded in recursive directory search. Default: ‘(__init__|_test)\.py$’.

Raises
  • ValueError – If include or exclude regex does not end in ‘.py$’.

  • FileNotFoundError – If no files are found after filtering.

Returns

DataFrame of file information.

Return type

pandas.DataFrame

static _get_imports(fullpath)[source]

Gets import statements from a given Python module.

Parameters

fullpath (str or Path) – Path to python module.

Returns

List of imported modules.

Return type

list(str)
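One way to extract imported module names is to parse the source with the standard ast module. The sketch below takes source text instead of a file path and is not necessarily how the library does it; get_imports_sketch is a hypothetical name.

```python
import ast


def get_imports_sketch(source):
    """Return the names of modules imported by a piece of Python source."""
    imports = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import os, sys" yields one alias per module.
            imports.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module is not None:
            # "from pathlib import Path" yields the source module.
            # Bare relative imports ("from . import x") are skipped here.
            imports.append(node.module)
    return imports


source = 'import os\nimport numpy as np\nfrom pathlib import Path\n'
found = get_imports_sketch(source)
print(found)  # ['os', 'numpy', 'pathlib']
```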

static _to_networkx_graph(data)[source]

Converts given DataFrame into networkx directed graph.

Parameters

data (pandas.DataFrame) – DataFrame of nodes.

Returns

Graph of nodes.

Return type

networkx.DiGraph

to_dataframe()[source]
Returns

DataFrame of nodes representing repo modules.

Return type

pandas.DataFrame

to_dot_graph(orient='tb', orthogonal_edges=False, color_scheme=None)[source]

Converts internal data into pydot graph.

Parameters
  • orient (str, optional) –

    Graph layout orientation. Default: tb. Options include:

    • tb - top to bottom

    • bt - bottom to top

    • lr - left to right

    • rl - right to left

  • orthogonal_edges (bool, optional) – Whether graph edges should be orthogonal (right-angled). Default: False.

  • color_scheme (dict, optional) – Color scheme to be applied to graph. Default: rolling_pin.tools.COLOR_SCHEME

Raises

ValueError – If orient is invalid.

Returns

Dot graph of nodes.

Return type

pydot.Dot

to_html(layout='dot', orthogonal_edges=False, color_scheme=None, as_png=False)[source]

For use in inline rendering of graph data in Jupyter Lab.

Parameters
  • layout (str, optional) – Graph layout style. Options include: circo, dot, fdp, neato, sfdp, twopi. Default: dot.

  • orthogonal_edges (bool, optional) – Whether graph edges should be orthogonal (right-angled). Default: False.

  • color_scheme (dict, optional) – Color scheme to be applied to graph. Default: rolling_pin.tools.COLOR_SCHEME

  • as_png (bool, optional) – Display graph as a PNG image instead of SVG. Useful for display on GitHub. Default: False.

Returns

HTML object for inline display.

Return type

IPython.display.HTML

to_networkx_graph()[source]

Converts internal data into networkx directed graph.

Returns

Graph of nodes.

Return type

networkx.DiGraph

write(fullpath, layout='dot', orient='tb', orthogonal_edges=False, color_scheme=None)[source]

Writes internal data to a given filepath. Formats supported: svg, dot, png, json.

Parameters
  • fullpath (str or Path) – File to be written to.

  • layout (str, optional) – Graph layout style. Options include: circo, dot, fdp, neato, sfdp, twopi. Default: dot.

  • orient (str, optional) –

    Graph layout orientation. Default: tb. Options include:

    • tb - top to bottom

    • bt - bottom to top

    • lr - left to right

    • rl - right to left

  • orthogonal_edges (bool, optional) – Whether graph edges should be orthogonal (right-angled). Default: False.

  • color_scheme (dict, optional) – Color scheme to be applied to graph. Default: rolling_pin.tools.COLOR_SCHEME

Raises

ValueError – If invalid file extension given.

blob_etl

class rolling_pin.blob_etl.BlobETL(blob, separator='/')[source]

Bases: object

Converts blob data internally into a flat dictionary that is universally searchable, editable and convertible back to the data’s original structure, new blob structures or directed graphs.

__init__(blob, separator='/')[source]

Constructs BlobETL instance.

Parameters
  • blob (object) – Iterable object.

  • separator (str, optional) – String to be used as a field separator in each key. Default: ‘/’.

delete(predicate, by='key')[source]

Delete data items by key, value or key + value, according to a given predicate.

Parameters
  • predicate – Function that returns a boolean value.

  • by (str, optional) – Value handed to predicate. Options include: key, value, key+value. Default: key.

Raises

ValueError – If by keyword is not key, value, or key+value.

Returns

New BlobETL instance.

Return type

BlobETL

filter(predicate, by='key')[source]

Filter data items by key, value or key + value, according to a given predicate.

Parameters
  • predicate – Function that returns a boolean value.

  • by (str, optional) – Value handed to predicate. Options include: key, value, key+value. Default: key.

Raises

ValueError – If by keyword is not key, value, or key+value.

Returns

New BlobETL instance.

Return type

BlobETL
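The three by options can be sketched on a plain flat dictionary. This is illustrative only: filter_sketch is a hypothetical helper, and passing key and value as two positional arguments for key+value is an assumption.

```python
def filter_sketch(flat, predicate, by='key'):
    """Keep items of a flat dict for which the predicate returns True."""
    # Map each 'by' option to the arguments handed to the predicate.
    selectors = {
        'key': lambda k, v: (k,),
        'value': lambda k, v: (v,),
        'key+value': lambda k, v: (k, v),
    }
    if by not in selectors:
        raise ValueError(f'Invalid by argument: {by}. Options: key, value, key+value.')
    select = selectors[by]
    return {k: v for k, v in flat.items() if predicate(*select(k, v))}


flat = {'a/b': 1, 'a/c': 20, 'd/e': 30}
by_key = filter_sketch(flat, lambda k: k.startswith('a/'))
by_value = filter_sketch(flat, lambda v: v > 10, by='value')
by_both = filter_sketch(
    flat, lambda k, v: k.startswith('a/') and v > 10, by='key+value'
)
print(by_key, by_value, by_both)
```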

query(regex, ignore_case=True)[source]

Filter data items by key according to given regular expression.

Parameters
  • regex (str) – Regular expression.

  • ignore_case (bool, optional) – Whether to ignore case in the regular expression search. Default: True.

Returns

New BlobETL instance.

Return type

BlobETL
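Regex filtering by key can be sketched on a flat dictionary as below. The helper query_sketch is hypothetical, and using re.search (a match anywhere in the key) is an assumption.

```python
import re


def query_sketch(flat, regex, ignore_case=True):
    """Keep items of a flat dict whose key matches the regular expression."""
    flags = re.IGNORECASE if ignore_case else 0
    return {k: v for k, v in flat.items() if re.search(regex, k, flags)}


flat = {'Users/0/Name': 'tom', 'users/1/name': 'jane', 'groups/0/name': 'admins'}
loose = query_sketch(flat, '^users')
strict = query_sketch(flat, '^users', ignore_case=False)
print(loose)   # {'Users/0/Name': 'tom', 'users/1/name': 'jane'}
print(strict)  # {'users/1/name': 'jane'}
```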

set(predicate=None, key_setter=None, value_setter=None)[source]

Filter data items by key, value or key + value, according to a given predicate. Then set that item’s key by a given function and value by a given function.

Parameters
  • predicate (function, optional) – Function of the form: lambda k, v: bool. Default: None –> lambda k, v: True.

  • key_setter (function, optional) – Function of the form: lambda k, v: str. Default: None –> lambda k, v: k.

  • value_setter (function, optional) – Function of the form: lambda k, v: object. Default: None –> lambda k, v: v.

Returns

New BlobETL instance.

Return type

BlobETL
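The predicate and setter defaults listed above can be sketched on a flat dictionary. The helper set_sketch is hypothetical, and leaving non-matching items unchanged is an assumption.

```python
def set_sketch(flat, predicate=None, key_setter=None, value_setter=None):
    """Rewrite keys and values of the items selected by the predicate."""
    # Defaults follow the documented forms: select everything, change nothing.
    predicate = predicate or (lambda k, v: True)
    key_setter = key_setter or (lambda k, v: k)
    value_setter = value_setter or (lambda k, v: v)

    output = {}
    for k, v in flat.items():
        if predicate(k, v):
            output[key_setter(k, v)] = value_setter(k, v)
        else:
            output[k] = v
    return output


flat = {'a/b': 1, 'a/c': 2}
scaled = set_sketch(
    flat,
    predicate=lambda k, v: v > 1,
    value_setter=lambda k, v: v * 10,
)
print(scaled)  # {'a/b': 1, 'a/c': 20}
```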

set_field(index, field_setter)[source]

Sets a field at a given index according to a given function.

Parameters
  • index (int) – Field index.

  • field_setter (function) – Function of form lambda str: str.

Returns

New BlobETL instance.

Return type

BlobETL
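Rewriting one field of every key can be sketched as follows. The helper set_field_sketch is hypothetical; it assumes a non-negative index and leaves keys with too few fields unchanged, which may differ from the library.

```python
def set_field_sketch(flat, index, field_setter, separator='/'):
    """Apply field_setter to the field at the given index of every key."""
    output = {}
    for key, value in flat.items():
        fields = key.split(separator)
        # Only rewrite keys that actually have a field at this index.
        if 0 <= index < len(fields):
            fields[index] = field_setter(fields[index])
        output[separator.join(fields)] = value
    return output


flat = {'a/b': 1, 'a/c/d': 2}
renamed = set_field_sketch(flat, 0, str.upper)
print(renamed)  # {'A/b': 1, 'A/c/d': 2}
```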

to_dataframe(group_by=None)[source]

Convert data to pandas DataFrame.

Parameters

group_by (int, optional) – Field index to group rows of data by. Default: None.

Returns

DataFrame.

Return type

pandas.DataFrame

to_dict()[source]
Returns

Nested representation of internal data.

Return type

dict

to_dot_graph(orthogonal_edges=False, orient='tb', color_scheme=None)[source]

Converts internal dictionary into pydot graph. Key and value nodes and edges are colored differently.

Parameters
  • orthogonal_edges (bool, optional) – Whether graph edges should be orthogonal (right-angled). Default: False.

  • orient (str, optional) –

    Graph layout orientation. Default: tb. Options include:

    • tb - top to bottom

    • bt - bottom to top

    • lr - left to right

    • rl - right to left

  • color_scheme (dict, optional) – Color scheme to be applied to graph. Default: rolling_pin.tools.COLOR_SCHEME

Raises

ValueError – If orient is invalid.

Returns

Dot graph representation of dictionary.

Return type

pydot.Dot

to_flat_dict()[source]
Returns

Flat dictionary with embedded types.

Return type

dict

to_html(layout='dot', orthogonal_edges=False, orient='tb', color_scheme=None, as_png=False)[source]

For use in inline rendering of graph data in Jupyter Lab.

Parameters
  • layout (str, optional) – Graph layout style. Options include: circo, dot, fdp, neato, sfdp, twopi. Default: dot.

  • orthogonal_edges (bool, optional) – Whether graph edges should be orthogonal (right-angled). Default: False.

  • orient (str, optional) –

    Graph layout orientation. Default: tb. Options include:

    • tb - top to bottom

    • bt - bottom to top

    • lr - left to right

    • rl - right to left

  • color_scheme (dict, optional) – Color scheme to be applied to graph. Default: rolling_pin.tools.COLOR_SCHEME

  • as_png (bool, optional) – Display graph as a PNG image instead of SVG. Useful for display on GitHub. Default: False.

Returns

HTML object for inline display.

Return type

IPython.display.HTML

to_networkx_graph()[source]

Converts internal dictionary into a networkx directed graph.

Returns

Graph representation of dictionary.

Return type

networkx.DiGraph

to_prototype()[source]

Convert data to prototypical representation.

Example:

>>> data = {
    'users': [
        {
            'name': {
                'first': 'tom',
                'last': 'smith',
            }
        },{
            'name': {
                'first': 'dick',
                'last': 'smith',
            }
        },{
            'name': {
                'first': 'jane',
                'last': 'doe',
            }
        },
    ]
}
>>> BlobETL(data).to_prototype().to_dict()
{
    '^users': {
        '<list_[0-9]+>': {
            'name': {
                'first$': Counter({'dick': 1, 'jane': 1, 'tom': 1}),
                'last$': Counter({'doe': 1, 'smith': 2})
            }
        }
    }
}
Returns

New BlobETL instance.

Return type

BlobETL

to_records()[source]
Returns

Data in records format.

Return type

list[dict]

update(item)[source]

Updates internal dictionary with given dictionary or BlobETL instance. Given dictionary is first flattened with embedded types.

Parameters

item (dict or BlobETL) – Dictionary to be used for update.

Returns

New BlobETL instance.

Return type

BlobETL

write(fullpath, layout='dot', orthogonal_edges=False, orient='tb', color_scheme=None)[source]

Writes internal dictionary to a given filepath. Formats supported: svg, dot, png, json.

Parameters
  • fullpath (str or Path) – File to be written to.

  • layout (str, optional) – Graph layout style. Options include: circo, dot, fdp, neato, sfdp, twopi. Default: dot.

  • orthogonal_edges (bool, optional) – Whether graph edges should be orthogonal (right-angled). Default: False.

  • orient (str, optional) –

    Graph layout orientation. Default: tb. Options include:

    • tb - top to bottom

    • bt - bottom to top

    • lr - left to right

    • rl - right to left

  • color_scheme (dict, optional) – Color scheme to be applied to graph. Default: rolling_pin.tools.COLOR_SCHEME

Raises

ValueError – If invalid file extension given.

radon_etl

class rolling_pin.radon_etl.RadonETL(fullpath)[source]

Bases: object

Conforms all four radon reports (raw metrics, Halstead, maintainability and cyclomatic complexity) into a single DataFrame that can then be plotted.

__dict__ = mappingproxy({'__module__': 'rolling_pin.radon_etl', '__doc__': '\n Conforms all four radon reports (raw metrics, Halstead, maintainability and\n cyclomatic complexity) into a single DataFrame that can then be plotted.\n ', '__init__': <function RadonETL.__init__>, 'report': <property object>, 'data': <property object>, 'raw_metrics': <property object>, 'maintainability_index': <property object>, 'cyclomatic_complexity_metrics': <property object>, 'halstead_metrics': <property object>, '_get_radon_data': <function RadonETL._get_radon_data>, '_get_radon_report': <staticmethod object>, '_get_raw_metrics_dataframe': <staticmethod object>, '_get_maintainability_index_dataframe': <staticmethod object>, '_get_cyclomatic_complexity_dataframe': <staticmethod object>, '_get_halstead_dataframe': <staticmethod object>, 'write_plots': <function RadonETL.write_plots>, 'write_tables': <function RadonETL.write_tables>, '__dict__': <attribute '__dict__' of 'RadonETL' objects>, '__weakref__': <attribute '__weakref__' of 'RadonETL' objects>})
__init__(fullpath)[source]

Constructs a RadonETL instance.

Parameters

fullpath (str or Path) – Python file or directory of python files.

__module__ = 'rolling_pin.radon_etl'
__weakref__

list of weak references to the object (if defined)

static _get_cyclomatic_complexity_dataframe(report)[source]

Converts radon cyclomatic complexity report into a pandas DataFrame.

Parameters

report (dict) – Radon report blob.

Returns

Cyclomatic complexity DataFrame.

Return type

pandas.DataFrame

static _get_halstead_dataframe(report)[source]

Converts radon Halstead report into a pandas DataFrame.

Parameters

report (dict) – Radon report blob.

Returns

Halstead DataFrame.

Return type

pandas.DataFrame

static _get_maintainability_index_dataframe(report)[source]

Converts radon maintainability index report into a pandas DataFrame.

Parameters

report (dict) – Radon report blob.

Returns

Maintainability DataFrame.

Return type

pandas.DataFrame

_get_radon_data()[source]

Constructs a DataFrame representing all the radon reports generated for a given python file or directory containing python files.

Returns

Radon report DataFrame.

Return type

pandas.DataFrame

static _get_radon_report(fullpath)[source]

Gets all 4 reports from radon and aggregates them into a single blob object.

Parameters

fullpath (str or Path) – Python file or directory of python files.

Returns

Radon report blob.

Return type

dict

static _get_raw_metrics_dataframe(report)[source]

Converts radon raw metrics report into a pandas DataFrame.

Parameters

report (dict) – Radon report blob.

Returns

Raw metrics DataFrame.

Return type

pandas.DataFrame

property cyclomatic_complexity_metrics

pandas.DataFrame: DataFrame of radon cyclomatic complexity metrics.

property data

pandas.DataFrame: DataFrame of all radon metrics.

property halstead_metrics

pandas.DataFrame: DataFrame of radon Halstead metrics.

property maintainability_index

pandas.DataFrame: DataFrame of radon maintainability index metrics.

property raw_metrics

pandas.DataFrame: DataFrame of radon raw metrics.

property report

dict: Dictionary of all radon metrics.

write_plots(fullpath)[source]

Writes metrics plots to given file.

Parameters

fullpath (Path or str) – Target file.

Returns

self.

Return type

RadonETL

write_tables(target_dir)[source]

Writes metrics tables as HTML files to given directory.

Parameters

target_dir (Path or str) – Target directory.

Returns

self.

Return type

RadonETL

app

rolling_pin.app.get_svg(data='<required>', layout='dot', orthogonal_edges=False, orient='tb', color_scheme=None)[source]

Generate an SVG string from a given JSON blob.

Parameters
  • data (dict or list) – JSON blob.

  • layout (str, optional) – Graph layout style. Options include: circo, dot, fdp, neato, sfdp, twopi. Default: dot.

  • orthogonal_edges (bool, optional) – Whether graph edges should be orthogonal (right-angled). Default: False.

  • orient (str, optional) –

    Graph layout orientation. Default: tb. Options include:

    • tb - top to bottom

    • bt - bottom to top

    • lr - left to right

    • rl - right to left

  • color_scheme (dict, optional) – Color scheme to be applied to graph. Default: rolling_pin.tools.COLOR_SCHEME

Returns

SVG string.

Return type

str

rolling_pin.app.index()[source]
rolling_pin.app.to_svg()[source]

Endpoint for converting a given JSON blob into an SVG graph.