FEAT Function to visualize skops files (#317)
BenjaminBossan authored Mar 22, 2023
1 parent 7ae70ed commit 1646b18
Showing 12 changed files with 734 additions and 7 deletions.
3 changes: 3 additions & 0 deletions docs/changes.rst
@@ -25,6 +25,9 @@ v0.6
- ``add_*`` methods on :class:`.Card` now have default section names (but
  ``None`` is no longer valid) and no longer add descriptions by default.
  :pr:`321` by `Benjamin Bossan`_.
- Add possibility to visualize a skops object and show untrusted types by using
  :func:`skops.io.visualize`. For colored output, install ``rich``:
  ``pip install rich``. :pr:`317` by `Benjamin Bossan`_.

v0.5
----
58 changes: 58 additions & 0 deletions docs/persistence.rst
@@ -134,6 +134,64 @@ For example, to convert all ``.pkl`` flies in the current directory:
Further help for the different supported options can be found by calling
``skops convert --help`` in a terminal.

Visualization
#############

Skops files can be visualized using :func:`skops.io.visualize`. If you have
a skops file called ``my-model.skops``, you can visualize it like this:

.. code:: python

    import skops.io as sio
    sio.visualize("my-model.skops")

The output could look like this:

.. code::

    root: sklearn.preprocessing._data.MinMaxScaler
    └── attrs: builtins.dict
        ├── feature_range: builtins.tuple
        │   ├── content: json-type(-555)
        │   └── content: json-type(123)
        ├── copy: unsafe_lib.UnsafeType [UNSAFE]
        ├── clip: json-type(false)
        └── _sklearn_version: json-type("1.2.0")

``unsafe_lib.UnsafeType`` was recognized as untrusted and marked.

It's also possible to visualize the object dumped as bytes:

.. code:: python

    import skops.io as sio
    my_model = ...
    sio.visualize(sio.dumps(my_model))

There are various options to customize the output. By default, the security of
nodes is color coded if `rich <https://github.com/Textualize/rich>`_ is
installed, otherwise they all have the same color. To install ``rich``, run:

.. code::

    python -m pip install rich

or, when installing skops, install it like this:

.. code::

    python -m pip install skops[rich]

To disable colors, even if ``rich`` is installed, pass ``use_colors=False`` to
:func:`skops.io.visualize`.

It's also possible to change what colors are being used, e.g. by passing
``visualize(..., color_safe="cyan")`` to change the color for trusted nodes from
green to cyan. The ``rich`` docs list the `supported standard colors
<https://rich.readthedocs.io/en/stable/appendix/colors.html>`_.

Note that the visualization feature is intended to help understand the structure
of the object, e.g. what attributes are identified as untrusted. It is not a
replacement for a proper security check. In particular, just because an object's
visualization looks innocent does *not* mean you can just call
``sio.load(<file>, trusted=True)`` on this object -- only pass the types you
really trust to the ``trusted`` argument.

Supported libraries
-------------------
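The tree shown in the documentation above can be approximated with a short recursive renderer. This is only a sketch of the general technique, not the code added in `skops/io/_visualize.py`: `TreeNode` and `render` are hypothetical names, and the sample tree is copied by hand from the documented output.

```python
from dataclasses import dataclass, field


@dataclass
class TreeNode:
    """Hypothetical node: attribute name, formatted label, safety flag."""

    name: str
    label: str
    is_safe: bool = True
    children: list = field(default_factory=list)


def render(node, prefix="", is_last=True, is_root=True):
    """Return the lines of a box-drawing tree for `node` and its children."""
    text = f"{node.name}: {node.label}"
    if not node.is_safe:
        text += " [UNSAFE]"  # mark untrusted nodes like the documented output
    if is_root:
        lines = [text]
        child_prefix = ""
    else:
        connector = "└── " if is_last else "├── "
        lines = [prefix + connector + text]
        # Keep a vertical bar only while siblings still follow this node.
        child_prefix = prefix + ("    " if is_last else "│   ")
    for i, child in enumerate(node.children):
        last = i == len(node.children) - 1
        lines.extend(render(child, child_prefix, last, is_root=False))
    return lines


tree = TreeNode("root", "sklearn.preprocessing._data.MinMaxScaler", children=[
    TreeNode("attrs", "builtins.dict", children=[
        TreeNode("feature_range", "builtins.tuple", children=[
            TreeNode("content", "json-type(-555)"),
            TreeNode("content", "json-type(123)"),
        ]),
        TreeNode("copy", "unsafe_lib.UnsafeType", is_safe=False),
        TreeNode("clip", "json-type(false)"),
        TreeNode("_sklearn_version", 'json-type("1.2.0")'),
    ]),
])
lines = render(tree)
print("\n".join(lines))
```

The safety flag is carried on each node so that a renderer can mark or color it independently of the tree layout.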
1 change: 1 addition & 0 deletions setup.py
@@ -77,6 +77,7 @@ def setup_package():
        extras_require={
            "docs": min_deps.tag_to_packages["docs"],
            "tests": min_deps.tag_to_packages["tests"],
            "rich": min_deps.tag_to_packages["rich"],
        },
        include_package_data=True,
    )
6 changes: 4 additions & 2 deletions skops/_min_dependencies.py
@@ -6,7 +6,8 @@
 # 'build' and 'install' is included to have structured metadata for CI.
 # It will NOT be included in setup's extras_require
 # The values are (version_spec, comma separated tags, condition)
-# tags can be: 'build', 'install', 'docs', 'examples', 'tests', 'benchmark'
+# tags can be: 'build', 'install', 'docs', 'examples', 'tests', 'benchmark',
+# 'rich'
 # example:
 # "tomli": ("1.1.0", "install", "python_full_version < '3.11.0a7'"),
 dependent_packages = {
@@ -34,13 +35,14 @@
     # TODO: remove condition when catboost supports python 3.11
     "catboost": ("1.0", "tests", "python_version < '3.11'"),
     "fairlearn": ("0.7.0", "docs, tests", None),
+    "rich": ("12", "tests, rich", None),
 }


 # create inverse mapping for setuptools
 tag_to_packages: dict = {
     extra: []
-    for extra in ["build", "install", "docs", "examples", "tests", "benchmark"]
+    for extra in ["build", "install", "docs", "examples", "tests", "benchmark", "rich"]
 }
 for package, (min_version, extras, condition) in dependent_packages.items():
     for extra in extras.split(", "):
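The hunk above is cut off inside the inner loop, so the following is a sketch of how such an inverse mapping is commonly completed, not a quote of the file. The loop body (building a `name>=version` requirement string with an optional `; marker` suffix) is an assumption, and the dictionary is trimmed to three entries taken from the diff.

```python
# Trimmed copy of the mapping from the diff; each value is
# (min_version, comma-separated tags, optional environment marker).
dependent_packages = {
    "catboost": ("1.0", "tests", "python_version < '3.11'"),
    "fairlearn": ("0.7.0", "docs, tests", None),
    "rich": ("12", "tests, rich", None),
}

# Inverse mapping: tag -> list of requirement strings.
tag_to_packages = {extra: [] for extra in ["docs", "tests", "rich"]}
for package, (min_version, extras, condition) in dependent_packages.items():
    for extra in extras.split(", "):
        # Assumed loop body: a requirement string with an optional marker.
        requirement = f"{package}>={min_version}"
        if condition:
            requirement += f"; {condition}"
        tag_to_packages[extra].append(requirement)

print(tag_to_packages["rich"])
print(tag_to_packages["tests"])
```

Because `rich` carries both the `tests` and `rich` tags, it would end up in the test requirements and in the new `skops[rich]` extra that `setup.py` registers.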
13 changes: 13 additions & 0 deletions skops/conftest.py
@@ -43,3 +43,16 @@ def mock_import(name, *args, **kwargs):
        yield

    import matplotlib  # noqa


@pytest.fixture
def rich_not_installed():
    orig_import = builtins.__import__

    def mock_import(name, *args, **kwargs):
        if name == "rich":
            raise ImportError
        return orig_import(name, *args, **kwargs)

    with patch("builtins.__import__", side_effect=mock_import):
        yield
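The fixture above makes `import rich` fail for the duration of a test by patching `builtins.__import__`. The same technique can be tried outside pytest with a plain context manager. In this sketch `simulate_missing` and `is_importable` are hypothetical helpers, and the standard-library `json` module stands in for `rich` so that the result does not depend on what is installed.

```python
import builtins
from contextlib import contextmanager
from unittest.mock import patch


@contextmanager
def simulate_missing(module_name):
    """Make importing `module_name` fail inside the with-block."""
    orig_import = builtins.__import__

    def mock_import(name, *args, **kwargs):
        if name == module_name:
            raise ImportError(f"simulated: {name} is not installed")
        return orig_import(name, *args, **kwargs)

    with patch("builtins.__import__", side_effect=mock_import):
        yield


def is_importable(module_name):
    """Return True if importing `module_name` succeeds right now."""
    try:
        # Goes through builtins.__import__, like an import statement does.
        __import__(module_name)
    except ImportError:
        return False
    return True


before = is_importable("json")
with simulate_missing("json"):
    during = is_importable("json")
after = is_importable("json")
print(before, during, after)
```

Note that the patch intercepts the import even when the module is already cached in `sys.modules`, because the import statement always calls `builtins.__import__` first. Code that imports through `importlib.import_module` would not be affected by this approach.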
3 changes: 2 additions & 1 deletion skops/io/__init__.py
@@ -1,3 +1,4 @@
 from ._persist import dump, dumps, get_untrusted_types, load, loads
+from ._visualize import visualize

-__all__ = ["dumps", "load", "loads", "dump", "get_untrusted_types"]
+__all__ = ["dumps", "load", "loads", "dump", "get_untrusted_types", "visualize"]
4 changes: 4 additions & 0 deletions skops/io/_audit.py
@@ -287,6 +287,10 @@ def get_unsafe_set(self) -> set[str]:

        return res

    def format(self) -> str:
        """Representation of the node's content."""
        return f"{self.module_name}.{self.class_name}"


class CachedNode(Node):
    def __init__(
8 changes: 8 additions & 0 deletions skops/io/_general.py
@@ -482,6 +482,14 @@ def get_unsafe_set(self) -> set[str]:
    def _construct(self):
        return json.loads(self.content)

    def format(self) -> str:
        """Representation of the node's content.

        Since no module is used, just show the content.

        """
        return f"json-type({self.content})"


def bytes_get_state(obj: Any, save_context: SaveContext) -> dict[str, Any]:
    f_name = f"{uuid.uuid4()}.bin"
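The two `format` methods added in `_audit.py` and `_general.py` give every node a one-line label. Since `_construct` decodes the content with `json.loads`, the content appears to be stored as JSON text, which would explain why the documented output shows `false` and a quoted `"1.2.0"`. The sketch below illustrates this with hypothetical stand-in classes (`TypeNode`, `JsonNode`), not the real skops node classes.

```python
import json


class TypeNode:
    """Hypothetical stand-in for a node that wraps a Python type."""

    def __init__(self, module_name, class_name):
        self.module_name = module_name
        self.class_name = class_name

    def format(self):
        return f"{self.module_name}.{self.class_name}"


class JsonNode:
    """Hypothetical stand-in for a node whose content is JSON text."""

    def __init__(self, value):
        # Assumption: the content is kept in its JSON-encoded form.
        self.content = json.dumps(value)

    def format(self):
        return f"json-type({self.content})"


nodes = [
    TypeNode("builtins", "dict"),
    JsonNode(False),
    JsonNode("1.2.0"),
    JsonNode(-555),
]
labels = [node.format() for node in nodes]
print(labels)
```

A visualizer can then call `format()` on any node without checking its type.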
4 changes: 2 additions & 2 deletions skops/io/_scipy.py
@@ -40,9 +40,9 @@ def __init__(
         trusted: bool | Sequence[str] = False,
     ) -> None:
         super().__init__(state, load_context, trusted)
-        type = state["type"]
+        self.type = state["type"]
         self.trusted = self._get_trusted(trusted, [spmatrix])
-        if type != "scipy":
+        if self.type != "scipy":
             raise TypeError(
                 f"Cannot load object of type {self.module_name}.{self.class_name}"
             )