srlearn

Overview

srlearn: A library for statistical relational learning (SRL) with a scikit-learn-style programming interface.


Motivation: Crash Course in Relational Representations

Not all prediction problems fit neatly into a 2-dimensional feature matrix \(X\) and a 1-dimensional label vector \(y\). Network data is often better represented as a collection of entities, their attributes, and their relationships:

[Figure: Simple network visualization showing six people, their friendships with one another, and whether each person smokes.]

cancer(alice).
cancer(bob).
cancer(chuck).
cancer(fred).
friends(alice,bob).
friends(alice,fred).
friends(chuck,bob).
friends(chuck,fred).
...
friends(fred,alice).
friends(bob,chuck).
friends(fred,chuck).
friends(bob,dan).
friends(bob,earl).
smokes(alice).
smokes(chuck).
smokes(bob).
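
To make the structure of these facts concrete, the following sketch parses a few of them into plain Python data structures using only the standard library. The helper `parse_fact` and the variable names are illustrative assumptions and not part of srlearn's API, and only a subset of the facts listed above is included.

```python
import re
from collections import defaultdict

# A subset of the facts listed above, as plain strings.
facts = [
    "friends(alice,bob).",
    "friends(alice,fred).",
    "friends(chuck,bob).",
    "friends(chuck,fred).",
    "smokes(alice).",
    "smokes(chuck).",
    "smokes(bob).",
]

# Matches ground facts of the form "predicate(arg1,arg2,...)."
FACT_PATTERN = re.compile(r"^(\w+)\(([^)]*)\)\.$")


def parse_fact(fact):
    """Split 'pred(a,b).' into ('pred', ('a', 'b')). Hypothetical helper."""
    match = FACT_PATTERN.match(fact.strip())
    if match is None:
        raise ValueError(f"not a ground fact: {fact!r}")
    predicate, args = match.groups()
    return predicate, tuple(arg.strip() for arg in args.split(","))


# Unary predicates act as attributes of a single entity;
# predicates with more arguments act as relationships between entities.
attributes = defaultdict(set)
relationships = defaultdict(set)

for fact in facts:
    predicate, args = parse_fact(fact)
    if len(args) == 1:
        attributes[predicate].add(args[0])
    else:
        relationships[predicate].add(args)

print(sorted(attributes["smokes"]))
print(sorted(relationships["friends"]))
```

Here `smokes` ends up as an attribute attached to individual people, while `friends` is stored as pairs, which is the distinction a flat feature matrix has trouble expressing.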

Basic Usage

The general workflow mirrors libraries like scikit-learn or Keras: (1) initialize, (2) fit, and (3) predict.

from srlearn.rdn import BoostedRDNClassifier
from srlearn import Background
from srlearn.datasets import load_toy_cancer

train, test = load_toy_cancer()

# Background knowledge about the domain, and its constraints
bk = Background(modes=train.modes)

# Instantiate a model to learn about cancer diagnoses
clf = BoostedRDNClassifier(
    background=bk,
    target="cancer",
)

clf.fit(train)

clf.predict_proba(test)
# array([0.88079619, 0.88079619, 0.88079619, 0.3075821 , 0.3075821 ])
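
The probabilities can be turned into hard labels by thresholding. The sketch below copies the values printed above into a plain Python list; the 0.5 cutoff is an assumption chosen for illustration, not a documented srlearn default.

```python
# Probabilities copied from the predict_proba output above.
probabilities = [0.88079619, 0.88079619, 0.88079619, 0.3075821, 0.3075821]

# Assumed decision threshold; pick one that suits the application.
THRESHOLD = 0.5

# True where the predicted probability of cancer meets the threshold.
labels = [p >= THRESHOLD for p in probabilities]
print(labels)
# [True, True, True, False, False]
```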

srlearn also includes utilities for model visualization and serialization:

from srlearn.plotting import export_digraph, plot_digraph

plot_digraph(export_digraph(clf, 0))

[Figure: Tree-structured network where smokes(A) implies cancer.]

The model learned that if a person A smokes, then that person is also likely to have cancer, or:

cancer(A) :- smokes(A).
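
Read as a rule, the clause suggests `cancer(A)` for every `A` for which `smokes(A)` holds. A minimal pure-Python sketch of applying it to the `smokes` facts from the motivating example follows. The function name `apply_rule` is hypothetical, and the deterministic treatment is a simplification, since the learned model outputs probabilities rather than certain conclusions.

```python
# smokes/1 facts from the motivating example.
smokes = {"alice", "chuck", "bob"}


def apply_rule(smokers):
    """Apply cancer(A) :- smokes(A). to a set of people (hypothetical helper).

    Returns the ground cancer/1 atoms supported by the rule body.
    """
    return {f"cancer({person})." for person in smokers}


for atom in sorted(apply_rule(smokes)):
    print(atom)
# cancer(alice).
# cancer(bob).
# cancer(chuck).
```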

Installation

The latest stable version can be installed from PyPI using pip:

pip install srlearn