Analysing Machine Learning Models With Imandra

In this notebook we show how Imandra can be used to analyse and reason about models that have been learnt from data, an important and exciting topic bridging the gap between formal methods and machine learning (ML). Brief notes and some links are included, but for a fuller explanation (with code snippets included for reference) see our corresponding Medium post. You can also find all of our code for both learning and analysing our models on GitHub.

To illustrate this approach we'll be looking at (relatively simple) examples from two of the most common tasks within supervised learning (and ML more generally): classification and regression. In particular, we'll show how two of the most common kinds of model used to perform these tasks, random forests and neural networks, can be analysed using Imandra. For each task we'll use a real-world benchmark dataset from the UCI Machine Learning Repository and create our models using Python with some standard ML libraries.

We'll mostly be working with reals in this notebook so we'll start by installing a pretty printer so that we're not overrun with digits.

In [1]:
let pp_approx fmt r = CCFormat.fprintf fmt "%s" (Real.to_string_approx r) [@@program]
#install_printer pp_approx
Out[1]:
val pp_approx : Format.formatter -> real -> unit = <fun>

Classification

In a classification task we want to learn to predict the label of a datapoint based on previous data. In the classic Wisconsin Breast Cancer (Diagnostic) dataset) the task is to predict whether the cancer is benign or malignant based on the features of cell nuclei. In the dataset we have the following variables:

1. ID number
2. Diagnosis (malignant or benign)
3-32. Real values for the mean, standard error, and the 'worst' value for each cell nucleus'
      a) Radius
      b) Texture
      c) Perimeter
      d) Area
      e) Smoothness
      f) Compactness
      g) Concavity
      h) Concave points
      i) Symmetry
      j) Fractal dimension

As is standard practice we pre-process the data before learning. First we standardise each variable to have zero mean and unit variance, then remove all but one from sets of highly correlated variables, along with those that have low mutual information with respect to the target variable. The data is split into training (80%) and test (20%) sets and we use Scikit-Learn to learn a random forest of 3 decision trees of maximum depth 3. As this is a relatively straightforward problem even this simple model achieves a fairly high accuracy. Using a short Python script each tree is then converted to Imandra Modelling Language (IML) and can be reasoned about using Imandra.

In [2]:
let tree_0 (f_0 : real) (f_1 : real) (f_2 : real) (f_3 : real) (f_4 : real) (f_5 : real) (f_6 : real) = let open Real in
  if f_2 <=. (-0.10815) then
    if f_0 <=. (0.26348) then
      if f_6 <=. (-0.06176) then
        (236.0, 1.0)
      else
        (17.0, 5.0)
    else
      if f_3 <=. (-0.54236) then
        (8.0, 2.0)
      else
        (3.0, 7.0)
  else
    if f_6 <=. (0.09812) then
      if f_6 <=. (-0.17063) then
        (24.0, 0.0)
      else
        (4.0, 2.0)
    else
      if f_2 <=. (2.65413) then
        (6.0, 128.0)
      else
        (7.0, 5.0);;

let tree_1 (f_0 : real) (f_1 : real) (f_2 : real) (f_3 : real) (f_4 : real) (f_5 : real) (f_6 : real) = let open Real in
  if f_5 <=. (-0.05799) then
    if f_0 <=. (0.68524) then
      if f_1 <=. (-0.83180) then
        (110.0, 3.0)
      else
        (137.0, 0.0)
    else
      if f_3 <=. (0.45504) then
        (1.0, 8.0)
      else
        (0.0, 7.0)
  else
    if f_0 <=. (-0.18668) then
      if f_6 <=. (0.45214) then
        (39.0, 0.0)
      else
        (2.0, 11.0)
    else
      if f_6 <=. (-0.00009) then
        (8.0, 4.0)
      else
        (5.0, 120.0);;

let tree_2 (f_0 : real) (f_1 : real) (f_2 : real) (f_3 : real) (f_4 : real) (f_5 : real) (f_6 : real) = let open Real in
  if f_2 <=. (0.10459) then
    if f_5 <=. (-0.38015) then
      if f_5 <=. (-0.60659) then
        (139.0, 1.0)
      else
        (44.0, 3.0)
    else
      if f_6 <=. (-0.07927) then
        (38.0, 2.0)
      else
        (25.0, 17.0)
  else
    if f_6 <=. (0.46888) then
      if f_3 <=. (0.41642) then
        (28.0, 3.0)
      else
        (1.0, 4.0)
    else
      if f_2 <=. (1.74327) then
        (3.0, 122.0)
      else
        (4.0, 21.0);;

let rf (f_0, f_1, f_2, f_3, f_4, f_5, f_6) = let open Real in
let (a_0, b_0) = tree_0 f_0 f_1 f_2 f_3 f_4 f_5 f_6 in
let (a_1, b_1) = tree_1 f_0 f_1 f_2 f_3 f_4 f_5 f_6 in
let (a_2, b_2) = tree_2 f_0 f_1 f_2 f_3 f_4 f_5 f_6 in
let a = a_0 + a_1 + a_2 in
let b = b_0 + b_1 + b_2 in
(a, b);;
Out[2]:
val tree_0 :
  real -> real -> real -> real -> real -> real -> real -> real * real = <fun>
val tree_1 :
  real -> real -> real -> real -> real -> real -> real -> real * real = <fun>
val tree_2 :
  real -> real -> real -> real -> real -> real -> real -> real * real = <fun>
val rf : real * real * real * real * real * real * real -> real * real =
  <fun>

We can create a custom input type in Imandra for our model, so that we can keep track of the different features of our data.

In [3]:
type rf_input = {
  radius_mean : real;
  compactness_mean : real;
  concavity_mean : real;
  radius_se : real;
  compactness_worst : real;
  concavity_worst : real;
  concave_points_worst : real;
}
Out[3]:
type rf_input = {
  radius_mean : real;
  compactness_mean : real;
  concavity_mean : real;
  radius_se : real;
  compactness_worst : real;
  concavity_worst : real;
  concave_points_worst : real;
}

However, remember that we also processed our data before learning. To make things easier we'll add in a function applying this transformation to each input variable. Here we simply use some multiplicative and additive scaling values extracted during our data pre-processing stage. After that we can define a full model which combines these pre-processing steps and the random forest.

In [4]:
let process_rf_input input = let open Real in
let f_0 = (input.radius_mean          - 14.12729) / 3.52405 in
let f_1 = (input.compactness_mean     - 0.10434)  / 0.05281 in
let f_2 = (input.concavity_mean       - 0.08880)  / 0.07972 in
let f_3 = (input.radius_se            - 0.40517)  / 0.27731 in
let f_4 = (input.compactness_worst    - 0.25427)  / 0.15734 in
let f_5 = (input.concavity_worst      - 0.27219)  / 0.20862 in
let f_6 = (input.concave_points_worst - 0.11461)  / 0.06573 in
(f_0, f_1, f_2, f_3, f_4, f_5, f_6)

let process_rf_output c =
let (a, b) = c in
if a >. b then "benign" else "malignant"

let rf_model input = input |> process_rf_input |> rf |> process_rf_output
Out[4]:
val process_rf_input :
  rf_input -> real * real * real * real * real * real * real = <fun>
val process_rf_output : real * real -> string = <fun>
val rf_model : rf_input -> string = <fun>

As our model is fully executable we can both query it as well as find counterexamples, prove properties, apply logical side-conditions, decompose its regions, and more. As a quick sanity check to make sure everything is working, let's run a datum from our dataset through the model. In particular, we'll input (17.99, 0.2776, 0.3001, 1.095, 0.6656, 0.7119, 0.2654) which is classified as malignant in the data.

In [5]:
let x = {
  radius_mean = 17.99;
  compactness_mean = 0.2776;
  concavity_mean = 0.3001;
  radius_se = 1.095;
  compactness_worst = 0.6656;
  concavity_worst = 0.7119;
  concave_points_worst = 0.7119;
}

let y = rf_model x
Out[5]:
val x : rf_input =
  {radius_mean = 17.99; compactness_mean = 0.2776; concavity_mean = 0.3001;
   radius_se = 1.095; compactness_worst = 0.6656; concavity_worst = 0.7119;
   concave_points_worst = 0.7119}
val y : string = "malignant"

Great, just what we'd expect. Now we'll use Imandra to generate an example datapoint for us given that diagnosis is benign.

In [6]:
instance (fun x -> rf_model x = "benign")
Out[6]:
- : rf_input -> bool = <fun>
module CX : sig val x : rf_input end
Instance (after 0 steps, 0.040s):
 let (x : rf_input) =
  {radius_mean = -1142.0; compactness_mean = -15921.0;
   concavity_mean = -28100.0; radius_se = 281.0; compactness_worst = 9.0;
   concavity_worst = -20537.0;
   concave_points_worst = (Real.mk_of_string "86761759/78125000")}
Instance
proof attempt
ground_instances0
definitions0
inductions0
search_time
0.040s
details
Expand
smt_stats
num checks2
arith-make-feasible11
arith-max-columns40
rlimit count3501
mk clause69
datatype occurs check25
mk bool var150
arith-lower14
decisions21
arith-propagations6
propagations42
arith-bound-propagations-cheap6
arith-max-rows24
datatype accessor ax28
datatype constructor ax4
num allocs873473771
final checks1
added eqs87
arith eq adapter6
arith-upper8
memory21.070000
max memory21.700000
Expand
  • start[0.040s]
      let (_x_0 : real) = ….2 in
      let (_x_1 : bool) = _x_0 <=. -2163/20000 in
      let (_x_2 : bool) = ….5 <=. -5799/100000 in
      let (_x_3 : bool) = _x_0 <=. 10459/100000 in
      (if ((if _x_1 then … else …).0 +. (if _x_2 then … else …).0
           +. (if _x_3 then … else …).0)
          >.
          ((if _x_1 then … else …).1 +. (if _x_2 then … else …).1
           +. (if _x_3 then … else …).1)
       then "benign" else "malignant")
      = "benign"
  • simplify

    into
    let (_x_0 : real) = :var_0:.concavity_mean in
    let (_x_1 : real) = :var_0:.concave_points_worst in
    let (_x_2 : real) = :var_0:.radius_se in
    let (_x_3 : real) = :var_0:.radius_mean in
    let (_x_4 : (real * real))
        = if _x_0 <=. 40089141/500000000
          then
            if _x_3 <=. 7527903347/500000000
            then if _x_1 <=. 8636759/78125000 then … else …
            else if _x_2 <=. 636920371/2500000000 then … else …
          else
          if _x_1 <=. 302648569/2500000000
          then if _x_1 <=. 1033944901/10000000000 then … else …
          else if _x_0 <=. 750968109/2500000000 then … else …
    in
    let (_x_5 : real) = :var_0:.concavity_worst in
    let (_x_6 : (real * real))
        = if _x_5 <=. 1300460631/5000000000
          then
            if _x_3 <=. 8271055011/500000000
            then
              if :var_0:.compactness_mean <=. 30206321/500000000 then …
              else …
            else if _x_2 <=. 166049107/312500000 then … else …
          else
          if _x_3 <=. 6734710173/500000000
          then if _x_1 <=. 721645811/5000000000 then … else …
          else if _x_1 <=. 1146040843/10000000000 then … else …
    in
    let (_x_7 : (real * real))
        = if _x_0 <=. 242844787/2500000000
          then
            if _x_5 <=. 192883107/1000000000
            then if _x_5 <=. 728215971/5000000000 then … else …
            else if _x_1 <=. 1093995829/10000000000 then … else …
          else
          if _x_1 <=. 181786853/1250000000
          then if _x_2 <=. 2603237151/5000000000 then … else …
          else if _x_0 <=. 569433711/2500000000 then … else …
    in
    (if (_x_4.0 +. _x_6.0 +. _x_7.0) <=. (_x_4.1 +. _x_6.1 +. _x_7.1)
     then "malignant" else "benign")
    = "benign"
    expansions
    []
    rewrite_steps
      forward_chaining
      • Sat (Some let (x : rf_input) = {radius_mean = -1142.0; compactness_mean = -15921.0; concavity_mean = -28100.0; radius_se = 281.0; compactness_worst = 9.0; concavity_worst = -20537.0; concave_points_worst = (Real.mk_of_string "86761759/78125000")} )
      In [7]:
      CX.x
      
      Out[7]:
      - : rf_input =
      {radius_mean = -1142.; compactness_mean = -15921.; concavity_mean = -28100.;
       radius_se = 281.; compactness_worst = 9.; concavity_worst = -20537.;
       concave_points_worst = 1.1105505152}
      

      This looks a bit funny however; notice how the unspecified input variables are unbounded in a way that doesn't make sense with respect to the data. In general we might only care about the performance of our model when some reasonable bounds are placed on the input (for example, the mean radius can't be negative, and if the values for this variable in our dataset range between 6.98 and 28.11 we wouldn't really expect any value greater than, say, 35). Using the description of each variable in the dataset we can form a condition describing valid and reasonable inputs to our model. In machine learning more generally, we are typically only interested in the performance and quality of a model over some particular distribution of data, which we often have particular prior beliefs about.

      In [8]:
      let is_valid_rf input =
        5.0 <=. input.radius_mean && input.radius_mean <=. 35.0 &&
        0.0 <=. input.compactness_mean && input.compactness_mean <=. 0.4 &&
        0.0 <=. input.concavity_mean && input.concavity_mean <=. 0.5 &&
        0.0 <=. input.radius_se && input.radius_se <=. 3.5 &&
        0.0 <=. input.compactness_worst && input.compactness_worst <=. 1.2 &&
        0.0 <=. input.concavity_worst && input.concavity_worst <=. 1.5 &&
        0.0 <=. input.concave_points_worst && input.concave_points_worst <=. 0.35
      
      instance (fun x -> rf_model x = "benign" && is_valid_rf x)
      
      Out[8]:
      val is_valid_rf : rf_input -> bool = <fun>
      - : rf_input -> bool = <fun>
      module CX : sig val x : rf_input end
      
      Instance (after 0 steps, 0.054s):
       let (x : rf_input) =
        {radius_mean = 5.0; compactness_mean = 0.0; concavity_mean = 0.0;
         radius_se = 0.0; compactness_worst = 0.0; concavity_worst = 0.0;
         concave_points_worst = 0.0}
      
      Instance
      proof attempt
      ground_instances0
      definitions0
      inductions0
      search_time
      0.054s
      details
      Expand
      smt_stats
      num checks2
      arith-assume-eqs7
      arith-make-feasible21
      arith-max-columns46
      rlimit count4510
      arith-cheap-eqs8
      mk clause95
      datatype occurs check175
      mk bool var183
      arith-lower29
      decisions30
      arith-propagations11
      propagations51
      interface eqs7
      arith-bound-propagations-cheap11
      arith-max-rows29
      datatype accessor ax28
      arith-bound-propagations-lp1
      datatype constructor ax4
      num allocs911124778
      final checks8
      added eqs97
      arith eq adapter13
      arith-upper19
      memory23.940000
      max memory24.590000
      Expand
      • start[0.054s]
          let (_x_0 : real) = ….2 in
          let (_x_1 : bool) = _x_0 <=. -2163/20000 in
          let (_x_2 : bool) = ….5 <=. -5799/100000 in
          let (_x_3 : bool) = _x_0 <=. 10459/100000 in
          let (_x_4 : real) = :var_0:.radius_mean in
          let (_x_5 : real) = :var_0:.compactness_mean in
          let (_x_6 : real) = :var_0:.concavity_mean in
          let (_x_7 : real) = :var_0:.radius_se in
          let (_x_8 : real) = :var_0:.compactness_worst in
          let (_x_9 : real) = :var_0:.concavity_worst in
          let (_x_10 : real) = :var_0:.concave_points_worst in
          (if ((if _x_1 then … else …).0 +. (if _x_2 then … else …).0
               +. (if _x_3 then … else …).0)
              >.
              ((if _x_1 then … else …).1 +. (if _x_2 then … else …).1
               +. (if _x_3 then … else …).1)
           then "benign" else "malignant")
          = "benign"
          && 5 <=. _x_4
             && _x_4 <=. 35
                && 0 <=. _x_5
                   && _x_5 <=. 2/5
                      && 0 <=. _x_6
                         && _x_6 <=. 1/2
                            && 0 <=. _x_7
                               && _x_7 <=. 7/2
                                  && 0 <=. _x_8
                                     && _x_8 <=. 6/5
                                        && 0 <=. _x_9
                                           && _x_9 <=. 3/2
                                              && 0 <=. _x_10 && _x_10 <=. 7/20
      • simplify

        into
        let (_x_0 : real) = :var_0:.concavity_mean in
        let (_x_1 : real) = :var_0:.concave_points_worst in
        let (_x_2 : real) = :var_0:.radius_se in
        let (_x_3 : real) = :var_0:.radius_mean in
        let (_x_4 : (real * real))
            = if _x_0 <=. 40089141/500000000
              then
                if _x_3 <=. 7527903347/500000000
                then if _x_1 <=. 8636759/78125000 then … else …
                else if _x_2 <=. 636920371/2500000000 then … else …
              else
              if _x_1 <=. 302648569/2500000000
              then if _x_1 <=. 1033944901/10000000000 then … else …
              else if _x_0 <=. 750968109/2500000000 then … else …
        in
        let (_x_5 : real) = :var_0:.compactness_mean in
        let (_x_6 : real) = :var_0:.concavity_worst in
        let (_x_7 : (real * real))
            = if _x_6 <=. 1300460631/5000000000
              then
                if _x_3 <=. 8271055011/500000000
                then if _x_5 <=. 30206321/500000000 then … else …
                else if _x_2 <=. 166049107/312500000 then … else …
              else
              if _x_3 <=. 6734710173/500000000
              then if _x_1 <=. 721645811/5000000000 then … else …
              else if _x_1 <=. 1146040843/10000000000 then … else …
        in
        let (_x_8 : (real * real))
            = if _x_0 <=. 242844787/2500000000
              then
                if _x_6 <=. 192883107/1000000000
                then if _x_6 <=. 728215971/5000000000 then … else …
                else if _x_1 <=. 1093995829/10000000000 then … else …
              else
              if _x_1 <=. 181786853/1250000000
              then if _x_2 <=. 2603237151/5000000000 then … else …
              else if _x_0 <=. 569433711/2500000000 then … else …
        in
        let (_x_9 : real) = :var_0:.compactness_worst in
        ((((((((((((((if (_x_4.0 +. _x_7.0 +. _x_8.0) <=.
                         (_x_4.1 +. _x_7.1 +. _x_8.1)
                      then "malignant" else "benign")
                     = "benign" && 5 <=. _x_3)
                    && _x_3 <=. 35)
                   && 0 <=. _x_5)
                  && _x_5 <=. 2/5)
                 && 0 <=. _x_0)
                && _x_0 <=. 1/2)
               && 0 <=. _x_2)
              && _x_2 <=. 7/2)
             && 0 <=. _x_9)
            && _x_9 <=. 6/5)
           && 0 <=. _x_6)
          && _x_6 <=. 3/2)
         && 0 <=. _x_1)
        && _x_1 <=. 7/20
        expansions
        []
        rewrite_steps
          forward_chaining
          • Sat (Some let (x : rf_input) = {radius_mean = 5.0; compactness_mean = 0.0; concavity_mean = 0.0; radius_se = 0.0; compactness_worst = 0.0; concavity_worst = 0.0; concave_points_worst = 0.0} )
          In [9]:
          CX.x
          
          Out[9]:
          - : rf_input =
          {radius_mean = 5.; compactness_mean = 0.; concavity_mean = 0.;
           radius_se = 0.; compactness_worst = 0.; concavity_worst = 0.;
           concave_points_worst = 0.}
          

          This looks much better. Now let's move on to reasoning about our model in more interesting ways. One thing we can do is check the validity of certain constraints we might want our model to satisfy. For example, if the surface of a cell nucleus has many, large concave sections then is a particularly negative sign indicating that the cancer is likely to be malignant. We can use Imandra to easily verify that our model always classifies a sample of highly concave cells as malignant.

          In [10]:
          verify (fun x -> is_valid_rf x
                  && x.concavity_mean >=. 0.4
                  && x.concavity_worst >=. 1.0
                  && x.concave_points_worst >=. 0.25
                  ==> rf_model x = "malignant")
          
          Out[10]:
          - : rf_input -> bool = <fun>
          
          Proved
          proof
          ground_instances0
          definitions0
          inductions0
          search_time
          0.038s
          details
          Expand
          smt_stats
          num checks2
          arith-make-feasible7
          arith-max-columns41
          arith-conflicts2
          rlimit count4225
          mk clause51
          mk bool var173
          arith-lower16
          decisions2
          arith-propagations15
          propagations22
          arith-bound-propagations-cheap15
          arith-max-rows24
          conflicts2
          datatype accessor ax28
          datatype constructor ax4
          num allocs949715940
          added eqs94
          del clause4
          arith eq adapter8
          arith-upper30
          memory26.740000
          max memory27.340000
          Expand
          • start[0.038s]
              let (_x_0 : real) = :var_0:.radius_mean in
              let (_x_1 : real) = :var_0:.compactness_mean in
              let (_x_2 : real) = :var_0:.concavity_mean in
              let (_x_3 : real) = :var_0:.radius_se in
              let (_x_4 : real) = :var_0:.compactness_worst in
              let (_x_5 : real) = :var_0:.concavity_worst in
              let (_x_6 : real) = :var_0:.concave_points_worst in
              let (_x_7 : real) = ….2 in
              let (_x_8 : bool) = _x_7 <=. -2163/20000 in
              let (_x_9 : bool) = ….5 <=. -5799/100000 in
              let (_x_10 : bool) = _x_7 <=. 10459/100000 in
              (5 <=. _x_0
               && _x_0 <=. 35
                  && 0 <=. _x_1
                     && _x_1 <=. 2/5
                        && 0 <=. _x_2
                           && _x_2 <=. 1/2
                              && 0 <=. _x_3
                                 && _x_3 <=. 7/2
                                    && 0 <=. _x_4
                                       && _x_4 <=. 6/5
                                          && 0 <=. _x_5
                                             && _x_5 <=. 3/2
                                                && 0 <=. _x_6 && _x_6 <=. 7/20)
              && _x_2 >=. 2/5 && _x_5 >=. 1 && _x_6 >=. 1/4
              ==> (if ((if _x_8 then … else …).0 +. (if _x_9 then … else …).0
                       +. (if _x_10 then … else …).0)
                      >.
                      ((if _x_8 then … else …).1 +. (if _x_9 then … else …).1
                       +. (if _x_10 then … else …).1)
                   then "benign" else "malignant")
                  = "malignant"
          • simplify

            into
            let (_x_0 : real) = :var_0:.radius_mean in
            let (_x_1 : real) = :var_0:.compactness_mean in
            let (_x_2 : real) = :var_0:.concavity_mean in
            let (_x_3 : real) = :var_0:.radius_se in
            let (_x_4 : real) = :var_0:.compactness_worst in
            let (_x_5 : real) = :var_0:.concavity_worst in
            let (_x_6 : real) = :var_0:.concave_points_worst in
            let (_x_7 : (real * real))
                = if _x_2 <=. 40089141/500000000
                  then
                    if _x_0 <=. 7527903347/500000000
                    then if _x_6 <=. 8636759/78125000 then … else …
                    else if _x_3 <=. 636920371/2500000000 then … else …
                  else
                  if _x_6 <=. 302648569/2500000000
                  then if _x_6 <=. 1033944901/10000000000 then … else …
                  else if _x_2 <=. 750968109/2500000000 then … else …
            in
            let (_x_8 : (real * real))
                = if _x_5 <=. 1300460631/5000000000
                  then
                    if _x_0 <=. 8271055011/500000000
                    then if _x_1 <=. 30206321/500000000 then … else …
                    else if _x_3 <=. 166049107/312500000 then … else …
                  else
                  if _x_0 <=. 6734710173/500000000
                  then if _x_6 <=. 721645811/5000000000 then … else …
                  else if _x_6 <=. 1146040843/10000000000 then … else …
            in
            let (_x_9 : (real * real))
                = if _x_2 <=. 242844787/2500000000
                  then
                    if _x_5 <=. 192883107/1000000000
                    then if _x_5 <=. 728215971/5000000000 then … else …
                    else if _x_6 <=. 1093995829/10000000000 then … else …
                  else
                  if _x_6 <=. 181786853/1250000000
                  then if _x_3 <=. 2603237151/5000000000 then … else …
                  else if _x_2 <=. 569433711/2500000000 then … else …
            in
            not
            ((((((((((((((((5 <=. _x_0 && _x_0 <=. 35) && 0 <=. _x_1) && _x_1 <=. 2/5)
                         && 0 <=. _x_2)
                        && _x_2 <=. 1/2)
                       && 0 <=. _x_3)
                      && _x_3 <=. 7/2)
                     && 0 <=. _x_4)
                    && _x_4 <=. 6/5)
                   && 0 <=. _x_5)
                  && _x_5 <=. 3/2)
                 && 0 <=. _x_6)
                && _x_6 <=. 7/20)
               && _x_2 >=. 2/5)
              && _x_5 >=. 1)
             && _x_6 >=. 1/4)
            || (if (_x_7.0 +. _x_8.0 +. _x_9.0) <=. (_x_7.1 +. _x_8.1 +. _x_9.1)
                then "malignant" else "benign")
               = "malignant"
            expansions
            []
            rewrite_steps
              forward_chaining
              • Unsat

              The nested if ... then ... else statements in how the trees are defined mean that they are a prime candidate for Imandra's region decomposition functionality. As well as the total model we can of course also decompose the individual trees making up the ensemble.

              In [11]:
              Modular_decomp.top ~assuming:"is_valid_rf" "rf_model"
              
              Out[11]:
              - : Modular_decomposition.t =
              {Imandra_interactive.Modular_decomp.MD.md_session = 1i;
               md_f =
                {Imandra_surface.Uid.name = "rf_model"; id = <abstr>;
                 special_tag = <abstr>; namespace = <abstr>;
                 chash = Some OdDjCbSrRn0cKge4O2CXI4yUH5xJkTvG8tDCwV3WegI;
                 depth = (5i, 2i)};
               md_args = [(input : rf_input)]; md_regions = <abstr>}
              
              Regions details

              No group selected.

              • Concrete regions are numbered
              • Unnumbered regions are groups whose children share a particular constraint
              • Click on a region to view its details
              • Double click on a region to zoom in on it
              • Shift+double click to zoom out
              • Hit escape to reset back to the top
              decomp of (rf_model input
              Reg_idConstraintsInvariant
              511
              • not (((input.concavity_mean -. 111/1250) /. 1993/25000) <=. 10459/100000)
              • not
                (((input.concave_points_worst -. 11461/100000) /. 6573/100000) <=. 5861/12500)
              • not (((input.concavity_mean -. 111/1250) /. 1993/25000) <=. 174327/100000)
              • ((input.concavity_worst -. 27219/100000) /. 10431/50000) <=. -5799/100000
              • ((input.radius_mean -. 1412729/100000) /. 70481/20000) <=. 17131/25000
              • ((input.compactness_mean -. 5217/50000) /. 5281/100000) <=. -4159/5000
              • not (((input.concavity_mean -. 111/1250) /. 1993/25000) <=. -2163/20000)
              • not
                (((input.concave_points_worst -. 11461/100000) /. 6573/100000) <=. 2453/25000)
              • not (((input.concavity_mean -. 111/1250) /. 1993/25000) <=. 265413/100000)
              • 5 <=. input.radius_mean
              • input.radius_mean <=. 35
              • 0 <=. input.compactness_mean
              • input.compactness_mean <=. 2/5
              • 0 <=. input.concavity_mean
              • input.concavity_mean <=. 1/2
              • 0 <=. input.radius_se
              • input.radius_se <=. 7/2
              • 0 <=. input.compactness_worst
              • input.compactness_worst <=. 6/5
              • 0 <=. input.concavity_worst
              • input.concavity_worst <=. 3/2
              • 0 <=. input.concave_points_worst
              • input.concave_points_worst <=. 7/20
              "benign"
              510
              • not (((input.concavity_mean -. 111/1250) /. 1993/25000) <=. 10459/100000)
              • ((input.concave_points_worst -. 11461/100000) /. 6573/100000) <=. 5861/12500
              • not (((input.radius_se -. 40517/100000) /. 27731/100000) <=. 20821/50000)
              • ((input.concavity_worst -. 27219/100000) /. 10431/50000) <=. -5799/100000
              • ((input.radius_mean -. 1412729/100000) /. 70481/20000) <=. 17131/25000
              • ((input.compactness_mean -. 5217/50000) /. 5281/100000) <=. -4159/5000
              • not (((input.concavity_mean -. 111/1250) /. 1993/25000) <=. -2163/20000)
              • not
                (((input.concave_points_worst -. 11461/100000) /. 6573/100000) <=. 2453/25000)
              • not (((input.concavity_mean -. 111/1250) /. 1993/25000) <=. 265413/100000)
              • 5 <=. input.radius_mean
              • input.radius_mean <=. 35
              • 0 <=. input.compactness_mean
              • input.compactness_mean <=. 2/5
              • 0 <=. input.concavity_mean
              • input.concavity_mean <=. 1/2
              • 0 <=. input.radius_se
              • input.radius_se <=. 7/2
              • 0 <=. input.compactness_worst
              • input.compactness_worst <=. 6/5
              • 0 <=. input.concavity_worst
              • input.concavity_worst <=. 3/2
              • 0 <=. input.concave_points_worst
              • input.concave_points_worst <=. 7/20
              "benign"
              509
              • not (((input.concavity_mean -. 111/1250) /. 1993/25000) <=. 10459/100000)
              • ((input.concave_points_worst -. 11461/100000) /. 6573/100000) <=. 5861/12500
              • ((input.radius_se -. 40517/100000) /. 27731/100000) <=. 20821/50000
              • ((input.concavity_worst -. 27219/100000) /. 10431/50000) <=. -5799/100000
              • ((input.radius_mean -. 1412729/100000) /. 70481/20000) <=. 17131/25000
              • ((input.compactness_mean -. 5217/50000) /. 5281/100000) <=. -4159/5000
              • not (((input.concavity_mean -. 111/1250) /. 1993/25000) <=. -2163/20000)
              • not
                (((input.concave_points_worst -. 11461/100000) /. 6573/100000) <=. 2453/25000)
              • not (((input.concavity_mean -. 111/1250) /. 1993/25000) <=. 265413/100000)
              • 5 <=. input.radius_mean
              • input.radius_mean <=. 35
              • 0 <=. input.compactness_mean
              • input.compactness_mean <=. 2/5
              • 0 <=. input.concavity_mean
              • input.concavity_mean <=. 1/2
              • 0 <=. input.radius_se
              • input.radius_se <=. 7/2
              • 0 <=. input.compactness_worst
              • input.compactness_worst <=. 6/5
              • 0 <=. input.concavity_worst
              • input.concavity_worst <=. 3/2
              • 0 <=. input.concave_points_worst
              • input.concave_points_worst <=. 7/20
              "benign"
              508
              • ((input.concavity_mean -. 111/1250) /. 1993/25000) <=. 10459/100000
              • not
                (((input.concavity_worst -. 27219/100000) /. 10431/50000) <=. -7603/20000)
              • not
                (((input.concave_points_worst -. 11461/100000) /. 6573/100000) <=.
                 -7927/100000)
              • ((input.concavity_worst -. 27219/100000) /. 10431/50000) <=. -5799/100000
              • ((input.radius_mean -. 1412729/100000) /. 70481/20000) <=. 17131/25000
              • ((input.compactness_mean -. 5217/50000) /. 5281/100000) <=. -4159/5000
              • not (((input.concavity_mean -. 111/1250) /. 1993/25000) <=. -2163/20000)
              • not
                (((input.concave_points_worst -. 11461/100000) /. 6573/100000) <=. 2453/25000)
              • not (((input.concavity_mean -. 111/1250) /. 1993/25000) <=. 265413/100000)
              • 5 <=. input.radius_mean
              • input.radius_mean <=. 35
              • 0 <=. input.compactness_mean
              • input.compactness_mean <=. 2/5
              • 0 <=. input.concavity_mean
              • input.concavity_mean <=. 1/2
              • 0 <=. input.radius_se
              • input.radius_se <=. 7/2
              • 0 <=. input.compactness_worst
              • input.compactness_worst <=. 6/5
              • 0 <=. input.concavity_worst
              • input.concavity_worst <=. 3/2
              • 0 <=. input.concave_points_worst
              • input.concave_points_worst <=. 7/20
              "benign"
              507
              • ((input.concavity_mean -. 111/1250) /. 1993/25000) <=. 10459/100000
              • not
                (((input.concavity_worst -. 27219/100000) /. 10431/50000) <=. -7603/20000)
              • ((input.concave_points_worst -. 11461/100000) /. 6573/100000) <=.
                -7927/100000
              • ((input.concavity_worst -. 27219/100000) /. 10431/50000) <=. -5799/100000
              • ((input.radius_mean -. 1412729/100000) /. 70481/20000) <=. 17131/25000
              • ((input.compactness_mean -. 5217/50000) /. 5281/100000) <=. -4159/5000
              • not (((input.concavity_mean -. 111/1250) /. 1993/25000) <=. -2163/20000)
              • not
                (((input.concave_points_worst -. 11461/100000) /. 6573/100000) <=. 2453/25000)
              • not (((input.concavity_mean -. 111/1250) /. 1993/25000) <=. 265413/100000)
              • 5 <=. input.radius_mean
              • input.radius_mean <=. 35
              • 0 <=. input.compactness_mean
              • input.compactness_mean <=. 2/5
              • 0 <=. input.concavity_mean
              • input.concavity_mean <=. 1/2
              • 0 <=. input.radius_se
              • input.radius_se <=. 7/2
              • 0 <=. input.compactness_worst
              • input.compactness_worst <=. 6/5
              • 0 <=. input.concavity_worst
              • input.concavity_worst <=. 3/2
              • 0 <=. input.concave_points_worst
              • input.concave_points_worst <=. 7/20
              "benign"
              506
              • ((input.concavity_mean -. 111/1250) /. 1993/25000) <=. 10459/100000
              • ((input.concavity_worst -. 27219/100000) /. 10431/50000) <=. -7603/20000
              • not
                (((input.concavity_worst -. 27219/100000) /. 10431/50000) <=. -60659/100000)
              • ((input.concavity_worst -. 27219/100000) /. 10431/50000) <=. -5799/100000
              • ((input.radius_mean -. 1412729/100000) /. 70481/20000) <=. 17131/25000
              • ((input.compactness_mean -. 5217/50000) /. 5281/100000) <=. -4159/5000
              • not (((input.concavity_mean -. 111/1250) /. 1993/25000) <=. -2163/20000)
              • not
                (((input.concave_points_worst -. 11461/100000) /. 6573/100000) <=. 2453/25000)
              • not (((input.concavity_mean -. 111/1250) /. 1993/25000) <=. 265413/100000)
              • 5 <=. input.radius_mean
              • input.radius_mean <=. 35
              • 0 <=. input.compactness_mean
              • input.compactness_mean <=. 2/5
              • 0 <=. input.concavity_mean
              • input.concavity_mean <=. 1/2
              • 0 <=. input.radius_se
              • input.radius_se <=. 7/2
              • 0 <=. input.compactness_worst
              • input.compactness_worst <=. 6/5
              • 0 <=. input.concavity_worst
              • input.concavity_worst <=. 3/2
              • 0 <=. input.concave_points_worst
              • input.concave_points_worst <=. 7/20
              "benign"
              505
              • ((input.concavity_mean -. 111/1250) /. 1993/25000) <=. 10459/100000
              • ((input.concavity_worst -. 27219/100000) /. 10431/50000) <=. -7603/20000
              • ((input.concavity_worst -. 27219/100000) /. 10431/50000) <=. -60659/100000
              • ((input.concavity_worst -. 27219/100000) /. 10431/50000) <=. -5799/100000
              • ((input.radius_mean -. 1412729/100000) /. 70481/20000) <=. 17131/25000
              • ((input.compactness_mean -. 5217/50000) /. 5281/100000) <=. -4159/5000
              • not (((input.concavity_mean -. 111/1250) /. 1993/25000) <=. -2163/20000)
              • not
                (((input.concave_points_worst -. 11461/100000) /. 6573/100000) <=. 2453/25000)
              • not (((input.concavity_mean -. 111/1250) /. 1993/25000) <=. 265413/100000)
              • 5 <=. input.radius_mean
              • input.radius_mean <=. 35
              • 0 <=. input.compactness_mean
              • input.compactness_mean <=. 2/5
              • 0 <=. input.concavity_mean
              • input.concavity_mean <=. 1/2
              • 0 <=. input.radius_se
              • input.radius_se <=. 7/2
              • 0 <=. input.compactness_worst
              • input.compactness_worst <=. 6/5
              • 0 <=. input.concavity_worst
              • input.concavity_worst <=. 3/2
              • 0 <=. input.concave_points_worst
              • input.concave_points_worst <=. 7/20
              "benign"
              504
              • not (((input.concavity_mean -. 111/1250) /. 1993/25000) <=. 10459/100000)
              • not
                (((input.concave_points_worst -. 11461/100000) /. 6573/100000) <=. 5861/12500)
              • not (((input.concavity_mean -. 111/1250) /. 1993/25000) <=. 174327/100000)
              • ((input.concavity_worst -. 27219/100000) /. 10431/50000) <=. -5799/100000
              • ((input.radius_mean -. 1412729/100000) /. 70481/20000) <=. 17131/25000
              • not (((input.compactness_mean -. 5217/50000) /. 5281/100000) <=. -4159/5000)
              • not (((input.concavity_mean -. 111/1250) /. 1993/25000) <=. -2163/20000)
              • not
                (((input.concave_points_worst -. 11461/100000) /. 6573/100000) <=. 2453/25000)
              • not (((input.concavity_mean -. 111/1250) /. 1993/25000) <=. 265413/100000)
              • 5 <=. input.radius_mean
              • input.radius_mean <=. 35
              • 0 <=. input.compactness_mean
              • input.compactness_mean <=. 2/5
              • 0 <=. input.concavity_mean
              • input.concavity_mean <=. 1/2
              • 0 <=. input.radius_se
              • input.radius_se <=. 7/2
              • 0 <=. input.compactness_worst
              • input.compactness_worst <=. 6/5
              • 0 <=. input.concavity_worst
              • input.concavity_worst <=. 3/2
              • 0 <=. input.concave_points_worst
              • input.concave_points_worst <=. 7/20
              "benign"
              503
              • not (((input.concavity_mean -. 111/1250) /. 1993/25000) <=. 10459/100000)
              • not
                (((input.concave_points_worst -. 11461/100000) /. 6573/100000) <=. 5861/12500)
              • ((input.concavity_mean -. 111/1250) /. 1993/25000) <=. 174327/100000
              • ((input.concavity_worst -. 27219/100000) /. 10431/50000) <=. -5799/100000
              • ((input.radius_mean -. 1412729/100000) /. 70481/20000) <=. 17131/25000
              • not (((input.compactness_mean -. 5217/50000) /. 5281/100000) <=. -4159/5000)
              • not (((input.concavity_mean -. 111/1250) /. 1993/25000) <=. -2163/20000)
              • not
                (((input.concave_points_worst -. 11461/100000) /. 6573/100000) <=. 2453/25000)
              • not (((input.concavity_mean -. 111/1250) /. 1993/25000) <=. 265413/100000)
              • 5 <=. input.radius_mean
              • input.radius_mean <=. 35
              • 0 <=. input.compactness_mean
              • input.compactness_mean <=. 2/5
              • 0 <=. input.concavity_mean
              • input.concavity_mean <=. 1/2
              • 0 <=. input.radius_se
              • input.radius_se <=. 7/2
              • 0 <=. input.compactness_worst
              • input.compactness_worst <=. 6/5
              • 0 <=. input.concavity_worst
              • input.concavity_worst <=. 3/2
              • 0 <=. input.concave_points_worst
              • input.concave_points_worst <=. 7/20
              "benign"
              502
              • not (((input.concavity_mean -. 111/1250) /. 1993/25000) <=. 10459/100000)
              • ((input.concave_points_worst -. 11461/100000) /. 6573/100000) <=. 5861/12500
              • not (((input.radius_se -. 40517/100000) /. 27731/100000) <=. 20821/50000)
              • ((input.concavity_worst -. 27219/100000) /. 10431/50000) <=. -5799/100000
              • ((input.radius_mean -. 1412729/100000) /. 70481/20000) <=. 17131/25000
              • not (((input.compactness_mean -. 5217/50000) /. 5281/100000) <=. -4159/5000)
              • not (((input.concavity_mean -. 111/1250) /. 1993/25000) <=. -2163/20000)
              • not
                (((input.concave_points_worst -. 11461/100000) /. 6573/100000) <=. 2453/25000)
              • not (((input.concavity_mean -. 111/1250) /. 1993/25000) <=. 265413/100000)
              • 5 <=. input.radius_mean
              • input.radius_mean <=. 35
              • 0 <=. input.compactness_mean
              • input.compactness_mean <=. 2/5
              • 0 <=. input.concavity_mean
              • input.concavity_mean <=. 1/2
              • 0 <=. input.radius_se
              • input.radius_se <=. 7/2
              • 0 <=. input.compactness_worst
              • input.compactness_worst <=. 6/5
              • 0 <=. input.concavity_worst
              • input.concavity_worst <=. 3/2
              • 0 <=. input.concave_points_worst
              • input.concave_points_worst <=. 7/20
              "benign"
              501
              • not (((input.concavity_mean -. 111/1250) /. 1993/25000) <=. 10459/100000)
              • ((input.concave_points_worst -. 11461/100000) /. 6573/100000) <=. 5861/12500
              • ((input.radius_se -. 40517/100000) /. 27731/100000) <=. 20821/50000
              • ((input.concavity_worst -. 27219/100000) /. 10431/50000) <=. -5799/100000
              • ((input.radius_mean -. 1412729/100000) /. 70481/20000) <=. 17131/25000
              • not (((input.compactness_mean -. 5217/50000) /. 5281/100000) <=. -4159/5000)
              • not (((input.concavity_mean -. 111/1250) /. 1993/25000) <=. -2163/20000)
              • not
                (((input.concave_points_worst -. 11461/100000) /. 6573/100000) <=. 2453/25000)
              • not (((input.concavity_mean -. 111/1250) /. 1993/25000) <=. 265413/100000)
              • 5 <=. input.radius_mean
              • input.radius_mean <=. 35
              • 0 <=. input.compactness_mean
              • input.compactness_mean <=. 2/5
              • 0 <=. input.concavity_mean
              • input.concavity_mean <=. 1/2
              • 0 <=. input.radius_se
              • input.radius_se <=. 7/2
              • 0 <=. input.compactness_worst
              • input.compactness_worst <=. 6/5
              • 0 <=. input.concavity_worst
              • input.concavity_worst <=. 3/2
              • 0 <=. input.concave_points_worst
              • input.concave_points_worst <=. 7/20
              "benign"
              500
              • ((input.concavity_mean -. 111/1250) /. 1993/25000) <=. 10459/100000
              • not
                (((input.concavity_worst -. 27219/100000) /. 10431/50000) <=. -7603/20000)
              • not
                (((input.concave_points_worst -. 11461/100000) /. 6573/100000) <=.
                 -7927/100000)
              • ((input.concavity_worst -. 27219/100000) /. 10431/50000) <=. -5799/100000
              • ((input.radius_mean -. 1412729/100000) /. 70481/20000) <=. 17131/25000
              • not (((input.compactness_mean -. 5217/50000) /. 5281/100000) <=. -4159/5000)
              • not (((input.concavity_mean -. 111/1250) /. 1993/25000) <=. -2163/20000)
              • not
                (((input.concave_points_worst -. 11461/100000) /. 6573/100000) <=. 2453/25000)
              • not (((input.concavity_mean -. 111/1250) /. 1993/25000) <=. 265413/100000)
              • 5 <=. input.radius_mean
              • input.radius_mean <=. 35
              • 0 <=. input.compactness_mean
              • input.compactness_mean <=. 2/5
              • 0 <=. input.concavity_mean
              • input.concavity_mean <=. 1/2
              • 0 <=. input.radius_se
              • input.radius_se <=. 7/2
              • 0 <=. input.compactness_worst
              • input.compactness_worst <=. 6/5
              • 0 <=. input.concavity_worst
              • input.concavity_worst <=. 3/2
              • 0 <=. input.concave_points_worst
              • input.concave_points_worst <=. 7/20
              "benign"
              499
              • ((input.concavity_mean -. 111/1250) /. 1993/25000) <=. 10459/100000
              • not
                (((input.concavity_worst -. 27219/100000) /. 10431/50000) <=. -7603/20000)
              • ((input.concave_points_worst -. 11461/100000) /. 6573/100000) <=.
                -7927/100000
              • ((input.concavity_worst -. 27219/100000) /. 10431/50000) <=. -5799/100000
              • ((input.radius_mean -. 1412729/100000) /. 70481/20000) <=. 17131/25000
              • not (((input.compactness_mean -. 5217/50000) /. 5281/100000) <=. -4159/5000)
              • not (((input.concavity_mean -. 111/1250) /. 1993/25000) <=. -2163/20000)
              • not
                (((input.concave_points_worst -. 11461/100000) /. 6573/100000) <=. 2453/25000)
              • not (((input.concavity_mean -. 111/1250) /. 1993/25000) <=. 265413/100000)
              • 5 <=. input.radius_mean
              • input.radius_mean <=. 35
              • 0 <=. input.compactness_mean
              • input.compactness_mean <=. 2/5
              • 0 <=. input.concavity_mean
              • input.concavity_mean <=. 1/2
              • 0 <=. input.radius_se
              • input.radius_se <=. 7/2
              • 0 <=. input.compactness_worst
              • input.compactness_worst <=. 6/5
              • 0 <=. input.concavity_worst
              • input.concavity_worst <=. 3/2
              • 0 <=. input.concave_points_worst
              • input.concave_points_worst <=. 7/20
              "benign"
              498
              • ((input.concavity_mean -. 111/1250) /. 1993/25000) <=. 10459/100000
              • ((input.concavity_worst -. 27219/100000) /. 10431/50000) <=. -7603/20000
              • not
                (((input.concavity_worst -. 27219/100000) /. 10431/50000) <=. -60659/100000)
              • ((input.concavity_worst -. 27219/100000) /. 10431/50000) <=. -5799/100000
              • ((input.radius_mean -. 1412729/100000) /. 70481/20000) <=. 17131/25000
              • not (((input.compactness_mean -. 5217/50000) /. 5281/100000) <=. -4159/5000)
              • not (((input.concavity_mean -. 111/1250) /. 1993/25000) <=. -2163/20000)
              • not
                (((input.concave_points_worst -. 11461/100000) /. 6573/100000) <=. 2453/25000)
              • not (((input.concavity_mean -. 111/1250) /. 1993/25000) <=. 265413/100000)
              • 5 <=. input.radius_mean
              • input.radius_mean <=. 35
              • 0 <=. input.compactness_mean
              • input.compactness_mean <=. 2/5
              • 0 <=. input.concavity_mean
              • input.concavity_mean <=. 1/2
              • 0