An amazing feature of Imandra is the ability to reason about functional inputs and generate functional instances and counterexamples. For example we can introduce a variant type and a second-order function that evaluates its argument on each variant and computes the sum of the results.

In [1]:
type t = A | B | C
let sum = (f) => f(A) + f(B) + f(C)
Out[1]:
type t = A | B | C
val sum : (t -> Z.t) -> Z.t = <fun>

Now, let's ask Imandra to come up with an instance of a function f that returns positive values and that produces 42 when calling sum(f):

In [2]:
instance( (f) => f(A) > 0 && f(B) > 0 && sum(f) == 42 )
Out[2]:
- : (t -> int) -> bool = <fun>
module CX : sig val f : t -> Z.t end
Instance (after 0 steps, 0.019s):
 let f = function
   | C -> 40
   | _ -> 1
 
Instance
proof attempt
ground_instances0
definitions0
inductions0
search_time
0.019s
details
Expand
smt_stats
num checks1
arith assert lower3
arith pivots1
rlimit count235
mk clause3
datatype occurs check3
mk bool var11
arith assert upper1
propagations2
datatype accessor ax3
num allocs1363287985
final checks1
added eqs1
del clause3
arith eq adapter1
memory21.810000
max memory24.750000
Expand
  • start[0.019s]
      sko_f_0 A > 0 && sko_f_0 B > 0 && (sko_f_0 A + sko_f_0 B) + sko_f_0 C = 42
  • simplify

    into
    (not (sko_f_0 A <= 0) && not (sko_f_0 B <= 0))
    && (sko_f_0 A + sko_f_0 B) + sko_f_0 C = 42
    expansions
    []
    rewrite_steps
      forward_chaining
      • Sat (Some let f = function | C -> 40 | _ -> 1 )

      A natural application of this ability is to apply it to formal reasoning about random variables and probability distributions over them. In the following, we'll consider an application of this kind of reasoning to analyse the famous Monty Hall problem.

      Monty Hall problem

      In the Monty Hall problem, you are participating in a game where you are given three doors to choose from. One of the doors randomly contains a $1,000,000 prize. Let’s encode that in ReasonML — we’ll introduce the value of the prize and the variant type for the doors.

      In [3]:
      let prize = 1000000.0
      
      type door =
        | DoorA
        | DoorB
        | DoorC
      
      Out[3]:
      val prize : Q.t = 1000000
      type door = DoorA | DoorB | DoorC
      

      The game then proceeds as follows: first, you guess one of the doors, then the host opens another door — a door that, the host knows, doesn’t contain the prize. The host then offers you to change your initial guess or stay with it.

      To encode that, we’ll introduce another variant type for the player choice and gather all the values that describe a single Monty Hall trial into a scenario record.

      In [4]:
      type choice =
        | Stay
        | Swap
      
      type scenario = 
        { prize       : door
        , first_guess : door
        , choice      : choice
        }
      
      Out[4]:
      type choice = Stay | Swap
      type scenario = { prize : door; first_guess : door; choice : choice; }
      

      Probability mass functions 

      We want to treat scenarios described by the "scenario" record as random. This means, that we would like to deal with probability mass functions (PMFs) over the variables of the declared types. As a first example lets make a PMF for the doors that distribute over the three doors with equal probabilities:

      In [5]:
      let pDoorsEqual : (door => real) = (door) => Real.(1.0 / 3.0)
      
      Out[5]:
      val pDoorsEqual : door -> real = <fun>
      

      Notice that we are using the real type for numeric values in our examples. Internally, Imandra represents reals as arbitrary-precision rational numbers. The real-valued arithmetic operators are gathered in the Real module. Further, we’ll be explicitly opening the Real module to stress that we are dealing with the rational arithmetic.

      Not any function can be a probability mass function — PMFs must satisfy the so-called “measure axioms”: all the returned values must be positive and the total probability over the whole function domain must sum to one. Let’s make a predicate that checks whether a function is a valid PMF over doors.

      In [6]:
      let valid_door_pmf : (door => real) => bool =  
        (pmf) => Real.(
          (pmf(DoorA) + pmf(DoorB) + pmf(DoorC) == 1.0) &&
          (pmf(DoorA) >= 0.0) &&
          (pmf(DoorB) >= 0.0) &&
          (pmf(DoorC) >= 0.0)  
        )
      
      Out[6]:
      val valid_door_pmf : (door -> real) -> bool = <fun>
      

      To check that this predicate works we use it on our pDoorsEqual function.

      In [7]:
      valid_door_pmf(pDoorsEqual)
      
      Out[7]:
      - : bool = true
      

      The player controls the first_guess and the choice fields of the scenario record. In general, the player can also choose his actions randomly. Following the game theory nomenclature, we’ll call the probability distribution over the player choices a “strategy”. For example, here is a strategy that chooses the first guess door randomly and then stays with that choice:

      In [8]:
      let random_then_stay_strategy : (door => choice => real) =
        (door) => (choice) => Real.(
          switch(door,choice){
          | ( _ ,Stay) => 1.0 / 3.0
          | ( _ ,Swap) => 0.0 
          } 
        )
      
      Out[8]:
      val random_then_stay_strategy : door -> choice -> real = <fun>
      

      A strategy’s type is door => choice => real: it takes in which door the player chooses and which choice he makes and returns the probability with which that happens. To check that such a function is a valid probability mass function we introduce and apply a predicate for it:

      In [9]:
      let valid_strategy : (door => choice => real) => bool = 
        (strategy) => Real.({  
          (strategy(DoorA,Stay) >= 0.0) && (strategy(DoorA,Swap) >= 0.0) &&
          (strategy(DoorB,Stay) >= 0.0) && (strategy(DoorB,Swap) >= 0.0) &&
          (strategy(DoorC,Stay) >= 0.0) && (strategy(DoorC,Swap) >= 0.0) &&
          ( strategy(DoorA,Stay) + strategy(DoorA,Swap) +
            strategy(DoorB,Stay) + strategy(DoorB,Swap) +
            strategy(DoorC,Stay) + strategy(DoorC,Swap) == 1.0) 
        });
      
      Out[9]:
      val valid_strategy : (door -> choice -> real) -> bool = <fun>
      
      In [10]:
      valid_strategy(random_then_stay_strategy)
      
      Out[10]:
      - : bool = true
      

      Finally, we’ll need one last function that constructs a PMF over the whole scenario record. It shall take a door => real prize PMF, a door => choice => real player strategy and return a scenario => real PMF over the scenarios. Since we consider the host and the player choices to be independent, the total probability of the full scenario must be the product of the probabilities of the host and player choices:

      In [11]:
      let make_scenario_pmf : (door => real) => (door => choice => real) => (scenario => real) = 
        (door_pmf) => (strategy) => (s) => Real.({
          door_pmf(s.prize) * strategy(s.first_guess, s.choice)
        })
      
      Out[11]:
      val make_scenario_pmf :
        (door -> real) -> (door -> choice -> real) -> scenario -> real = <fun>
      

      Expected rewards

      Given a scenario variable, we calculate a reward that the player gets. When the first guess is equal to the prize location, then we return the prize value if the player decided to stay. Alternatively, when the first guess is not equal to the prize location, then we return the prize value if the player decided to swap.

      In [12]:
      let reward = (scenario) =>
        if( scenario.prize == scenario.first_guess ) { 
            if(scenario.choice == Stay) prize else 0.0;
        } else { 
            if(scenario.choice == Swap) prize else 0.0;
        }
      
      Out[12]:
      val reward : scenario -> Q.t = <fun>
      

      Given a probability distribution over scenarios, we want to calculate the expected value of the reward - the probability-weighted average of possible values of the reward for each outcome. $$ E[reward] = \sum_s P(s)\,reward(s) $$ We create a function that performs this averaging: it takes in a scenario PMF of type scenario => real and returns a real value for the expectation value:

      In [13]:
      let expected_reward : (scenario => real) => real = 
        (scenario_pmf) => Real.({
          let pr = (s) => scenario_pmf(s) * reward(s);
            let avg = (choice) => {
              let avg = (first_guess) => {
                pr { prize: DoorA, first_guess, choice} +
                pr { prize: DoorB, first_guess, choice} +
                pr { prize: DoorC, first_guess, choice} 
              };
            avg(DoorA) + avg(DoorB) + avg(DoorC) 
            };
          avg(Stay) + avg(Swap)
        })
      
      Out[13]:
      val expected_reward : (scenario -> real) -> real = <fun>
      

      As a final example, let’s use what we’ve built so far to calculate the expected reward for the scenario when the host hides the prize equally randomly and the player uses the random_then_stay_strategy:

      In [14]:
      expected_reward(
        make_scenario_pmf(pDoorsEqual, random_then_stay_strategy)
      )
      
      Out[14]:
      - : real = 1000000/3
      

      We get an obvious result that if we choose a random door and stay with it, then the expected reward is 1/3 of a $1,000,000. That seems reasonable and it is not very obvious that one can increase his chances of winning.

      Reasoning about Monty Hall probabilities

      Let’s ask Imandra if 1/3 of a 1,000,000$ is the best outcome for any strategy the player can employ. To do this we’ll use Imandra verify directive to verify that, for any valid strategy, the scenario with equally probable prize placement gives an expected reward less or equal than 1000000/3.

      In [15]:
      verify ( (strategy) => Real.({
        valid_strategy(strategy) ==>
        ( expected_reward (
            make_scenario_pmf(pDoorsEqual,strategy)
        ) <= 1000000.0 / 3.0 ) 
      }))
      
      Out[15]:
      - : (door -> choice -> real) -> bool = <fun>
      module CX : sig val strategy : door -> choice -> Q.t end
      
      Counterexample (after 0 steps, 0.019s):
       let strategy x_0 x_1 =
         match (x_0, x_1) with
         | (DoorA, Swap) -> (Real.mk_of_string "3/1000000")
         | (DoorB, Stay) -> (Real.mk_of_string "999997/1000000")
         | _ -> 0.
       
      
      Refuted
      proof attempt
      ground_instances0
      definitions0
      inductions0
      search_time
      0.019s
      details
      Expand
      smt_stats
      num checks1
      arith assert lower8
      arith pivots2
      rlimit count1557
      mk clause3
      datatype occurs check5
      mk bool var16
      arith assert upper1
      arith add rows2
      propagations2
      datatype accessor ax5
      num allocs1446417464
      final checks1
      added eqs1
      del clause3
      arith eq adapter1
      memory22.130000
      max memory24.750000
      Expand
      • start[0.019s]
          sko_strategy_0 DoorA Stay >=. 0
          && sko_strategy_0 DoorA Swap >=. 0
             && sko_strategy_0 DoorB Stay >=. 0
                && sko_strategy_0 DoorB Swap >=. 0
                   && sko_strategy_0 DoorC Stay >=. 0
                      && sko_strategy_0 DoorC Swap >=. 0
                         && ((((sko_strategy_0 DoorA Stay
                                +. sko_strategy_0 DoorA Swap)
                               +. sko_strategy_0 DoorB Stay)
                              +. sko_strategy_0 DoorB Swap)
                             +. sko_strategy_0 DoorC Stay)
                            +. sko_strategy_0 DoorC Swap = 1
          ==> ((((((1 /. 3 *. sko_strategy_0 DoorA Stay)
                   *. (if DoorA = DoorA then if Stay = Stay then 1000000 else 0
                       else if Stay = Swap then 1000000 else 0)
                   +. (1 /. 3 *. sko_strategy_0 DoorA Stay)
                      *. (if DoorB = DoorA then if Stay = Stay then 1000000 else 0
                          else if … then 1000000 else 0))
                  +. …)
                 +. …)
                +. …)
               +. …)
              <=. …
      • simplify

        into
        not
        ((((((sko_strategy_0 DoorA Stay >=. 0 && sko_strategy_0 DoorA Swap >=. 0)
             && sko_strategy_0 DoorB Stay >=. 0)
            && sko_strategy_0 DoorB Swap >=. 0)
           && sko_strategy_0 DoorC Stay >=. 0)
          && sko_strategy_0 DoorC Swap >=. 0)
         && ((((sko_strategy_0 DoorA Stay +. sko_strategy_0 DoorA Swap)
               +. sko_strategy_0 DoorB Stay)
              +. sko_strategy_0 DoorB Swap)
             +. sko_strategy_0 DoorC Stay)
            +. sko_strategy_0 DoorC Swap = 1)
        || (((((1000000/3 *. sko_strategy_0 DoorA Stay
                +. 1000000/3 *. sko_strategy_0 DoorB Stay)
               +. 1000000/3 *. sko_strategy_0 DoorC Stay)
              +. 2000000/3 *. sko_strategy_0 DoorA Swap)
             +. 2000000/3 *. sko_strategy_0 DoorB Swap)
            +. 2000000/3 *. sko_strategy_0 DoorC Swap)
           <=. 1000000/3
        expansions
        []
        rewrite_steps
          forward_chaining
          • Sat (Some let strategy x_0 x_1 = match (x_0, x_1) with | (DoorA, Swap) -> (Real.mk_of_string "3/1000000") | (DoorB, Stay) -> (Real.mk_of_string "999997/1000000") | _ -> 0. )

          Imandra refuted our statement and provided us with a counterexample strategy. Imandra suggests that the expected reward will be greater than 1000000/3 if the player chooses DoorA and then Swaps his choice in 0.0003% of the cases. The counterexample strategy is defined in the CX module, so let's see what is the expected reward with this strategy:

          In [16]:
          expected_reward(
            make_scenario_pmf(pDoorsEqual, CX.strategy)
          )
          
          Out[16]:
          - : real = 1000003/3
          

          The counterexample strategy that Imandra produced increased our expected winnings by 1$. This counterexample strategy employs swapping instead of staying. What would be our expected reward if we always swap?

          In [17]:
          let random_then_swap_strategy = (door : door,choice) => 
              switch(door,choice){ | (_,Swap) => Real.(1.0 / 3.0) | _ => 0.0 };
          expected_reward(
            make_scenario_pmf(pDoorsEqual,random_then_swap_strategy)
          )
          
          Out[17]:
          val random_then_swap_strategy : door -> choice -> real = <fun>
          - : real = 2000000/3
          

          By always swapping we’ve increased our chances of winning up to 2/3. Is this the best possible expected reward one might get in this game?

          In [18]:
          verify ( (strategy) => Real.({
            valid_strategy(strategy) ==>
            ( expected_reward (
                make_scenario_pmf(pDoorsEqual,strategy)
            ) <= 2000000.0 / 3.0 ) 
          }))
          
          Out[18]:
          - : (door -> choice -> real) -> bool = <fun>
          
          Proved
          proof
          ground_instances0
          definitions0
          inductions0
          search_time
          0.019s
          details
          Expand
          smt_stats
          num checks1
          arith assert lower8
          arith pivots2
          rlimit count1538
          mk clause3
          mk bool var16
          arith assert upper1
          arith add rows2
          propagations2
          conflicts1
          datatype accessor ax5
          arith conflicts1
          num allocs1530436838
          added eqs1
          arith eq adapter1
          memory25.120000
          max memory25.520000
          Expand
          • start[0.019s]
              sko_strategy_0 DoorA Stay >=. 0
              && sko_strategy_0 DoorA Swap >=. 0
                 && sko_strategy_0 DoorB Stay >=. 0
                    && sko_strategy_0 DoorB Swap >=. 0
                       && sko_strategy_0 DoorC Stay >=. 0
                          && sko_strategy_0 DoorC Swap >=. 0
                             && ((((sko_strategy_0 DoorA Stay
                                    +. sko_strategy_0 DoorA Swap)
                                   +. sko_strategy_0 DoorB Stay)
                                  +. sko_strategy_0 DoorB Swap)
                                 +. sko_strategy_0 DoorC Stay)
                                +. sko_strategy_0 DoorC Swap = 1
              ==> ((((((1 /. 3 *. sko_strategy_0 DoorA Stay)
                       *. (if DoorA = DoorA then if Stay = Stay then 1000000 else 0
                           else if Stay = Swap then 1000000 else 0)
                       +. (1 /. 3 *. sko_strategy_0 DoorA Stay)
                          *. (if DoorB = DoorA then if Stay = Stay then 1000000 else 0
                              else if … then 1000000 else 0))
                      +. …)
                     +. …)
                    +. …)
                   +. …)
                  <=. …
          • simplify

            into
            not
            ((((((sko_strategy_0 DoorA Stay >=. 0 && sko_strategy_0 DoorA Swap >=. 0)
                 && sko_strategy_0 DoorB Stay >=. 0)
                && sko_strategy_0 DoorB Swap >=. 0)
               && sko_strategy_0 DoorC Stay >=. 0)
              && sko_strategy_0 DoorC Swap >=. 0)
             && ((((sko_strategy_0 DoorA Stay +. sko_strategy_0 DoorA Swap)
                   +. sko_strategy_0 DoorB Stay)
                  +. sko_strategy_0 DoorB Swap)
                 +. sko_strategy_0 DoorC Stay)
                +. sko_strategy_0 DoorC Swap = 1)
            || (((((1000000/3 *. sko_strategy_0 DoorA Stay
                    +. 1000000/3 *. sko_strategy_0 DoorB Stay)
                   +. 1000000/3 *. sko_strategy_0 DoorC Stay)
                  +. 2000000/3 *. sko_strategy_0 DoorA Swap)
                 +. 2000000/3 *. sko_strategy_0 DoorB Swap)
                +. 2000000/3 *. sko_strategy_0 DoorC Swap)
               <=. 2000000/3
            expansions
            []
            rewrite_steps
              forward_chaining
              • unsat

                (let ((a!1 (+ (sko_strategy_0_2552 DoorA_2446 Stay_2450)
                              (sko_strategy_0_2552 DoorA_24…

              We’ve proven that the best possible expected reward in the Monty Hall game is 2/3 of a $1,000,000 in an (arguably, rather counterintuitive) result that one has to always swap his first choice to double his chances of winning.

              Suppose now that we don’t know the winning strategy, but we do know that it is possible to win in 2/3 of the cases. Can we ask Imandra to synthesise the optimal strategy? For this purpose we can use the instance directive:

              In [19]:
              instance ( (strategy) => Real.({
                valid_strategy(strategy) &&
                ( expected_reward (
                    make_scenario_pmf(pDoorsEqual,strategy)
                ) == 2000000.0 / 3.0 ) 
              }))
              
              Out[19]:
              - : (door -> choice -> real) -> bool = <fun>
              module CX : sig val strategy : door -> choice -> Q.t end
              
              Instance (after 0 steps, 0.024s):
               let strategy x_0 x_1 =
                 match (x_0, x_1) with
                 | (DoorA, Swap) -> 1.
                 | _ -> 0.
               
              
              Instance
              proof attempt
              ground_instances0
              definitions0
              inductions0
              search_time
              0.024s
              details
              Expand
              smt_stats
              num checks1
              arith assert lower8
              arith pivots2
              rlimit count1547
              mk clause6
              datatype occurs check5
              mk bool var18
              arith assert upper2
              arith add rows2
              propagations4
              datatype accessor ax5
              num allocs1621649565
              final checks1
              added eqs2
              del clause6
              arith eq adapter2
              memory28.160000
              max memory28.650000
              Expand
              • start[0.024s]
                  (sko_strategy_0 DoorA Stay >=. 0
                   && sko_strategy_0 DoorA Swap >=. 0
                      && sko_strategy_0 DoorB Stay >=. 0
                         && sko_strategy_0 DoorB Swap >=. 0
                            && sko_strategy_0 DoorC Stay >=. 0
                               && sko_strategy_0 DoorC Swap >=. 0
                                  && ((((sko_strategy_0 DoorA Stay
                                         +. sko_strategy_0 DoorA Swap)
                                        +. sko_strategy_0 DoorB Stay)
                                       +. sko_strategy_0 DoorB Swap)
                                      +. sko_strategy_0 DoorC Stay)
                                     +. sko_strategy_0 DoorC Swap = 1)
                  && (((((1 /. 3 *. sko_strategy_0 DoorA Stay)
                         *. (if DoorA = DoorA then if Stay = Stay then 1000000 else 0
                             else if Stay = Swap then 1000000 else 0)
                         +. (1 /. 3 *. sko_strategy_0 DoorA Stay)
                            *. (if DoorB = DoorA then if Stay = Stay then 1000000 else 0
                                else if … then 1000000 else 0))
                        +. …)
                       +. …)
                      +. …)
                     +. … = …
              • simplify

                into
                ((((((sko_strategy_0 DoorA Stay >=. 0 && sko_strategy_0 DoorA Swap >=. 0)
                     && sko_strategy_0 DoorB Stay >=. 0)
                    && sko_strategy_0 DoorB Swap >=. 0)
                   && sko_strategy_0 DoorC Stay >=. 0)
                  && sko_strategy_0 DoorC Swap >=. 0)
                 && ((((sko_strategy_0 DoorA Stay +. sko_strategy_0 DoorA Swap)
                       +. sko_strategy_0 DoorB Stay)
                      +. sko_strategy_0 DoorB Swap)
                     +. sko_strategy_0 DoorC Stay)
                    +. sko_strategy_0 DoorC Swap = 1)
                && ((((1000000/3 *. sko_strategy_0 DoorA Stay
                       +. 1000000/3 *. sko_strategy_0 DoorB Stay)
                      +. 1000000/3 *. sko_strategy_0 DoorC Stay)
                     +. 2000000/3 *. sko_strategy_0 DoorA Swap)
                    +. 2000000/3 *. sko_strategy_0 DoorB Swap)
                   +. 2000000/3 *. sko_strategy_0 DoorC Swap = 2000000/3
                expansions
                []
                rewrite_steps
                  forward_chaining
                  • Sat (Some let strategy x_0 x_1 = match (x_0, x_1) with | (DoorA, Swap) -> 1. | _ -> 0. )

                  We've got an example of an optimal strategy - this one doesn't randomize over doors, but suggests to always choose the DoorA and then swap.

                  Biased Monty Hall

                  Now let's apply what we've learned about the "standard" Monty Hall paradox to a more complicated game setup - suppose that we somehow know that the host is biased towards the DoorA -- the probability that the host will hide the prize behind the first door is 3 times higher than the individual probabilities for two other doors. Should one still always swap? Or, maybe, it makes more sense to always choose the more likely DoorA and stay?

                  We encode our biased host in the biased PMF over doors (also doing a sanity check that it is a valid probability distribution):

                  In [20]:
                  let pDoorsBiased = (door : door) => Real.(
                    switch (door){
                    | DoorA => 3.0 / 5.0
                    | DoorB | DoorC => 1.0 / 5.0
                    }
                  );
                  
                  valid_door_pmf(pDoorsBiased)
                  
                  Out[20]:
                  val pDoorsBiased : door -> real = <fun>
                  - : bool = true
                  

                  In the previous section, we've proven that the maximal possible expected reward in the "standard" Monty Hall problem is 2/3 of a million. Again, we ask Imandra whether we can improve on this in the biased case:

                  In [21]:
                  verify ( (strategy) => Real.({
                    valid_strategy(strategy) ==>
                    ( expected_reward (
                        make_scenario_pmf(pDoorsBiased,strategy)
                    ) <= 2000000.0 / 3.0 ) 
                  }))
                  
                  Out[21]:
                  - : (door -> choice -> real) -> bool = <fun>
                  module CX : sig val strategy : door -> choice -> Q.t end
                  
                  Counterexample (after 0 steps, 0.022s):
                   let strategy x_0 x_1 =
                     match (x_0, x_1) with
                     | (DoorA, Stay) -> (Real.mk_of_string "399997/600000")
                     | (DoorC, Swap) -> (Real.mk_of_string "200003/600000")
                     | _ -> 0.
                   
                  
                  Refuted
                  proof attempt
                  ground_instances0
                  definitions0
                  inductions0
                  search_time
                  0.022s
                  details
                  Expand
                  smt_stats
                  num checks1
                  arith assert lower8
                  arith pivots3
                  rlimit count1857
                  mk clause3
                  datatype occurs check5
                  mk bool var16
                  arith assert upper1
                  arith add rows3
                  propagations2
                  datatype accessor ax5
                  num allocs1714918070
                  final checks1
                  added eqs1
                  del clause3
                  arith eq adapter1
                  memory31.180000
                  max memory31.670000
                  Expand
                  • start[0.022s]
                      sko_strategy_0 DoorA Stay >=. 0
                      && sko_strategy_0 DoorA Swap >=. 0
                         && sko_strategy_0 DoorB Stay >=. 0
                            && sko_strategy_0 DoorB Swap >=. 0
                               && sko_strategy_0 DoorC Stay >=. 0
                                  && sko_strategy_0 DoorC Swap >=. 0
                                     && ((((sko_strategy_0 DoorA Stay
                                            +. sko_strategy_0 DoorA Swap)
                                           +. sko_strategy_0 DoorB Stay)
                                          +. sko_strategy_0 DoorB Swap)
                                         +. sko_strategy_0 DoorC Stay)
                                        +. sko_strategy_0 DoorC Swap = 1
                      ==> (((((((if DoorA = DoorA then 3 /. 5 else 1 /. 5)
                                *. sko_strategy_0 DoorA Stay)
                               *. (if DoorA = DoorA then if Stay = Stay then 1000000 else 0
                                   else if Stay = Swap then 1000000 else 0)
                               +. ((if DoorB = DoorA then 3 /. 5 else 1 /. 5)
                                   *. sko_strategy_0 DoorA Stay)
                                  *. (if … then … else …))
                              +. …)
                             +. …)
                            +. …)
                           +. …)
                          <=. …
                  • simplify

                    into
                    not
                    ((((((sko_strategy_0 DoorA Stay >=. 0 && sko_strategy_0 DoorA Swap >=. 0)
                         && sko_strategy_0 DoorB Stay >=. 0)
                        && sko_strategy_0 DoorB Swap >=. 0)
                       && sko_strategy_0 DoorC Stay >=. 0)
                      && sko_strategy_0 DoorC Swap >=. 0)
                     && ((((sko_strategy_0 DoorA Stay +. sko_strategy_0 DoorA Swap)
                           +. sko_strategy_0 DoorB Stay)
                          +. sko_strategy_0 DoorB Swap)
                         +. sko_strategy_0 DoorC Stay)
                        +. sko_strategy_0 DoorC Swap = 1)
                    || (((((600000 *. sko_strategy_0 DoorA Stay
                            +. 200000 *. sko_strategy_0 DoorB Stay)
                           +. 200000 *. sko_strategy_0 DoorC Stay)
                          +. 400000 *. sko_strategy_0 DoorA Swap)
                         +. 800000 *. sko_strategy_0 DoorB Swap)
                        +. 800000 *. sko_strategy_0 DoorC Swap)
                       <=. 2000000/3
                    expansions
                    []
                    rewrite_steps
                      forward_chaining
                      • Sat (Some let strategy x_0 x_1 = match (x_0, x_1) with | (DoorA, Stay) -> (Real.mk_of_string "399997/600000") | (DoorC, Swap) -> (Real.mk_of_string "200003/600000") | _ -> 0. )

                      Imandra suggests that we can do better than 2/3 if we use a mixed strategy of either choosing the DoorA and staying or choosing the DoorC and swapping. Let's test both of these pure strategies and see what would be the expected rewards for each:

                      In [22]:
                      let aStayStrategy = (door,choice) => 
                          switch(door,choice){ | (DoorA,Stay) => 1.0 | _ => 0.0 };
                      expected_reward(
                        make_scenario_pmf(pDoorsBiased,aStayStrategy)
                      );
                      let cSwapStrategy = (door,choice) => 
                          switch(door,choice){ | (DoorC,Swap) => 1.0 | _ => 0.0 };
                      expected_reward(
                        make_scenario_pmf(pDoorsBiased,cSwapStrategy)
                      );
                      
                      Out[22]:
                      val aStayStrategy : door -> choice -> Q.t = <fun>
                      - : real = 600000
                      val cSwapStrategy : door -> choice -> Q.t = <fun>
                      - : real = 800000
                      

                      Choosing the DoorC and then swapping increases the chances of winning up to 80%! Again, is this the best outcome one can get?

                      In [23]:
                      verify ( (strategy) => Real.({
                        valid_strategy(strategy) ==>
                        ( expected_reward (
                            make_scenario_pmf(pDoorsBiased,strategy)
                        ) <= 800000.0 ) 
                      }))
                      
                      Out[23]:
                      - : (door -> choice -> real) -> bool = <fun>
                      
                      Proved
                      proof
                      ground_instances0
                      definitions0
                      inductions0
                      search_time
                      0.021s
                      details
                      Expand
                      smt_stats
                      num checks1
                      arith assert lower8
                      arith pivots3
                      rlimit count1836
                      mk clause3
                      mk bool var16
                      arith assert upper1
                      arith add rows3
                      propagations2
                      conflicts1
                      datatype accessor ax5
                      arith conflicts1
                      num allocs1808824937
                      added eqs1
                      arith eq adapter1
                      memory34.190000
                      max memory34.590000
                      Expand
                      • start[0.021s]
                          sko_strategy_0 DoorA Stay >=. 0
                          && sko_strategy_0 DoorA Swap >=. 0
                             && sko_strategy_0 DoorB Stay >=. 0
                                && sko_strategy_0 DoorB Swap >=. 0
                                   && sko_strategy_0 DoorC Stay >=. 0
                                      && sko_strategy_0 DoorC Swap >=. 0
                                         && ((((sko_strategy_0 DoorA Stay
                                                +. sko_strategy_0 DoorA Swap)
                                               +. sko_strategy_0 DoorB Stay)
                                              +. sko_strategy_0 DoorB Swap)
                                             +. sko_strategy_0 DoorC Stay)
                                            +. sko_strategy_0 DoorC Swap = 1
                          ==> (((((((if DoorA = DoorA then 3 /. 5 else 1 /. 5)
                                    *. sko_strategy_0 DoorA Stay)
                                   *. (if DoorA = DoorA then if Stay = Stay then 1000000 else 0
                                       else if Stay = Swap then 1000000 else 0)
                                   +. ((if DoorB = DoorA then 3 /. 5 else 1 /. 5)
                                       *. sko_strategy_0 DoorA Stay)
                                      *. (if … then … else …))
                                  +. …)
                                 +. …)
                                +. …)
                               +. …)
                              <=. 800000
                      • simplify

                        into
                        not
                        ((((((sko_strategy_0 DoorA Stay >=. 0 && sko_strategy_0 DoorA Swap >=. 0)
                             && sko_strategy_0 DoorB Stay >=. 0)
                            && sko_strategy_0 DoorB Swap >=. 0)
                           && sko_strategy_0 DoorC Stay >=. 0)
                          && sko_strategy_0 DoorC Swap >=. 0)
                         && ((((sko_strategy_0 DoorA Stay +. sko_strategy_0 DoorA Swap)
                               +. sko_strategy_0 DoorB Stay)
                              +. sko_strategy_0 DoorB Swap)
                             +. sko_strategy_0 DoorC Stay)
                            +. sko_strategy_0 DoorC Swap = 1)
                        || (((((600000 *. sko_strategy_0 DoorA Stay
                                +. 200000 *. sko_strategy_0 DoorB Stay)
                               +. 200000 *. sko_strategy_0 DoorC Stay)
                              +. 400000 *. sko_strategy_0 DoorA Swap)
                             +. 800000 *. sko_strategy_0 DoorB Swap)
                            +. 800000 *. sko_strategy_0 DoorC Swap)
                           <=. 800000
                        expansions
                        []
                        rewrite_steps
                          forward_chaining
                          • unsat

                            (let ((a!1 (+ (* 600000.0 (sko_strategy_0_2662 DoorA_2446 Stay_2450))
                                          (* 200000.0 (sko…

                          And we've proven that the DoorC-Swap is the best strategy the player can employ in this setup. Intuitively the reason for that is that it is better to choose the least likely door, forcing the host to reveal more information about the prize location in the more likely doors.

                          Even more complex example

                          Using this approach one can go further and analyse even more complicated Monty-Hall-like games (for example, the prize value might depend on the door and/or player actions).

                          In [24]:
                          let pDoorsComplex = (door : door) => Real.(
                            switch (door){
                            | DoorA => 5.0 / 10.0
                            | DoorB => 3.0 / 10.0 
                            | DoorC => 2.0 / 10.0
                            }
                          );
                          
                          let reward_complex = (scenario) => {
                            let prize = switch(scenario.prize) {
                            | DoorA => 1000000.
                            | DoorB => 2000000.
                            | DoorC => 4000000.      
                            }; 
                            if( scenario.prize == scenario.first_guess ) { 
                                if(scenario.choice == Stay) prize else 0.0;
                            } else { 
                                if(scenario.choice == Swap) prize else 0.0;
                            }
                          };
                          
                          let expected_reward_complex : (scenario => real) => real = 
                            (scenario_pmf) => Real.({
                              let pr = (s) => scenario_pmf(s) * reward_complex(s);
                                let avg = (choice) => {
                                  let avg = (first_guess) => {
                                    pr { prize: DoorA, first_guess, choice} +
                                    pr { prize: DoorB, first_guess, choice} +
                                    pr { prize: DoorC, first_guess, choice} 
                                  };
                                avg(DoorA) + avg(DoorB) + avg(DoorC) 
                                };
                              avg(Stay) + avg(Swap)
                            })
                          
                          Out[24]:
                          val pDoorsComplex : door -> real = <fun>
                          val reward_complex : scenario -> Q.t = <fun>
                          val expected_reward_complex : (scenario -> real) -> real = <fun>
                          
                          In [25]:
                          verify ( (strategy) => Real.({
                            valid_strategy(strategy) ==>
                            ( expected_reward_complex (
                                make_scenario_pmf(pDoorsComplex,strategy)
                            ) <= 1400000.0 ) 
                          }))
                          
                          Out[25]:
                          - : (door -> choice -> real) -> bool = <fun>
                          
                          Proved
                          proof
                          ground_instances0
                          definitions0
                          inductions0
                          search_time
                          0.021s
                          details
                          Expand
                          smt_stats
                          num checks1
                          arith assert lower8
                          arith pivots2
                          rlimit count2238
                          mk clause3
                          mk bool var16
                          arith assert upper1
                          arith add rows2
                          propagations2
                          conflicts1
                          datatype accessor ax5
                          arith conflicts1
                          num allocs1910270770
                          added eqs1
                          arith eq adapter1
                          memory37.330000
                          max memory37.810000
                          Expand
                          • start[0.021s]
                              sko_strategy_0 DoorA Stay >=. 0
                              && sko_strategy_0 DoorA Swap >=. 0
                                 && sko_strategy_0 DoorB Stay >=. 0
                                    && sko_strategy_0 DoorB Swap >=. 0
                                       && sko_strategy_0 DoorC Stay >=. 0
                                          && sko_strategy_0 DoorC Swap >=. 0
                                             && ((((sko_strategy_0 DoorA Stay
                                                    +. sko_strategy_0 DoorA Swap)
                                                   +. sko_strategy_0 DoorB Stay)
                                                  +. sko_strategy_0 DoorB Swap)
                                                 +. sko_strategy_0 DoorC Stay)
                                                +. sko_strategy_0 DoorC Swap = 1
                              ==> (((((((if DoorA = DoorA then 5 /. 10
                                         else if DoorA = DoorB then 3 /. 10 else 2 /. 10)
                                        *. sko_strategy_0 DoorA Stay)
                                       *. (if DoorA = DoorA
                                           then
                                             if Stay = Stay
                                             then
                                               if DoorA = DoorA then 1000000
                                               else if DoorA = DoorB then 2000000 else 4000000
                                             else 0
                                           else
                                           if Stay = Swap then if DoorA = DoorA then 1000000 else …
                                           else 0)
                                       +. …)
                                      +. …)
                                     +. …)
                                    +. …)
                                   +. …)
                                  <=. 1400000
                          • simplify

                            into
                            not
                            ((((((sko_strategy_0 DoorA Stay >=. 0 && sko_strategy_0 DoorA Swap >=. 0)
                                 && sko_strategy_0 DoorB Stay >=. 0)
                                && sko_strategy_0 DoorB Swap >=. 0)
                               && sko_strategy_0 DoorC Stay >=. 0)
                              && sko_strategy_0 DoorC Swap >=. 0)
                             && ((((sko_strategy_0 DoorA Stay +. sko_strategy_0 DoorA Swap)
                                   +. sko_strategy_0 DoorB Stay)
                                  +. sko_strategy_0 DoorB Swap)
                                 +. sko_strategy_0 DoorC Stay)
                                +. sko_strategy_0 DoorC Swap = 1)
                            || (((((500000 *. sko_strategy_0 DoorA Stay
                                    +. 600000 *. sko_strategy_0 DoorB Stay)
                                   +. 800000 *. sko_strategy_0 DoorC Stay)
                                  +. 1400000 *. sko_strategy_0 DoorA Swap)
                                 +. 1300000 *. sko_strategy_0 DoorB Swap)
                                +. 1100000 *. sko_strategy_0 DoorC Swap)
                               <=. 1400000
                            expansions
                            []
                            rewrite_steps
                              forward_chaining
                              • unsat

                                (let ((a!1 (+ (sko_strategy_0_2734 DoorA_2446 Stay_2450)
                                              (sko_strategy_0_2734 DoorA_24…
                              In [26]:
                              instance ( (strategy) => Real.({
                                valid_strategy(strategy) &&
                                ( expected_reward_complex (
                                    make_scenario_pmf(pDoorsComplex,strategy)
                                ) == 1400000.0 ) 
                              }))
                              
                              Out[26]:
                              - : (door -> choice -> real) -> bool = <fun>
                              module CX : sig val strategy : door -> choice -> Q.t end
                              
                              Instance (after 0 steps, 0.023s):
                               let strategy x_0 x_1 =
                                 match (x_0, x_1) with
                                 | (DoorA, Swap) -> 1.
                                 | _ -> 0.
                               
                              
                              Instance
                              proof attempt
                              ground_instances0
                              definitions0
                              inductions0
                              search_time
                              0.023s
                              details
                              Expand
                              smt_stats
                              num checks1
                              arith assert lower8
                              arith pivots2
                              rlimit count2247
                              mk clause6
                              datatype occurs check5
                              mk bool var18
                              arith assert upper2
                              arith add rows2
                              propagations4
                              datatype accessor ax5
                              num allocs2015430140
                              final checks1
                              added eqs2
                              del clause6
                              arith eq adapter2
                              memory40.470000
                              max memory40.970000
                              Expand
                              • start[0.023s]
                                  (sko_strategy_0 DoorA Stay >=. 0
                                   && sko_strategy_0 DoorA Swap >=. 0
                                      && sko_strategy_0 DoorB Stay >=. 0
                                         && sko_strategy_0 DoorB Swap >=. 0
                                            && sko_strategy_0 DoorC Stay >=. 0
                                               && sko_strategy_0 DoorC Swap >=. 0
                                                  && ((((sko_strategy_0 DoorA Stay
                                                         +. sko_strategy_0 DoorA Swap)
                                                        +. sko_strategy_0 DoorB Stay)
                                                       +. sko_strategy_0 DoorB Swap)
                                                      +. sko_strategy_0 DoorC Stay)
                                                     +. sko_strategy_0 DoorC Swap = 1)
                                  && ((((((if DoorA = DoorA then 5 /. 10
                                           else if DoorA = DoorB then 3 /. 10 else 2 /. 10)
                                          *. sko_strategy_0 DoorA Stay)
                                         *. (if DoorA = DoorA
                                             then
                                               if Stay = Stay
                                               then
                                                 if DoorA = DoorA then 1000000
                                                 else if DoorA = DoorB then 2000000 else 4000000
                                               else 0
                                             else
                                             if Stay = Swap then if DoorA = DoorA then 1000000 else …
                                             else 0)
                                         +. …)
                                        +. …)
                                       +. …)
                                      +. …)
                                     +. … = 1400000
                              • simplify

                                into
                                ((((((sko_strategy_0 DoorA Stay >=. 0 && sko_strategy_0 DoorA Swap >=. 0)
                                     && sko_strategy_0 DoorB Stay >=. 0)
                                    && sko_strategy_0 DoorB Swap >=. 0)
                                   && sko_strategy_0 DoorC Stay >=. 0)
                                  && sko_strategy_0 DoorC Swap >=. 0)
                                 && ((((sko_strategy_0 DoorA Stay +. sko_strategy_0 DoorA Swap)
                                       +. sko_strategy_0 DoorB Stay)
                                      +. sko_strategy_0 DoorB Swap)
                                     +. sko_strategy_0 DoorC Stay)
                                    +. sko_strategy_0 DoorC Swap = 1)
                                && ((((500000 *. sko_strategy_0 DoorA Stay
                                       +. 600000 *. sko_strategy_0 DoorB Stay)
                                      +. 800000 *. sko_strategy_0 DoorC Stay)
                                     +. 1400000 *. sko_strategy_0 DoorA Swap)
                                    +. 1300000 *. sko_strategy_0 DoorB Swap)
                                   +. 1100000 *. sko_strategy_0 DoorC Swap = 1400000
                                expansions
                                []
                                rewrite_steps
                                  forward_chaining
                                  • Sat (Some let strategy x_0 x_1 = match (x_0, x_1) with | (DoorA, Swap) -> 1. | _ -> 0. )

                                  More generally, this illustrates how we can use Imandra’s ability to reason about functional values and real arithmetic to formally prove and refute statements about probability distributions. These abilities allow us to us to formally analyse many kinds of scenarios requiring the ability to reason about random variables.