The Search
algorithm finds a subsequence in an array identical
to a given sequence. Its signature can be defined as:
function Search (A : T_Arr; B : T_Arr) return Option;
Search
will return an option containing the smallest valid
index J
of A
such that A (J .. J + B'Length - 1) = B
if it
exists.
We will first specify and implement Search
without using ghost
functions: we will see in the next sections that using ghost
functions to factorize specification as usual is not trivial in
this case (this is mainly due to limitations of SMT solvers).
A specification of Search
can be written as follows:
function Search
(A : T_Arr;
B : T_Arr)
return Option with
Pre => A'Last < Positive'Last,
Contract_Cases => (B'Length = 0 => not Search'Result.Exists,
A'Length < B'Length => not Search'Result.Exists,
A'Length >= B'Length and then B'Length /= 0
and then
(for some J in A'First .. A'Last - B'Length + 1 =>
A (J .. J - 1 + B'Length) = B) =>
Search'Result.Exists
and then
A (Search'Result.Value .. Search'Result.Value - 1 + B'Length) = B
and then
(if Search'Result.Value > A'First then
(for all K in A'First .. Search'Result.Value - 1 =>
A (K .. K - 1 + B'Length) /= B)),
others => not Search'Result.Exists);
- the precondition expresses
- the fact that the last index of
A
is less than the lastPositive
value, as we will useA'Last + 1
in the implementation. - the fact that
B'First
is less thanB'Last
. This precondition is necessary, otherwise the contract cases can neither be proved nor their disjointedness
- the fact that the last index of
- the postconditions are expressed through 4 contract cases:
- if
B
is empty, then there is no index returned - if
A
is shorter thanB
, then there is again no index returned - if
A
is longer thanB
and there is a subsequence inA
equal toB
then- a valid index will be returned
- the subsequence of
A
starting at this index is equal toB
- if the returned index is not the first index of
A
, then for all indexesK
ofA
less than the returned index, the subsequence of lengthB'Length
starting atK
is not equal toB
- otherwise there is no index returned
- if
function Search
(A : T_Arr;
B : T_Arr)
return Option
is
begin
if (A'Length < B'Length or else B'Length = 0) then
return (Exists => False);
end if;
for J in A'First .. A'Last + 1 - B'Length loop
if A (J .. J - 1 + B'Length) = B then
return (Exists => True, Value => J);
end if;
pragma Loop_Invariant
(for all K in A'First .. J => A (K .. K - 1 + B'Length) /= B);
end loop;
return (Exists => False);
end Search;
- the implementation is (again) classic: we range over
A
indexes trying to find a subsequence equal toB
- the loop invariants specify that the subsequence has not been
found starting from the previously tried indexes and that the
Option
result does not contain a valid index
When using GNATprove
on Search
, all the assertions are proved
except the contract case, even with a timeout of 1800s on a
powerful machine. Notice that the counterexample returned in the
case where B
is a subarray of A
is wrong:
search_wo_ghost_p.ads:17:42: medium: contract case might fail
(e.g. when A = (1 => -1, others => 0) and
A'First = 1 and A'Last = 5 and
B = (-1 => 0, -3 => 0, -5 => 0, 0 => 0, 1 => 0, 2 => -1, others => -1) and
B'First = 1 and B'Last = 3 and
K = 2147483646 and
Search'Result = (Exists => False, Value => 2147483647))
This may happen when solvers are stopped during their search because of a timeout for instance, see the article Counterexamples from Proof Failures in SPARK available here.
Why are SMT solvers unable to prove the contract cases? In fact, they do not play very well with nested quantifiers, particularly when there are existential quantifiers. Here, the use of equality over arrays implies universal quantification (two arrays are equal if and only if for the elements of each array at the same offset are equal).
In the following, we will try to use ghost functions to “hide” the quantifiers to the SMT solvers.
We will first define a predicate to represent the main property,
i.e., the fact that B
is a subsequence of A
. We will see in
the following that using this ghost function does not allow to
prove Search
with the SMT solvers used in the project.
We will define the ghost function Has_Subrange
to represent
the fact that B
is a subsequence of A
:
function Has_Subrange
(A : T_Arr;
B : T_Arr)
return Boolean is
(for some J in A'First .. A'Last + 1 - B'Length =>
A (J .. J - 1 + B'Length) = B) with
Pre => A'Length >= B'Length and then A'Last < Positive'Last,
Ghost;
We equip the function with preconditions specifying that A
is
longer than B
and that the last index of A
is not the last
Positive
value. The function is defined by an expression
directly specifying the expected property: there is a slice of
A
that is equal to B
.
Notice that the function is equivalent to the quantified
expression previously used in the specification of Search
.
The specification of Search
using Has_Subrange
is now:
function Search
(A : T_Arr;
B : T_Arr)
return Option with
Pre => A'Last < Positive'Last and then B'First <= B'Last,
Contract_Cases => (B'Length = 0 => not Search'Result.Exists,
A'Length < B'Length => not Search'Result.Exists,
A'Length >= B'Length and then B'Length /= 0
and then Has_Subrange (A, B) =>
Search'Result.Exists
and then
A (Search'Result.Value .. Search'Result.Value - 1 + B'Length) = B
and then
(if Search'Result.Value > A'First then
(not Has_Subrange
(A (A'First .. Search'Result.Value + B'Length - 2), B))),
others => not Search'Result.Exists);
The implementation of Search
is the same as previous excepting
the use of Has_Subrange
:
function Search
(A : T_Arr;
B : T_Arr)
return Option
is
begin
if (A'Length < B'Length or else B'Length = 0) then
return (Exists => False);
end if;
for J in A'First .. A'Last + 1 - B'Length loop
if A (J .. J - 1 + B'Length) = B then
return (Exists => True, Value => J);
end if;
pragma Loop_Invariant
((not Has_Subrange (A (A'First .. J + B'Length - 1), B)));
end loop;
return (Exists => False);
end Search;
Using this specification and this implementation, GNATprove
cannot prove contract cases. This is not surprising, as the ghost
function does nothing more than the previous specification and
implementation.
An usual trick to help SMT solvers is to hide the unnecessary
quantifiers in auxiliary subprograms (this has been suggested by
Yannick Moy of AdaCore). A previous version of Search
redefines
the Has_Subrange
function by using two intermediate functions
allowing to hide the quantifiers: one function to express that a
subrange of A
is equal to B
starting from a particular index
J
and another one to express the fact that a subrange of A
is
equal to B
starting from a index less than a given index. This
permit to “split” the different use of quantifiers (one function
to bound the subrange with a lower index and the other one with an
upper index). See the GPL2017
branch of Spark by Example
repository for this implementation.
Claire Dross from AdaCore suggested that the provers have
difficulties with the slices and particularly the expression J -
1 + B'Length
that is used to qualify the last index of a slice of
A
starting from J
and with a length equal to B
length. Encapsulating this expression in a function should help
the solvers, as they have a new identifier to work with.
The Last
function is simply defined as follows:
function Last
(I : Positive;
B : T_Arr)
return Natural is (I - 1 + B'Length) with
Pre => I - 1 <= Positive'Last - B'Length;
Has_Subrange
is now defined using the Last
function:
function Has_Subrange
(A : T_Arr;
B : T_Arr)
return Boolean is
(for some J in A'First .. A'Last - B'Length + 1 =>
A (J .. Last (J, B)) = B) with
Pre => A'Length > 0 and then A'Length >= B'Length
and then A'Last < Positive'Last;
The specification of Search
is now the following:
function Search
(A : T_Arr;
B : T_Arr)
return Option with
Pre => A'Last < Positive'Last,
Contract_Cases =>
(A'Length >= B'Length and then B'Length /= 0
and then Has_Subrange (A, B) =>
Search'Result.Exists
and then A (Search'Result.Value .. Last (Search'Result.Value, B)) = B
and then
(if Search'Result.Value > A'First then
(for all J in A'First .. Search'Result.Value - 1 =>
A (J .. Last (J, B)) /= B)),
others => not Search'Result.Exists);
The previously defined functions are now used in the
specification. Notice that we use slices without any
problem. Several contract cases are also gathered in the others
default case.
The implementation of Search
is the following:
function Search
(A : T_Arr;
B : T_Arr)
return Option
is
begin
if (A'Length < B'Length or else B'Length = 0) then
return (Exists => False);
end if;
for J in A'First .. A'Last + 1 - B'Length loop
if A (J .. J - 1 + B'Length) = B then
return (Exists => True, Value => J);
end if;
pragma Loop_Invariant
(for all K in A'First .. J => A (K .. Last (K, B)) /= B);
end loop;
return (Exists => False);
end Search;
Everything is now proved by GNATprove
at level 1 with a timeout
of 5s. The previous specification and implementation available in
the GPL2017
branch needed a timeout of 30s. Notice that SPARK
Discovery 2017 with CVC 1.5 and Z3 4.6.0 can also prove the
function, but with a slightly number of steps for the contract
cases. Provers are getting better and better!