Theory Evaluation
theory Evaluation
imports Setup
begin
ML ‹
Isabelle_System.make_directory (File.tmp_path (Path.basic "examples"))
›
section ‹Evaluation \label{sec:evaluation}›
text ‹
Recalling \secref{sec:principle}, code generation turns a system of
equations into a program with the \emph{same} equational semantics.
As a consequence, this program can be used as a \emph{rewrite
engine} for terms: rewriting a term \<^term>‹t› using a program to a
term \<^term>‹t'› yields the theorems \<^prop>‹t ≡ t'›. This
application of code generation in the following is referred to as
\emph{evaluation}.
›
subsection ‹Evaluation techniques›
text ‹
There is a rich palette of evaluation
techniques, each comprising different aspects:
\begin{description}
\item[Expressiveness.] Depending on the extent to which symbolic
computation is possible, the class of terms which can be evaluated
can be bigger or smaller.
\item[Efficiency.] The more machine-near the technique, the
faster it is.
\item[Trustability.] Techniques which a huge (and also probably
more configurable infrastructure) are more fragile and less
trustable.
\end{description}
›
subsubsection ‹The simplifier (‹simp›)›
text ‹
The simplest way for evaluation is just using the simplifier with
the original code equations of the underlying program. This gives
fully symbolic evaluation and highest trustablity, with the usual
performance of the simplifier. Note that for operations on abstract
datatypes (cf.~\secref{sec:invariant}), the original theorems as
given by the users are used, not the modified ones.
›
subsubsection ‹Normalization by evaluation (‹nbe›)›
text ‹
Normalization by evaluation \<^cite>‹"Aehlig-Haftmann-Nipkow:2008:nbe"›
provides a comparably fast partially symbolic evaluation which
permits also normalization of functions and uninterpreted symbols;
the stack of code to be trusted is considerable.
›
subsubsection ‹Evaluation in ML (‹code›)›
text ‹
Considerable performance can be achieved using evaluation in ML, at the cost
of being restricted to ground results and a layered stack of code to
be trusted, including a user's specific code generator setup.
Evaluation is carried out in a target language \emph{Eval} which
inherits from \emph{SML} but for convenience uses parts of the
Isabelle runtime environment. Hence soundness depends crucially
on the correctness of the code generator setup; this is one of the
reasons why you should not use adaptation (see \secref{sec:adaptation})
frivolously.
›
subsection ‹Dynamic evaluation›
text ‹
Dynamic evaluation takes the code generator configuration \qt{as it
is} at the point where evaluation is issued and computes
a corresponding result. Best example is the
@{command_def value} command for ad-hoc evaluation of
terms:
›
value %quote "42 / (12 :: rat)"
text ‹
\noindent @{command value} tries first to evaluate using ML, falling
back to normalization by evaluation if this fails.
A particular technique may be specified in square brackets, e.g.
›
value %quote [nbe] "42 / (12 :: rat)"
text ‹
To employ dynamic evaluation in documents, there is also
a ‹value› antiquotation with the same evaluation techniques
as @{command value}.
›
subsubsection ‹Term reconstruction in ML›
text ‹
Results from evaluation in ML must be
turned into Isabelle's internal term representation again. Since
that setup is highly configurable, it is never assumed to be trustable.
Hence evaluation in ML provides no full term reconstruction
but distinguishes the following kinds:
\begin{description}
\item[Plain evaluation.] A term is normalized using the vanilla
term reconstruction from ML to Isabelle; this is a pragmatic
approach for applications which do not need trustability.
\item[Property conversion.] Evaluates propositions; since these
are monomorphic, the term reconstruction is fixed once and for all
and therefore trustable -- in the sense that only the regular
code generator setup has to be trusted, without relying
on term reconstruction from ML to Isabelle.
\end{description}
\noindent The different degree of trustability is also manifest in the
types of the corresponding ML functions: plain evaluation
operates on uncertified terms, whereas property conversion
operates on certified terms.
›
subsubsection ‹The partiality principle \label{sec:partiality_principle}›
text ‹
During evaluation exceptions indicating a pattern
match failure or a non-implemented function are treated specially:
as sketched in \secref{sec:partiality}, such
exceptions can be interpreted as partiality. For plain evaluation,
the result hence is optional; property conversion falls back
to reflexivity in such cases.
›
subsubsection ‹Schematic overview›
text ‹
\newcommand{\ttsize}{\fontsize{5.8pt}{8pt}\selectfont}
\fontsize{9pt}{12pt}\selectfont
\begin{tabular}{l||c|c|c}
& ‹simp› & ‹nbe› & ‹code› \tabularnewline \hline \hline
interactive evaluation & @{command value} ‹[simp]› & @{command value} ‹[nbe]› & @{command value} ‹[code]› \tabularnewline
plain evaluation & & & \ttsize\<^ML>‹Code_Evaluation.dynamic_value› \tabularnewline \hline
evaluation method & @{method code_simp} & @{method normalization} & @{method eval} \tabularnewline
property conversion & & & \ttsize\<^ML>‹Code_Runtime.dynamic_holds_conv› \tabularnewline \hline
conversion & \ttsize\<^ML>‹Code_Simp.dynamic_conv› & \ttsize\<^ML>‹Nbe.dynamic_conv›
\end{tabular}
›
subsection ‹Static evaluation›
text ‹
When implementing proof procedures using evaluation,
in most cases the code generator setup is appropriate
\qt{as it was} when the proof procedure was written in ML,
not an arbitrary later potentially modified setup. This is
called static evaluation.
›
subsubsection ‹Static evaluation using ‹simp› and ‹nbe››
text ‹
For ‹simp› and ‹nbe› static evaluation can be achieved using
\<^ML>‹Code_Simp.static_conv› and \<^ML>‹Nbe.static_conv›.
Note that \<^ML>‹Nbe.static_conv› by its very nature
requires an invocation of the ML compiler for every call,
which can produce significant overhead.
›
subsubsection ‹Intimate connection between logic and system runtime›
text ‹
Static evaluation for ‹eval› operates differently --
the required generated code is inserted directly into an ML
block using antiquotations. The idea is that generated
code performing static evaluation (called a ∗‹computation›)
is compiled once and for all such that later calls do not
require any invocation of the code generator or the ML
compiler at all. This topic deserves a dedicated chapter
\secref{sec:computations}.
›
end