Theory Evaluation

theory Evaluation
imports Setup
begin (*<*)

ML ‹
  Isabelle_System.make_directory (File.tmp_path (Path.basic "examples"))
› (*>*)

section ‹Evaluation \label{sec:evaluation}›

text ‹
  Recalling \secref{sec:principle}, code generation turns a system of
  equations into a program with the \emph{same} equational semantics.
  As a consequence, this program can be used as a \emph{rewrite
  engine} for terms: rewriting a term \<^term>‹t› using a program to a
  term \<^term>‹t'› yields the theorems \<^prop>‹t ≡ t'›.  This
  application of code generation in the following is referred to as
  \emph{evaluation}.
›


subsection ‹Evaluation techniques›

text ‹
  There is a rich palette of evaluation
  techniques, each comprising different aspects:

  \begin{description}

    \item[Expressiveness.]  Depending on the extent to which symbolic
      computation is possible, the class of terms which can be evaluated
      can be bigger or smaller.

    \item[Efficiency.]  The more machine-near the technique, the
      faster it is.

    \item[Trustability.]  Techniques which a huge (and also probably
      more configurable infrastructure) are more fragile and less
      trustable.

  \end{description}
›


subsubsection ‹The simplifier (‹simp›)›

text ‹
  The simplest way for evaluation is just using the simplifier with
  the original code equations of the underlying program.  This gives
  fully symbolic evaluation and highest trustablity, with the usual
  performance of the simplifier.  Note that for operations on abstract
  datatypes (cf.~\secref{sec:invariant}), the original theorems as
  given by the users are used, not the modified ones.
›


subsubsection ‹Normalization by evaluation (‹nbe›)›

text ‹
  Normalization by evaluation \<^cite>‹"Aehlig-Haftmann-Nipkow:2008:nbe"›
  provides a comparably fast partially symbolic evaluation which
  permits also normalization of functions and uninterpreted symbols;
  the stack of code to be trusted is considerable.
›


subsubsection ‹Evaluation in ML (‹code›)›

text ‹
  Considerable performance can be achieved using evaluation in ML, at the cost
  of being restricted to ground results and a layered stack of code to
  be trusted, including a user's specific code generator setup.

  Evaluation is carried out in a target language \emph{Eval} which
  inherits from \emph{SML} but for convenience uses parts of the
  Isabelle runtime environment.  Hence soundness depends crucially
  on the correctness of the code generator setup; this is one of the
  reasons why you should not use adaptation (see \secref{sec:adaptation})
  frivolously.
›


subsection ‹Dynamic evaluation›

text ‹
  Dynamic evaluation takes the code generator configuration \qt{as it
  is} at the point where evaluation is issued and computes
  a corresponding result.  Best example is the
  @{command_def value} command for ad-hoc evaluation of
  terms:
›

value %quote "42 / (12 :: rat)"

text ‹
  \noindent @{command value} tries first to evaluate using ML, falling
  back to normalization by evaluation if this fails.

  A particular technique may be specified in square brackets, e.g.
›

value %quote [nbe] "42 / (12 :: rat)"

text ‹
  To employ dynamic evaluation in documents, there is also
  a ‹value› antiquotation with the same evaluation techniques 
  as @{command value}.
›

  
subsubsection ‹Term reconstruction in ML›

text ‹
  Results from evaluation in ML must be
  turned into Isabelle's internal term representation again.  Since
  that setup is highly configurable, it is never assumed to be trustable. 
  Hence evaluation in ML provides no full term reconstruction
  but distinguishes the following kinds:

  \begin{description}

    \item[Plain evaluation.]  A term is normalized using the vanilla
      term reconstruction from ML to Isabelle; this is a pragmatic
      approach for applications which do not need trustability.

    \item[Property conversion.]  Evaluates propositions; since these
      are monomorphic, the term reconstruction is fixed once and for all
      and therefore trustable -- in the sense that only the regular
      code generator setup has to be trusted, without relying
      on term reconstruction from ML to Isabelle.

  \end{description}

  \noindent The different degree of trustability is also manifest in the
  types of the corresponding ML functions: plain evaluation
  operates on uncertified terms, whereas property conversion
  operates on certified terms.
›


subsubsection ‹The partiality principle \label{sec:partiality_principle}›

text ‹
  During evaluation exceptions indicating a pattern
  match failure or a non-implemented function are treated specially:
  as sketched in \secref{sec:partiality}, such
  exceptions can be interpreted as partiality.  For plain evaluation,
  the result hence is optional; property conversion falls back
  to reflexivity in such cases.
›
  

subsubsection ‹Schematic overview›

text ‹
  \newcommand{\ttsize}{\fontsize{5.8pt}{8pt}\selectfont}
  \fontsize{9pt}{12pt}\selectfont
  \begin{tabular}{l||c|c|c}
    & ‹simp› & ‹nbe› & ‹code› \tabularnewline \hline \hline
    interactive evaluation & @{command value} ‹[simp]› & @{command value} ‹[nbe]› & @{command value} ‹[code]› \tabularnewline
    plain evaluation & & & \ttsize\<^ML>‹Code_Evaluation.dynamic_value› \tabularnewline \hline
    evaluation method & @{method code_simp} & @{method normalization} & @{method eval} \tabularnewline
    property conversion & & & \ttsize\<^ML>‹Code_Runtime.dynamic_holds_conv› \tabularnewline \hline
    conversion & \ttsize\<^ML>‹Code_Simp.dynamic_conv› & \ttsize\<^ML>‹Nbe.dynamic_conv›
  \end{tabular}
›

  
subsection ‹Static evaluation›
  
text ‹
  When implementing proof procedures using evaluation,
  in most cases the code generator setup is appropriate
  \qt{as it was} when the proof procedure was written in ML,
  not an arbitrary later potentially modified setup.  This is
  called static evaluation.
›


subsubsection ‹Static evaluation using ‹simp› and ‹nbe››
  
text ‹
  For ‹simp› and ‹nbe› static evaluation can be achieved using 
  \<^ML>‹Code_Simp.static_conv› and \<^ML>‹Nbe.static_conv›.
  Note that \<^ML>‹Nbe.static_conv› by its very nature
  requires an invocation of the ML compiler for every call,
  which can produce significant overhead.
›


subsubsection ‹Intimate connection between logic and system runtime›

text ‹
  Static evaluation for ‹eval› operates differently --
  the required generated code is inserted directly into an ML
  block using antiquotations.  The idea is that generated
  code performing static evaluation (called a ∗‹computation›)
  is compiled once and for all such that later calls do not
  require any invocation of the code generator or the ML
  compiler at all.  This topic deserves a dedicated chapter
  \secref{sec:computations}.
›
  
end