
Re: straw proposal: heuristic info from procedures



   Date: Mon, 06 May 1996 14:31:28 +0000
   From: William D Clinger <will@ccs.neu.edu>

   What follows is a straw proposal for extracting heuristic information
   from procedures and their environments for the benefit of debuggers
   and other uses that should not be allowed to interfere with compiler
   optimizations or the requirements of embedded systems.

   I would call attention to the following:

     1.  We need a document that is separate from the R*RS series
	 to describe this and other interfaces for things that are
	 likely to be fairly implementation-dependent in their
	 behavior.

     2.  Most implementations are already capable of supplying most
	 of this information, although the details vary, so this is
	 essentially a standard interface to capabilities that already
	 exist.

SLIB is designed for this purpose.  Your first set of procedures
(Extracting heuristic information from procedures) offers a solution
to the problem of generating useful, uniform error messages for the
dozen implementations supporting SLIB -- something currently lacking.

The second set of procedures (Extracting heuristic information from
environments) perhaps offers a uniform platform from which PSD
(Portable Scheme Debugger) could operate with much better performance,
and perhaps without the necessity of instrumenting each procedure.

     3.  A null implementation satisfies the specification below.

     4.  Programmers can request a non-null implementation by some
	 conventional means that is orthogonal to this proposal.

   I like Aubrey Jaffer's suggestion that documentation strings be
   used to mark specific procedures for which a non-null implementation
   of this stuff should be provided, but I think a more general
   declaration facility is needed also.  I think it is useful to
   separate the interface itself from the means used to declare that
   this kind of information is needed because:

     *  There will be implementations for which this kind of
	information can always be provided.

     *  There will also be implementations for embedded systems
	in which this kind of information is never needed.  Consider,
	for example, an implementation of Scheme that is used only as
	an UNCOL for a multi-language compiler.  Debugging information
	should be made available in terms of the source program and
	its variables, not in terms of the compiled (that is, Scheme)
	code.

     *  It is easier to reach agreement on separate proposals than
	on proposals that try to solve several problems at once.

Of course this also depends on what develops from item 1.

   Extracting heuristic information from procedures.
 ....

   (procedure-name _proc_)

     Returns a name of _proc_ as a symbol or string, or returns
     #F if no name is associated with _proc_.

It might be useful to return lists of names for internal procedures.
That way, an internal name that shadows a top-level defined procedure
would cause less confusion:

(define (foo a)
  (define car (lambda (b) (+ a 3)))
  car)

(procedure-name (foo 'ffo))	==>    (foo car)
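
For error messages, a caller would then need to cope with every shape
of result: #F, a symbol, a string, or (under the suggestion above) a
list of names.  A minimal sketch, assuming that list form;
NAME->STRING and the " > " separator are hypothetical and not part of
the proposal:

```scheme
;; Hypothetical helper: turn whatever PROCEDURE-NAME returned into a
;; string suitable for an error message.
(define (name->string name)
  (cond ((not name) "<anonymous>")            ; no name available
        ((symbol? name) (symbol->string name))
        ((string? name) name)
        ((pair? name)                         ; outermost name first
         (let loop ((names (cdr name))
                    (acc (name->string (car name))))
           (if (null? names)
               acc
               (loop (cdr names)
                     (string-append acc " > "
                                    (name->string (car names)))))))
        (else "<unknown>")))
```

With a null implementation every call simply yields "<anonymous>", so
code written this way keeps working whether or not names are available.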

   (procedure-source-file _proc_)

     Returns the name of the file that contains the source code for
     _proc_, as a string, or returns #F if this information is not
     available.

   (procedure-source-position _proc_)

     Returns an exact integer that specifies the number of characters
     that precede the opening parenthesis of the source code for _proc_
     within the source file returned by PROCEDURE-SOURCE-FILE, or
     returns #F if this information is not available.

In my example above, would the source-position of FOO and (FOO 3) be
the same?

The only (typical) error-message information missing from your
proposal is position in terms of line-number of the source-file.

I realize that count of characters is cleaner from a theoretical
standpoint, but practical difficulties arise:

* Under MS-DOS, files can be opened in either text or binary mode.
  Each line break (CR LF) is read as one character in text mode but
  as two in binary mode, so a character count depends on the mode
  and drifts by one more at every line.  The number of lines is the
  same in either mode.

* Tools like emacs and PSD use line-number.
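
The MS-DOS point can be illustrated with a small sketch.  Both helpers
below are hypothetical, and the two strings merely stand in for the
same two-line file as read in text mode and in binary mode; the sketch
also assumes #\newline is the linefeed character:

```scheme
;; Hypothetical helper: 1-based line number of the character at OFFSET,
;; found by counting the newlines that precede it in TEXT.
(define (offset->line-number text offset)
  (let loop ((i 0) (line 1))
    (cond ((or (>= i offset) (>= i (string-length text))) line)
          ((char=? (string-ref text i) #\newline)
           (loop (+ i 1) (+ line 1)))
          (else (loop (+ i 1) line)))))

;; Hypothetical helper: offset of the first occurrence of character C
;; in TEXT, or #f if C does not occur.
(define (char-offset text c)
  (let loop ((i 0))
    (cond ((>= i (string-length text)) #f)
          ((char=? (string-ref text i) c) i)
          (else (loop (+ i 1))))))

;; The same file contents with LF line breaks (text mode) and with
;; CR LF line breaks (binary mode).  ASCII 13 is carriage return.
(define cr (string (integer->char 13)))
(define nl (string #\newline))
(define text-mode   (string-append "; comment" nl "(define x 1)"))
(define binary-mode (string-append "; comment" cr nl "(define x 1)"))

;; Character position of the opening parenthesis in each reading.
(define text-offset   (char-offset text-mode #\())
(define binary-offset (char-offset binary-mode #\())
```

Here TEXT-OFFSET is 10 and BINARY-OFFSET is 11, yet
OFFSET->LINE-NUMBER gives 2 for both, which is the sense in which a
line number is the more stable of the two.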

   (procedure-code _proc_)

     Returns the source code for _proc_ as a lambda expression, in the
     traditional representation as a list, or returns #F if no source
     code is available for _proc_.

Would `procedure-expression' be a more precise name?
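
Regarding item 3: for the four procedure-level operations quoted in
this message, a null implementation could be as small as the sketch
below.  It assumes nothing beyond the names and the #F convention
given in the proposal, and leaves out the environment operations,
which are not quoted here:

```scheme
;; Null implementation: every query reports "not available".
(define (procedure-name proc) #f)
(define (procedure-source-file proc) #f)
(define (procedure-source-position proc) #f)
(define (procedure-code proc) #f)
```

Any caller that already checks for #F, as the specification requires,
would run unchanged against these definitions.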