.mt "The Au Dump Formatter" "Implementation Notes"
.au "Alan Ballard"
.fo @_

.hd 1 Use of the Dump Formatter
Formatted dumps are wanted in all of the following
contexts:
.al
.le
To display structures from a previously taken dump.  The dump
might be obtained within Unix or from VM.
.le
To display structures from the current system, while it continues
to operate.  This might be wanted to explore hung processes or devices
without freezing the entire system.
.le
To display structures in a frozen or dead system.
.lx
.pp
(1) or (2) can be done by incorporating formatting commands in dcon
or a separate dump printing program.
(3) would be done by incorporating them into omak.
.pp
Thus the dump formatting code needs to be incorporated
in two programs, so it is important
to be careful about the services and interfaces used.
In particular, omak is to be kept self-contained, so we can't depend on
too much of the Unix world. (This would be easier if the services
which omak does provide used the same names and calling sequences
as the dcon or C-library equivalents.) Some changes in omak and/or
dcon may be required.

.hd 1 Taking a Dump
The two alternatives for getting dumps are
.dl
.le
Let VM do it.
.le
Add a small kernel service to write a core image to a Unix file.
.lx
.pp
We've implemented the second, which we think will be useful even
if the first is also required.
.pp
The kernel service is a completely
self-contained program (it does not use the file system driver).
It begins by making a private copy of page zero, so that the
dumped system image is not perturbed in significant ways by the dumping
process.  It then searches the root disk for a file
called /dump.  If this
file exists, the service writes as many pages of memory to it as will fit.
The program will not create the file, nor expand it if it is not large
enough for all memory.
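.pp
As a rough illustration of the "as many pages as will fit" rule, the
following C sketch computes how many pages would be written.  The
4096-byte page size, the name @dumppages@, and the convention that a
negative file size means "/dump does not exist" are assumptions made for
this example; they are not details of the kernel service itself.

```c
#define PAGESIZE 4096L   /* assumed page size, for illustration only */

/*
 * Number of whole pages that could be written to the dump file.
 * membytes  - size of memory to be dumped
 * filebytes - current size of /dump, or negative if it does not exist
 *             (a hypothetical convention for this sketch)
 * The file is never created or extended, so the result is limited
 * by whichever of the two is smaller.
 */
long dumppages(long membytes, long filebytes)
{
    long mempages, filepages;

    if (filebytes < 0)
        return 0;                       /* no file: nothing is written */
    mempages = membytes / PAGESIZE;
    filepages = filebytes / PAGESIZE;   /* a partial trailing page is ignored */
    return mempages < filepages ? mempages : filepages;
}
```

For instance, a 16-page memory and a 10-page file would give a 10-page
dump under these assumptions.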
.pp
Currently, a dump is taken automatically on a panic, and may be
forced by entering PSW restart.
A command will be added to omak to invoke the service, allowing
the programmer to obtain an image of a problem situation for
later analysis.

.hd 1 What's Formatted
Andy Tucci's memo lists the structures required. There are some
he didn't mention which may also be useful.  The following
is a list of
system structures which might be wanted eventually.
"*" indicates the ones that
were requested by Amdahl; "+" indicates those that are currently
supported.
.al
.le
process stuff
.br
\ *+ process
.br
\ *+ user (may want to subset this, since it is big)
.br
\ *+ text
.le
file system stuff
.br
\ *+ inode
.br
\ \ + file
.br
\ \ + mount
.le
storage status
.br
\ *+ coretab
.br
\ \ \  pagetab
.le
device support
.br
\ *+ dasd
.br
\ *+ tube
.br
\ *+ tty
.br
\ *+ buf
.le
I/O system
.br
\ *+ ioq
.br
\ *+ unit
.br
\ *+ cu
.br
\ *+ chan
.le
accounting etc.
.br
.in +4
acct
.br
stats
.in
.le
global information
.br
.in +4
systm
.in
.le
others
.br
The following are other header files whose meaning I don't yet
know.
.in +4
.br
fblk
.br
mpx
.br
mx
.br
pack
.br
pk
.br
stat
.lx
Since both _omak_ and _dcon_ include facilities for stack traces and
register displays, they have not been included in the dump program.
Changes will be made to _dcon_ to allow these to be displayed from the
kernel dump.

.hd 1 Dump Formatter Implementation Overview
This is an expansion of the original UBC description of the dump
formatter,
incorporating further details of the pieces we've written
and where they fit into the overall design.

.hd 2 Service routines
.hd 3 Memory Access
All access to the memory being dumped is by calling one of the
routines @getblk@ or @getbytes@.
Isolating memory accesses in this way allows the remainder of
the dump routines to work without change whether used by dcon (formatting
/dev/mem or /dump) or omak.
.dp
   @int getblk(memaddr, memlen)@
   @int memlen;@
   @char *memaddr;@
.ed
retrieves @memlen@ bytes of memory, starting
at @memaddr@, and returns the address of a buffer containing
them.  It returns -1 if the requested bytes are not available.
@memlen@ should currently be less than or equal to 4096.
The omak implementation will just return @memaddr@ unchanged.  The
dcon implementation reads the required bytes into a fixed buffer and
returns its address.  Note that the use of a single buffer means that
the result of one call to @getblk@ is destroyed by the next.
.dp
   @int getbytes(memaddr, memlen)@
   @int memlen;@
   @char *memaddr;@
.ed
retrieves a 1-to-4 byte integer from location @memaddr@ and returns
it as the function value.  It does not interfere with the results of
previous calls to @getblk@.  This function is convenient for chasing
pointers through the dump.  The macro FETCH may be used to interface
to @getbytes@, supplying the second parameter automatically.
.pp
@Note@: The current implementation needs changing to handle
error conditions when accessing non-existent locations.
.pp
The memory access routines are initialized by calling
.dp
   @memop(name)@
.ed
where _name_ is the file name to be read.
.hd 3 Symbol Table Services
Currently, we have routines to initialize the symbol table and search it.
These will be replaced with the corresponding routines from Omak and
dcon.  The following are the services that may be used:
.al
.le
@SYM *lookup(sym)@
.br
@char  *sym;@
.pp
Takes a pointer to a symbol (null-terminated string) and
returns a symbol table entry for the symbol.
Null is returned if the symbol is
not defined.
.le
@symlook(sym)@
.br
@char   *sym;@
.pp
Takes a pointer to a symbol (null-terminated string)
and returns the corresponding address.
Returns -1 if the symbol is not defined.
.le
@SYM *addrlook(addr)@
.br
@int addr;@
.pp
Takes an address and returns a symbol-table entry for the closest
symbol whose address is less than or equal to the given address.
(The routine in dcon is called @vlook@ and is slightly different.
An interfacing routine or macro will be used in dcon.)
.lx
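.pp
The three services above can be sketched over a simple array as follows.
The layout of @SYM@, the table contents, and the linear searches are all
assumptions made for illustration; the routines taken from omak and dcon
may well be organized differently.

```c
#include <stddef.h>
#include <string.h>

typedef struct sym {
    const char *name;   /* symbol name */
    long addr;          /* symbol address */
} SYM;

/* Hypothetical symbol table, in no particular order. */
static SYM symtab[] = {
    { "proc",  0x1000 },
    { "inode", 0x2400 },
    { "text",  0x1800 },
};
#define NSYMS ((int)(sizeof symtab / sizeof symtab[0]))

/* Return the entry for the named symbol, or NULL if it is not defined. */
SYM *lookup(const char *sym)
{
    int i;

    for (i = 0; i < NSYMS; i++)
        if (strcmp(symtab[i].name, sym) == 0)
            return &symtab[i];
    return NULL;
}

/* Return the address of the named symbol, or -1 if it is not defined. */
long symlook(const char *sym)
{
    SYM *sp = lookup(sym);

    return sp == NULL ? -1 : sp->addr;
}

/*
 * Return the entry for the closest symbol whose address is less than
 * or equal to addr, or NULL if every symbol lies above addr.
 */
SYM *addrlook(long addr)
{
    SYM *best = NULL;
    int i;

    for (i = 0; i < NSYMS; i++)
        if (symtab[i].addr <= addr &&
            (best == NULL || symtab[i].addr > best->addr))
            best = &symtab[i];
    return best;
}
```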

.hd 2 Formatting Routines
Most of the formatting conversions required are provided by the
C library routine @printf@.  However, omak can't use this.  It has its
own formatting facility with comparable capabilities, although it
currently has no routine called @printf@.  We'll add one.
.pp
The omak version seems to have most of the formatting capabilities of
@printf@ that we're likely to need, including the d, s, x, and o
conversions, field widths,
left/right justification, and zero padding.
It also has a tabbing
option which might be useful for more tabular output, but is
not available in @printf@.  We could add a routine for use by
dcon to preprocess the formats and interpret this tab option. It would
probably be better to make the omak version recognize an actual
tab in the format string and expand it.
.pp
The program also provides and uses a routine @symf@:
.dp
   @symf(addr)@
   @int addr;@
.ed
which outputs the given address to stdout as an expression
of the form "symbol+displacement".
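.pp
A minimal sketch of how such an expression might be produced is given
below.  The helper @symexpr@, the table contents, the hexadecimal
displacement, and the fallback to a bare hexadecimal address when no
symbol precedes the address are all assumptions for the example; the
description above does not settle these points.

```c
#include <stdio.h>

struct sym {
    const char *name;
    long addr;
};

/* Hypothetical table, sorted by ascending address. */
static struct sym symtab[] = {
    { "proc", 0x1000 }, { "text", 0x1800 }, { "inode", 0x2400 },
};
#define NSYMS ((int)(sizeof symtab / sizeof symtab[0]))

/*
 * Format addr into buf (of size len) as "symbol+displacement",
 * using the last symbol whose address does not exceed addr.
 * The displacement is printed in hex (an assumption).
 */
void symexpr(char *buf, unsigned long len, long addr)
{
    int i, best = -1;

    for (i = 0; i < NSYMS && symtab[i].addr <= addr; i++)
        best = i;
    if (best < 0)
        snprintf(buf, len, "%lx", (unsigned long)addr);
    else
        snprintf(buf, len, "%s+%lx", symtab[best].name,
                 (unsigned long)(addr - symtab[best].addr));
}

/* Write the expression for addr to stdout. */
void symf(long addr)
{
    char buf[64];

    symexpr(buf, sizeof buf, addr);
    fputs(buf, stdout);
}
```

With this table, an address of 1810 (hex) would come out as text+10.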

.hd 2 Table Formatting
The table formatting is handled by
a series of routines, one for each structure.
Each routine has the form
.dp
   @routine(address)@
   @char *address;@
.ed
Each routine is responsible for obtaining (via @getblk@) the memory
to be formatted, and printing it via @printf@.
.pp
The function
.dp
   @pdump(address, ptr, len, width)@
   @int address,len,width;@
   @char *ptr;@
.ed
converts the bytes pointed at by @ptr@ into a standard hex/ascii
dump format.  @len@ specifies the number of bytes to format and
@width@ is the length of output lines to be produced.  @address@
specifies the address from which the memory was obtained, and
is used to prefix the lines printed.
.pp
This routine should possibly be made more like the
other formatting routines, by obtaining the
memory to be printed itself.  (On the other hand,
maybe it should be considered a service routine in the
_Formatting Routines_ group, above. It may also be unnecessary
when the routines are incorporated in omak/dcon.)
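.pp
For illustration, a hex/ascii dump of this general kind could be written
as below.  The exact line layout, the helper @pdumpline@, and the
reading of @width@ as the number of bytes per output line are
assumptions; the real routine may format its lines differently.

```c
#include <stdio.h>
#include <ctype.h>

/*
 * Format one dump line into out: the address, up to width bytes in
 * hex (padded so the text column lines up), then the same bytes as
 * printable ASCII with '.' for anything else.
 * Returns the number of bytes consumed from ptr.
 */
int pdumpline(char *out, long address, const unsigned char *ptr,
              int len, int width)
{
    int i, n = len < width ? len : width;

    out += sprintf(out, "%06lx ", (unsigned long)address);
    for (i = 0; i < width; i++) {
        if (i < n)
            out += sprintf(out, " %02x", (unsigned)ptr[i]);
        else
            out += sprintf(out, "   ");
    }
    out += sprintf(out, "  |");
    for (i = 0; i < n; i++)
        *out++ = isprint(ptr[i]) ? (char)ptr[i] : '.';
    *out++ = '|';
    *out = '\0';
    return n;
}

/* Print len bytes at ptr, width bytes per line, labelled from address. */
void pdump(long address, const char *ptr, int len, int width)
{
    char line[512];
    int n;

    if (width < 1 || width > 64)    /* keep each line inside the buffer */
        width = 16;
    while (len > 0) {
        n = pdumpline(line, address, (const unsigned char *)ptr, len, width);
        puts(line);
        address += n;
        ptr += n;
        len -= n;
    }
}
```

Splitting the line formatting from the printing is only a convenience
for the sketch; it also shows how the routine could be reused if output
does not go directly to stdout.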

.hd 2 System Structure Mapping
The system structure mapping component consists of routines which
interpret the parameters of the command requests, finding structures
from known addresses, or from other structures specified as part of the
command.
.pp
The major pieces of this component are
.al
.le
A table @structtable@ which contains certain information about
each of the supported structures.  This always includes the name and
the routine to be called to print it.  It
may include the name of a kernel symbol at which the structure is
located, the size of each element of an array, and a "filtering"
routine (see below).
.le
The routine @findstruct@ is the major structure-finding routine.
It distinguishes two types of structures: root structures, which
may be located by looking up symbols in the system map, and
derived structures, for which it may be necessary to find other
structures first.
.pp
This routine is invoked by the command interpreter. It has the ability
to step through all elements of a structure, and recursively call the
command interpreter for each element, or recursively call itself
to locate higher-level structures.
.le
A series of "filtering routines" which are used when processing
all entries of a table of structures, to determine whether an element
is in use or not.  When printing all elements of an array of structures,
only those that are in use are processed, although any element
may be displayed by an explicit request.
.lx
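.pp
The relationship between the table and the filtering routines might look
roughly like the following sketch.  The field names, the helper routines
@findent@ and @printall@, and the sample entry are hypothetical; only the
kinds of information held (name, print routine, kernel symbol, element
size, filter) are taken from the description above.

```c
#include <stdio.h>
#include <string.h>

struct structent {
    const char *name;            /* structure name used in commands */
    void (*print)(long addr);    /* routine that formats one element */
    const char *symbol;          /* kernel symbol locating it, or NULL */
    int elsize;                  /* size of one array element, or 0 */
    int (*inuse)(long addr);     /* filtering routine, or NULL */
};

/* Stand-in print routine for the example. */
static void prproc(long addr)
{
    printf("proc at %lx\n", (unsigned long)addr);
}

/* Stand-in filter: pretends that every second 64-byte slot is in use. */
static int procinuse(long addr)
{
    return (addr / 64) % 2 == 0;
}

static struct structent structtable[] = {
    { "proc", prproc, "proc", 64, procinuse },
};
#define NSTRUCTS ((int)(sizeof structtable / sizeof structtable[0]))

/* Find the table entry for a structure name, or NULL if unsupported. */
struct structent *findent(const char *name)
{
    int i;

    for (i = 0; i < NSTRUCTS; i++)
        if (strcmp(structtable[i].name, name) == 0)
            return &structtable[i];
    return NULL;
}

/*
 * Print every in-use element of an array of count elements starting
 * at base.  Elements rejected by the filter are skipped; a structure
 * with no filter has all its elements printed.
 * Returns the number of elements printed.
 */
int printall(struct structent *ep, long base, int count)
{
    int i, n = 0;
    long addr;

    for (i = 0; i < count; i++) {
        addr = base + (long)i * ep->elsize;
        if (ep->inuse != NULL && !ep->inuse(addr))
            continue;
        ep->print(addr);
        n++;
    }
    return n;
}
```

An explicit request for a single element would call the print routine
directly and bypass the filter, which matches the rule that any element
may be displayed on request.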
.hd 2 Command Interface
The commands provided by the current dump formatter program are
described in _dump_(3c).
.pp
Commands are parsed by YACC and LEX
programs which are intended to be embedded in the corresponding
components of both omak and dcon.
In particular, the same parsed-command structure is used, and the same
routines are used to evaluate expressions, etc.
.pp
The command interpretation contains the following major pieces:
.al
.le
Routine @printdump@ performs the @;print@ command.  It calls the routine
@findstruct@ with appropriate parameters to cause various structures to
be printed.
.le
Routine @printformat@ performs the @;format@ command, using @findstruct@
to find structures matching the options specified.
.le
Auxiliary routines @do\_proc@, @do\_active@, and @do\_dev@ are
used to interpret specific conditions determining which items are to
be printed.
.lx
