
defensive coding principles for BMG pipeline




michael, good UROP page, and good progress so far.  some
comments:

as you work through the existing BMG pipeline components,
you'll see that there are many missed opportunities for 
defensive coding.

as one of your UROP goals, i'd like you to make that code
as "defensive" as possible.  it's actually not trivial to
define what that means in our case, because the notion of
correct output for every input is not well-defined (as it
would be, by contrast, for, say, a sorting algorithm).
but we can insist that the implementation uphold certain
principles.  here are a few.  i urge you and the rest of
the group to think about these, add to them, write them up,
and post them for comments.

first, the code should never terminate abnormally due to
illegal pointer-following or memory corruption.  that 
means it should allocate appropriately-sized strings,
avoid buffer overruns, check every malloc return for null
(with assert(), or with an explicit test, since assert()
is compiled out when NDEBUG is defined), etc.  this is all
basic stuff, but my sense is that the current code does not
have these sorts of checks sprinkled throughout as it should.
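
a minimal sketch of what such checks might look like in C.  the
names xmalloc_at, XMALLOC, copy_string and copy_bounded are
hypothetical, not taken from the existing pipeline code.  the
allocation check is an explicit test rather than a bare assert(),
so that it still fires when NDEBUG is defined:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* hypothetical helper: a malloc wrapper that never hands NULL back
   to the caller.  on failure it reports the call site and exits. */
static void *xmalloc_at(size_t n, const char *file, int line)
{
    void *p = malloc(n ? n : 1);   /* malloc(0) may legally return NULL */
    if (p == NULL) {
        fprintf(stderr, "%s:%d: out of memory (%lu bytes)\n",
                file, line, (unsigned long)n);
        exit(EXIT_FAILURE);
    }
    return p;
}
#define XMALLOC(n) xmalloc_at((n), __FILE__, __LINE__)

/* hypothetical helper: duplicate a string into a buffer sized
   exactly for it (length plus the terminating NUL). */
static char *copy_string(const char *src)
{
    size_t len;
    char *dst;

    assert(src != NULL);
    len = strlen(src);
    dst = XMALLOC(len + 1);
    memcpy(dst, src, len + 1);
    return dst;
}

/* hypothetical helper: copy into a fixed-size buffer without ever
   writing past dstsize bytes.  the result is always NUL-terminated.
   returns 0 on success, -1 if src had to be truncated. */
static int copy_bounded(char *dst, size_t dstsize, const char *src)
{
    size_t len;

    assert(dst != NULL && src != NULL && dstsize > 0);
    len = strlen(src);
    if (len >= dstsize) {
        memcpy(dst, src, dstsize - 1);
        dst[dstsize - 1] = '\0';
        return -1;
    }
    memcpy(dst, src, len + 1);
    return 0;
}
```

the caller of copy_bounded decides what a truncation means (log it,
skip the token, abort the file); the helper only guarantees that no
overrun happens.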

second, the assumptions the code is making should be clearly
stated, and checked at run-time, even if that costs us extra
cycles.  so for example there are assumptions about the dxf
token stream, etc., that are built into the current code in
an ad hoc way.  that's ok, but any deviation from those
assumptions found in the input should be reported as such,
and recovered from if possible.  there are interesting
notions of "self-checking"
we might be able to apply here; for example, writing a checker
that determines whether the number of walls, spaces, etc. 
after processing roughly matches the number of source elements
that were read in as input.
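
one way such a count checker might be sketched in C.  the struct
tally, the function tally_plausible and the ratio bounds are all
assumptions made for illustration; what "roughly matches" should
mean for real floorplans is something the group would have to
decide from data:

```c
#include <assert.h>
#include <stdio.h>

/* hypothetical tallies: the pipeline would increment these as it
   reads source elements and as it emits walls, spaces, etc. */
struct tally {
    long source_elements;   /* elements read from the input */
    long output_elements;   /* walls, spaces, etc. produced */
};

/* returns 1 if the output count lies within [lo_ratio, hi_ratio]
   times the source count.  otherwise writes a message to log and
   returns 0.  the bounds are supplied by the caller. */
static int tally_plausible(const struct tally *t,
                           double lo_ratio, double hi_ratio, FILE *log)
{
    double ratio;

    assert(t != NULL && log != NULL);
    assert(t->source_elements >= 0 && t->output_elements >= 0);
    assert(lo_ratio >= 0.0 && lo_ratio <= hi_ratio);

    if (t->source_elements == 0) {
        if (t->output_elements == 0)
            return 1;                       /* nothing in, nothing out */
        fprintf(log, "self-check: %ld outputs from empty input\n",
                t->output_elements);
        return 0;
    }

    ratio = (double)t->output_elements / (double)t->source_elements;
    if (ratio < lo_ratio || ratio > hi_ratio) {
        fprintf(log, "self-check: %ld outputs for %ld inputs "
                     "(ratio %.2f outside [%.2f, %.2f])\n",
                t->output_elements, t->source_elements,
                ratio, lo_ratio, hi_ratio);
        return 0;
    }
    return 1;
}
```

a failed check here does not prove the output is wrong; it only
flags a file that deserves a closer look.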

third, the system should produce a visual indicator of
errors, an indicator that forms part of the (ug, vrml,
iv, wk, etc.) output.  so for example a missing floorplan, or
a parse error, might cause a red bounding box to be produced
instead.  note that this requires some design effort to get
right; there must be a well-defined contract between the
script and each pipeline component about what to do in case
of errors, who logs the error, what the output will be, etc.
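
a rough sketch of one possible contract, for the vrml case only.
everything named here is hypothetical: the bmg_status codes, the
struct bbox, and the function emit_error_box_vrml.  it also assumes
VRML 2.0 syntax and that the component, not the script, writes the
placeholder geometry; the script would then only inspect the
returned status:

```c
#include <assert.h>
#include <stdio.h>

/* hypothetical status codes a component reports to the script */
enum bmg_status {
    BMG_OK = 0,
    BMG_MISSING_INPUT = 1,   /* e.g. no floorplan file found */
    BMG_PARSE_ERROR = 2      /* e.g. unexpected token in the input */
};

/* axis-aligned bounding box of the region the output should cover */
struct bbox {
    double min[3];
    double max[3];
};

/* write a stand-alone VRML 2.0 file holding one red box that spans
   bb, in place of the geometry that could not be produced.
   returns 0 on success, -1 if the write failed.
   note: a box with zero extent along an axis is not rejected here,
   but a viewer may not display it. */
static int emit_error_box_vrml(FILE *out, const struct bbox *bb,
                               enum bmg_status why)
{
    double c[3], s[3];
    int i;

    assert(out != NULL && bb != NULL && why != BMG_OK);
    for (i = 0; i < 3; i++) {
        assert(bb->min[i] <= bb->max[i]);
        c[i] = 0.5 * (bb->min[i] + bb->max[i]);   /* box center */
        s[i] = bb->max[i] - bb->min[i];           /* box extent */
    }

    if (fprintf(out,
                "#VRML V2.0 utf8\n"
                "# error placeholder, status %d\n"
                "Transform {\n"
                "  translation %g %g %g\n"
                "  children [\n"
                "    Shape {\n"
                "      appearance Appearance {\n"
                "        material Material { diffuseColor 1 0 0 }\n"
                "      }\n"
                "      geometry Box { size %g %g %g }\n"
                "    }\n"
                "  ]\n"
                "}\n",
                (int)why, c[0], c[1], c[2], s[0], s[1], s[2]) < 0)
        return -1;
    return 0;
}
```

the other output formats (ug, iv, wk) would each need their own
equivalent of this placeholder, which is part of the design effort
mentioned above.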

fourth, the system should produce stats in some repeatable
way so that we have some quantitative notion of progress,
as we discussed last week.

followups to the group list please.

seth.