Fold-up: a source transformation program would be useful for debugging and understanding programs

Consider this long, complicated function I took from PostgreSQL. /* from includes */ #define LP_USED 0x01 /* this line pointer is being used */ #define LP_DELETE 0x02 /* item is to be deleted */ #define OverwritePageMode 0x10 /* from .c */ OffsetNumber PageAddItem(Page page, Item item, Size size, OffsetNumber offsetNumber, ItemIdFlags flags) { PageHeader phdr = (PageHeader) page; Size alignedSize; int lower; int upper; ItemId itemId; OffsetNumber limit; bool needshuffle = false; bool overwritemode = (flags & OverwritePageMode) != 0; flags &= ~OverwritePageMode; /* * Be wary about corrupted page pointers */ if (phdr->pd_lower < SizeOfPageHeaderData || phdr->pd_lower > phdr->pd_upper || phdr->pd_upper > phdr->pd_special || phdr->pd_special > BLCKSZ) elog(PANIC, "PageAddItem: corrupted page pointers: lower = %u, upper = %u, special = %u", phdr->pd_lower, phdr->pd_upper, phdr->pd_special); /* * Select offsetNumber to place the new item at */ limit = OffsetNumberNext(PageGetMaxOffsetNumber(page)); /* was offsetNumber passed in? */ if (OffsetNumberIsValid(offsetNumber)) { /* yes, check it */ if (overwritemode) { if (offsetNumber < limit) { itemId = PageGetItemId(phdr, offsetNumber); if (((*itemId).lp_flags & LP_USED) || ((*itemId).lp_len != 0)) { elog(WARNING, "PageAddItem: tried overwrite of used ItemId"); return InvalidOffsetNumber; } } } else { if (offsetNumber < limit) needshuffle = true; /* need to move existing linp's */ } } else { /* offsetNumber was not passed in, so find a free slot */ /* look for "recyclable" (unused & deallocated) ItemId */ for (offsetNumber = 1; offsetNumber < limit; offsetNumber++) { itemId = PageGetItemId(phdr, offsetNumber); if ((((*itemId).lp_flags & LP_USED) == 0) && ((*itemId).lp_len == 0)) break; } /* if no free slot, we'll put it at limit (1st open slot) */ } if (offsetNumber > limit) { elog(WARNING, "PageAddItem: specified offset after maxoff"); return InvalidOffsetNumber; } /* * Compute new lower and upper pointers for page, see if it'll fit. * * Note: do arithmetic as signed ints, to avoid mistakes if, say, * alignedSize > pd_upper. */ if (offsetNumber == limit || needshuffle) lower = phdr->pd_lower + sizeof(ItemIdData); else lower = phdr->pd_lower; alignedSize = MAXALIGN(size); upper = (int) phdr->pd_upper - (int) alignedSize; if (lower > upper) return InvalidOffsetNumber; /* * OK to insert the item. First, shuffle the existing pointers if * needed. */ itemId = PageGetItemId(phdr, offsetNumber); if (needshuffle) memmove(itemId + 1, itemId, (limit - offsetNumber) * sizeof(ItemIdData)); /* set the item pointer */ (*itemId).lp_off = upper; (*itemId).lp_len = size; (*itemId).lp_flags = flags; /* copy the item's data onto the page */ memcpy((char *) page + upper, item, size); /* adjust page header */ phdr->pd_lower = (LocationIndex) lower; phdr->pd_upper = (LocationIndex) upper; return offsetNumber; } ( Portions Copyright (c) 1996-2002, PostgreSQL Global Development Group; Portions Copyright (c) 1994, Regents of the University of California )

Now, this function can be run in a few different ways (eg: with or without "overwrite mode" - whatever that is). When a programmer is confronted with a function like this, she probably tries to understand it by looking at the main cases. I propose a program, foldup to help by allowing the programmer to specify a case she is interested in, and removing code that is not relevant to that case. In the above example, the foldup might be invoked like this:

foldup bufpage.c --assert '(flags & OverwritePageMode) == 0' \ --assert 'PageAddItem() != InvalidOffsetNumber' This would "simplify" the source code as much as possible by applying to assumptions: that overwrite mode isn't being used, and calls to the function succeed (i.e. don't fail with error code "InvalidOffsetNumber").

And the output might appear like this:

OffsetNumber PageAddItem(Page page, Item item, Size size, OffsetNumber offsetNumber, ItemIdFlags flags) { PageHeader phdr = (PageHeader) page; Size alignedSize; int lower; int upper; ItemId itemId; OffsetNumber limit; bool needshuffle = false; /* * Be wary about corrupted page pointers */ if (phdr->pd_lower < SizeOfPageHeaderData || phdr->pd_lower > phdr->pd_upper || phdr->pd_upper > phdr->pd_special || phdr->pd_special > BLCKSZ) elog(PANIC, "PageAddItem: corrupted page pointers: lower = %u, upper = %u, special = %u", phdr->pd_lower, phdr->pd_upper, phdr->pd_special); /* * Select offsetNumber to place the new item at */ limit = OffsetNumberNext(PageGetMaxOffsetNumber(page)); /* was offsetNumber passed in? */ if (OffsetNumberIsValid(offsetNumber)) { if (offsetNumber < limit) needshuffle = true; /* need to move existing linp's */ } else { /* offsetNumber was not passed in, so find a free slot */ /* look for "recyclable" (unused & deallocated) ItemId */ for (offsetNumber = 1; offsetNumber < limit; offsetNumber++) { itemId = PageGetItemId(phdr, offsetNumber); if ((((*itemId).lp_flags & LP_USED) == 0) && ((*itemId).lp_len == 0)) break; } /* if no free slot, we'll put it at limit (1st open slot) */ } /* * Compute new lower and upper pointers for page, see if it'll fit. * * Note: do arithmetic as signed ints, to avoid mistakes if, say, * alignedSize > pd_upper. */ if (offsetNumber == limit || needshuffle) lower = phdr->pd_lower + sizeof(ItemIdData); else lower = phdr->pd_lower; alignedSize = MAXALIGN(size); upper = (int) phdr->pd_upper - (int) alignedSize; /* * OK to insert the item. First, shuffle the existing pointers if * needed. */ itemId = PageGetItemId(phdr, offsetNumber); if (needshuffle) memmove(itemId + 1, itemId, (limit - offsetNumber) * sizeof(ItemIdData)); /* set the item pointer */ (*itemId).lp_off = upper; (*itemId).lp_len = size; (*itemId).lp_flags = flags; /* copy the item's data onto the page */ memcpy((char *) page + upper, item, size); /* adjust page header */ phdr->pd_lower = (LocationIndex) lower; phdr->pd_upper = (LocationIndex) upper; return offsetNumber; } I reckon this later version is a bit easier to understand now. The next assertion I'd apply might be "offsetNumber == 10".

Some thoughts on this idea:

It appears that this is equivalent to The Formal Transformation Approach to Source Code Analysis and Manipulation, by M. P. Ward. The idea is a generalization of slicing, which is discussed towards the end of the paper.


Back to my ideas page