work in progress | array

Class ARRAY

Jim McKim's Alternative recursive specification for ARRAY


The layered specification sets are intended to give readers a map of these dependencies. In v42 the basics are `lower', `upper', and `item'. We could define every command and every (other) query strictly in terms of these and the dependency graph would be very simple indeed. But we give up checkability, at least until we allow some changes in the language to express quantifiers. Introducing "higher" level specifiers like `is_equal' and `subarray' increases the complexity of dependencies (bad) but increases the checkability (good). So like everything else in software, we're talking tradeoffs.

Still, if we can keep the number of specifiers small then we're keeping the number of dependencies small. We should only introduce a new specifier, and hence new dependencies, if we really think we've gained something by it,  usually either greater readability or greater checkability.

  Ideally, we want a design where as many features as possible are specified formally, and where there are no cycles in the specifications.

I'd add checkability, as much as we can get.

  But certain features cannot be specified formally within an Eiffel assertion. For example, 'lower' depends on the arguments to 'make' and the history of calls to 'resize' and 'force'. So there seems to be no way to make it formal. Are these features the "basic specifiers" in your terminology?

Yes, and they _are_ specified formally in v42, just not via their own postconditions. One of the obligations of the designer is to  specify the exact effect (assuming no nondeterminism) of every _command_  on every _basic_ specifier. If this is done, then given the values of  those queries now, we know both:


1. The current values of all the _other_ queries, as they're computed from the basics.

2. The precise effect the next command will have on the basics.

So in v42, it should be possible to deduce the exact value of `lower', `upper' and `item' after `make', `put', `resize', and every other command. Because of the nature of the beast, usually `lower', and `upper' are specified directly, while the values of `item' are specified indirectly, usually via `subarray' or `is_equal'.

Whenever we build a new array it's sufficient to specify the values of the basic specifiers. This is true regardless of whether we build the new array inside the ARRAY class itself (think subarray) or whether some client class needs to build an Array.

 
 I believe I can come up with a version where the basic specifiers are

lower, upper, last, minus_last

where `minus_last' is the subarray from lower..upper - 1. There _might_ be a tad less recursion needed.... and it would resolve the compatibility issue of `subarray' in v42 versus current vendors' use of the term.

The drawback is that `minus_last' is not exactly a typical ARRAY feature and so the _only_ reason for including it in the kernel would be for specification purposes.

Hey, why do things the hard way?

Ummm, it's my curse? :-)

  Instead of 'last' and 'minus_last', why not use 'first' and 'rest', where 'rest' is the subarray from lower+1..upper

You could even use the names 'head' and 'tail', which are rather readable, or you could use the names 'car' and 'cdr', which are rather readable (to Lisp programmers :-).

I'm glad you didn't pick the last pair. In theory, and probably in practice, it doesn't matter which end we base the spec from, so this is fine with me.

  There is no doubt at all that we can specify arrays in terms of this set of features. This is well-established within the field of functional programming languages!

Yup.

> 
> Here are some examples. We start with the four basic specifiers, for
> which the specification is partially or wholly informal:
> 
>    lower: INTEGER
>       -- Lowest valid index

C'est bon.

> 
>    upper: INTEGER
>       -- Highest valid index
>       ensure
>          result >= lower - 1

I'd drop the postcondition.

 a. It's not necessary as the value of `upper' will be taken care of by
the postconditions of the commands.
 b. It introduces an apparent dependency on `lower'.
 c. It's arbitrary in the sense that there's no corresponding postcondition
on `lower'.

> 
>    head: G
>       -- First element
>       require
>          upper >= lower

C'est bon.

> 
>    tail: ARRAY[G]
>       -- New array excluding 'head'
>       require
>          count > 0

I'd use  upper >= lower  again. Introducing `count' here adds a new dependency. I now have to understand `count' before I can fully understand `tail'.

>       ensure
>          result.lower = lower + 1
>          result.upper = upper

These don't really do any harm, but I'd consider dropping them for reason (a) above.

Ok, so far we have basic specifiers:

lower, upper, head, and tail

  Now we do the functions. See how simple most of these are compared to v42, especially 'has', 'all_default' and 'is_equal'!

:-). They're really only simpler in syntax, of course. But that's nothing to be sneezed at! I'm awful at keeping track of parentheses. This version makes that task a whole lot easier. Hey people, are you willing to accept `tail' as part of the kernel ARRAY class? If not, Roger and I can go play by ourselves.

[.. valid_index, item, count, is_empty, has  all nicely done.]

Roger did use `valid_index' as a precondition on `item', so we have our first level 1 specifier. Is it necessary? No, we could just use its definition and keep the dependencies down.

[..]

> 
>    index_of(v: like item): INTEGER
>       ensure
>          count = 0 implies result = upper + 1
>          count > 0 and head = v implies result = lower
>          count > 0 and head /= v implies result = tail.index_of(v)
> 

    index_of(v: like item): INTEGER
       ensure
          upper < lower implies result = upper + 1
          lower <= upper and then head = v implies result = lower
          lower <= upper and then head /= v implies result = tail.index_of(v)

I don't think the use of `count' gains us enough to justify the extra dependency. And then note the need for "and then" :-).

>    last: G
>       require
>          count > 0
>       ensure
>          result = item(upper)

Again, I'd require  upper >= lower.

> 
>    occurrences(v: G): INTEGER
>       ensure
>          count = 0 implies result = 0
>          count > 1 and head = v implies result = 1 + tail.occurrences(v)
>          count > 1 and head /= v implies result = tail.occurrences(v)

Similar comments to `index_of'.

> 
>    all_default: BOOLEAN
>       ensure
>          result = (count = 0 or else
>           (head = void or else head = head.default)
>           and tail.all_default)

Good.

> 
>    is_equal(other: like current): BOOLEAN
>       require -- from GENERAL
>          other_not_void: other /= void
>       ensure -- from GENERAL
>          consistent: standard_is_equal(other) implies result
>          same_type: result implies same_type(other)
>          symmetric: result implies other.is_equal(current)
>       ensure then
>          lower = other.lower
>           and (count = 0 or else head = other.head and tail =
> other.tail)

A few little things here. Still need a bunch of parentheses I'm afraid.

ensure then
  Result = (lower = other.lower and then upper = other.upper and then
       (upper < lower or else (head = other.head and tail.is_equal(other.tail))))

> 
>    reverse_index_of(v: G): INTEGER
>       ensure
>          count = 0 implies result = lower - 1
>          count > 0 and last = v implies result = upper
>          count > 0 and last /= v implies result = minus_last.index_of(v)

Hmm.. This adds two new specifiers, `last' and `minus_last', and accompanying  dependencies. Also, these new specifiers probably aren't going to be very useful outside of this feature. How about:

    reverse_index_of(v: G): INTEGER
       ensure
          upper < lower implies result = lower - 1
          (lower <= upper and then head /= v and then 
             tail.reverse_index_of(v) = lower) implies Result = lower - 1
          lower <= upper and then head = v implies 
             Result = tail.reverse_index_of(v)

No new dependencies, but perhaps a little too complex, so maybe:

    reverse_index_of(v: G): INTEGER
       ensure
          not has(v) implies Result = lower - 1
          has( v ) implies Result = tail.reverse_index_of(v)

If you like this then we have one new level one specifier: `has'.

  OK, so I had to use 'minus_last' after all for that one, but not as a basic specifier. It can be specified like this:

`minus_last' is probably still a reasonable query to provide; if nothing else it provides symmetry with `tail'.

> 
>    minus_last: ARRAY[G]
>       -- New array excluding 'last'
>       require
>          count > 0
>       ensure
>          result.lower = lower
>          result.upper = upper - 1
>          count >= 1 implies result.head = head and result.tail =
> tail.minus_last

Maybe this instead:

    minus_last: ARRAY[G]
       -- New array excluding 'last'
       require
          lower <= upper
       ensure
          result.lower = lower
          result.upper = upper - 1
          Result.head = head
          lower < upper implies Result.tail.is_equal( tail.minus_last )

`is_equal' becomes a level one specifier. We'd need it in commands anyway.

  So far, so good. Now to the commands. Unfortunately, some of these would get messy unless we specify 'subarray'. But we don't need to make it a basic specifier:

I agree on both counts.

> 
>    subarray(min, max: INTEGER): ARRAY [G]
>       require
>          lower <= min
>          max <= upper
>          min <= max + 1
>       ensure
>          result.lower = 1
>          result.upper = max - min + 1

OK so far.

 max > min implies result.head = item(min)

I think   max >= min implies Result.head = item( min ).  This makes `item' a level one specifier and `subarray' will therefore be at level two, but perhaps that's inevitable. You were right that you need  max > min  for the next clause, of course.

 and result.tail.is_equal(subarray(min + 1, max))

You know, I really thought it was gonna be harder than this, and I'm still not _quite_ convinced this recursion is properly anchored, but it looks pretty darn good. Lemme think about it some more to be sure.

  The specification bug (of 'subarray') that I described in version 42 is no longer present, because we are now depending upon 'tail' instead of 'subarray' to step through the recursive specification - and tail is a basic specifier.

Well, there wasn't any bug before, as I hope I've convinced you,  but I do agree this is simpler. In fact if `subarray' checks out, it's a whole lot better. Of course the question remains as to whether users will accept `tail' as part of the kernel Array.

  It turns out that no changes are needed to the specifications for any of the commands, and the V42 postconditions work just fine.

I'll double check this, but I'd expect it to be the case.

Best,
-- Jim