Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Loop counters #2305

Open
rakudrama opened this issue Jun 23, 2022 · 2 comments
Open

Loop counters #2305

rakudrama opened this issue Jun 23, 2022 · 2 comments
Labels
feature Proposed language feature that solves one or more problems

Comments

@rakudrama
Copy link
Member

Loop counters

TL;DR: Some loops would be easier to write and be more efficient if the language took care of some simple iteration counting tasks for us.

This issue is an expansion of #171 (comment)

Background and Motivation

It is somewhat common to iterate over a collection but also need to know the position of the current element. We will use as a running example the task of printing a play list of songs, prefixing each song with the number of the song in the list, and printing a blank line between every 5 songs.

class Song {
  String get name;
}

Iterable<Song> songs() { ... }

Example output:

1. Particle Man
2. Perfidia
3. Summer Night City
4. Lovefool
5. Gangnam Style

6. Only Happy When It Rains
7. Crazy For You

One might count the songs as they are iterated:

int index = 0;
for (final song in songs()) {
  if (index > 0 && index % 5 == 0) print('');
  print('${index + 1}. ${song.name}');
  index++;
}

This works, but there is a hazard because index and song have different scopes. There is one index variable, but a fresh song variable for each iteration of the loop. If the printing is deferred in a closure, they behave differently:

() => '${index + 1}. ${song.name}'

The scoping hazard can be avoided by copying the index into a variable in the loop scope:

int _index = 0;
for (final song in songs()) {
  final index = _index;
  if (index > 0 && index % 5 == 0) print('');
  print('${index + 1}. ${song.name}');
  _index++;
}

One might copy the songs to a temporary list:

final list = songs().toList();
for (int index = 0; index < list.length; index++) {
  final song = list[index];
  if (index > 0 && index % 5 == 0) print('');
  print('${index + 1}. ${song.name}');
}

This solves the scoping hazard, but is wasteful in copying to a list, and a indexing list feels like a lot of work. This is a good approach when the collection is already a List and there is no chance of it being changed to an Iterable, and when iterating several parallel lists.

Another approach is to use the functional style of chaining Iterables. There are some clever tricks that are useful; one trick is for converting an Iterable into (index, element) pairs is to convert the Iterable to a List and use the asMap() view of the list:

songs().toList().asMap().forEach((index, song) {
  if (index > 0 && index % 5 == 0) print('');
  print('${index + 1}. ${song.name}');
});

The trouble with this style, and .forEach in particular, is that it is not easy to tweak the iteration in the 'body' of the loop. For example, if the specification is changed to stop after printing "Goodnight Sweetheart Goodnight", we have to do some kind of filtering step that is quite difficult.

One can sometimes mix paradigms to use a for-in loop over a functional-style iterable:

for (final entry in songs().toList().asMap().entries) {
  if (entry.key > 0 && entry.key % 5 == 0) print('');
  print('${entry.key + 1}. ${entry.value.name}');
});

In general, though, the functional style can be a bit inflexible, and can be less efficient as it needs intermediate Iterables and their Iterators.

Proposal

Each loop can track the index of the current iteration. This is available as a read-only index property of the loop. The loop is identified by its keyword for:

for (final song in songs()) {
  if (for.index > 0 && for.index % 5 == 0) print('');
  print('${for.index + 1}. ${song.name}');
}

The loop is identified by the keyword, so a while loop would have a while.index property.

final iter = songs().iterator;
while (iter.moveNext()) {
  if (while.index > 0 && while.index % 5 == 0) print('');
  print('${while.index + 1}. ${iter.current.name}');
}

The keyword identifies the innermost containing loop with that keyword. Outer loops can be selected by using the label, similar to break LABEL and continue LABEL.

L:for (final song in songs()) {
  if (L.index > 0 && L.index % 5 == 0) print('');
  print('${L.index + 1}. ${song.name}');
}

Using labels might be syntactically ambiguous.

We can drop using labels in favor of either (1) ignoring the issue of outer loops and using a work-around of copying the loop property:

Outer:
for (final group in groups) {
  for (final item in group) {
    if (Outer.first) ...
  }
}

--->

for (final group in groups) {
  final outerFirst = for.first;
  for (final item in group) {
    if (outerFirst) ...
  }
}

or (2) allow a path upwards through the loops, so for.while.first would be the first property of the innermost whiler loop that encloses the innermost for loop of the expression. (This upwards-navigation feels a bit backwards to me at the moment).

Extra loop properties

It might be convenient to have other properties of a loop.

counter

While the index is useful for zero-based logic, it is sometimes convenient to have a 1-based value.

for.counter = for.index + 1

first

Some logic, like adding a separator in join, needs only to distinguish the first iteration from the rest.

for.first = for.index == 0.

String join(Iterable elements, String separator) {
  final sb = StringBuffer();
  for (final element in elements) {
    if (!for.first) sb.write(separator);
    sb.write(element);
  }
  return sb.toString();
}

The front-end of the compiler could decide to use a boolean flag if there is only a for.first property. An optimizing back-end might be able to reduce a counter with a fixed test for first to a boolean.

last

It is possible to recognize the final iteration of a for-in loop by rotating the moveNext() call earlier.

for (final song in songs()) { ... if (for.last) ... };

behaves like

final _iter = songs().iterator;
bool _going = _iter.moveNext();
while (_going) {
  final song = _iter.current;
  _going = _iter.moveNext();
  final _last = !_going;
  
  ...  if(_last) ...
}

Scope

The loop properties are loop scoped, so each iteration has a fresh property (like variables declared in for loops). The loop properties are in scope in the bodies of loops, the expression of while and do-while loops, and in the condition and increment of for-loops

Control flow collections

A enhanced collection literals have for loops, so the loop counter should be available in positions analogous to the statement version of for loops.

children: [
    Header(),
    for (final song in songs()) ...[
      if (for.index > 0 && for.index % 5 == 0) Separator(),
      Text('${for.index + 1}. ${song.name}'),
    ],
    Footer(),
  ],

TBD: should be possible to label these loops in enhanced collections for disambiguation.

General path counting

Loop path counting can be generalized. Loops have an entry and an interior and a loop count is just the number of times the interior is reached after the reference point of the entry. Any point in the program can be counted with respect to a reference point.

We might use the syntax reference.here.index for the counter for reaching the expression here subsequent to reference. The following would print a subset of songs with their original indexes, but still grouped in fives.

for (final song in songs) {
  if (isHappySong(song)) {
    if (for.here.index > 0 && for.here.index % 5 == 0) print('');
    print('${for.counter} ${song.name}');
  }
}

General path counting allows the counting of inner loops with respect to outer loops.

L:
for (final album in collection) {
  for (final song in album) {
    if (L.here.index != 0 && L.here.index % 5 == 0) print('');
    print(song.name);
  }
}

We will probably want some convenient syntax for the reference point of the beginning of the method, perhaps here. if here is not already an in-scope variable.

A special-case starting point for counting is the beginning of the program. This could be indicated by using the static keyword. The expression static.here.index could be abbreviated to static.index.

Sequential indexes no longer require a explicit variable:

class Foo {
  static int _next = 0;
  final int hashCode = _next++ & 0x3FFFFFFF;
}

class Foo {
  final int hashCode = static.index & 0x3FFFFFFF;
}

First-use initialization could be guarded with static.first.

bool _initialized = false;
...
void init() {
  if (_initialized) return;
  _initialized = true;
  ... initialization code...
}

void init() {
  if (!static.first) return;
  ... initialization code ...
}

Related work

JavaScript examples

JavaScript's Array.prototype.forEach function passes three arguments to the
'callback' function. Since JavaScript functions ignore extra arguments,
forEach can be called with a callback that has one, two to three parameters.

arr.forEach((element) => ...);
arr.forEach((element, index) => ...);
arr.forEach((element, index, array) => ...);

The two-argument form is useful in the same ways descibed above.

The three-argument form is useful for updating the array when arr is an expression that is not a variable.

Report generators

Report generators typically have ways to describe how items of the report are separated, grouped or paginated without the user needing to implement those features.

Templating languages

Angular allows the index to be bound to a variable via index as ...:

<ul>
  <li *ngFor="let contact of contacts | async; index as i;">
    Index: {{ i }}
    <contact-card [contact]="contact"></contact-card>
  </li>
</ul>

Django automatically populates several variables, e.g. forloop.counter, forloop.first, forloop.last.
https://docs.djangoproject.com/en/4.0/ref/templates/builtins/#for

Django implements forloop.parentloop to get to the enclosing loop, e.g. forloop.parentloop.counter. This might be a reasonable alternative to using a labeled statement.

General comment on temporaries

An automatic loop counter feature is similar to many small languages features that provide a convenient and readable shorthand for something that could be achieved with a temporary variable.

By avoiding the need for a temporary variable, the programmer's flow is not interrupted by the need to move to a new location to declare the variable.

Dart has many example of small features that make the language comfortable, for example e1 ?? e2.

In the discussion on #171 some suggestions are for declaring a index variable. I'm not in favor of that, since it breaks programming flow as you have to go back and declare the index.

@rakudrama rakudrama added the feature Proposed language feature that solves one or more problems label Jun 23, 2022
@TimWhiting
Copy link

I think this is better solved by records, and would make interacting with lists / maps similar.

for (final (i, value) in list.enumerate) {
  print('index $i value $value');
}
for (final (key, value) in map.enumerate) {
  print('index $i value $value');
}

@munificent
Copy link
Member

munificent commented Jun 28, 2022

I think I'm with @TimWhiting. I agree that sometimes user want an index in addition to the element with a for-in loop. Typically, I just write that like:

var elements = songs();
for (final index = 0; i < elements.length; i++) {
  final song = elements[i];
  if (index > 0 && index % 5 == 0) print('');
  print('${index + 1}. ${song.name}');
}

Now the variables index and song both have the right scope. I have full control over the index and can even mutate in the loop body to jump around. Obviously, this requires calling [] (or .element()) on the collection and works best when it's a List and not just an Iterable. But, in practice, that's almost always true for the use cases I care about. If not, it's trivial to copy it into a List before iterating over it. (In performance critical code, of course, that isn't a good idea, but then you can just increment the index manually like your before code).

I suspect that your proposal (though I do like how it looks) would be too much machinery to add to the language relative to the number of uses it would address. Even when it does come into play, I imagine whatever we added to the language would still end up being too limited and rigid. Maybe you want it to be 1-based, or you want to skip every other element, etc.

Records, destructuring, and extension methods give us a nice way to have a single loop that binds a handful of fresh variables in each iteration and uses whatever logic a user desires to define those variables:

extension on Iterable<T> {
  Iterable<(int, T)> get indexed sync* {
    var index = 0;
    for (var e in this) {
      yield (index, e);
      index++;
    }
  }

  Iterable<(int, T)> get indexed1 sync* {
    var index = 1;
    for (var e in this) {
      yield (index, e);
      index++;
    }
  }

  Iterable<(T, T)> get pairs sync* {
    T? first;
    var hasFirst = false;
    for (var e in this) {
      if (hasFirst) {
        yield (first as T, e);
      } else {
        first = e;
      }
      hasFirst = !hasFirst;
    }
  }
}

main() {
  var letters = ['a', 'b', 'c', 'd'];

  for (var (i, e) in letters.indexed) {
    print('$i: $e');
    // 0: a
    // 1: b
    // 2: c
    // 3: d
  }
  
  for (var (i, e) in letters.indexed1) {
    print('$i: $e');
    // 1: a
    // 2: b
    // 3: c
    // 4: d
  }

  for (var (e1, e2) in letters.pairs) {
    print('$e1 - $e2');
    // a - b
    // c - d
  }
}

Etc.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Proposed language feature that solves one or more problems
Projects
None yet
Development

No branches or pull requests

3 participants