Description of Benchmark:
This task tests the ability to count. The goal is to predict every possible next symbol of a string given the symbols seen so far. The strings are drawn from a context-sensitive language (CSL) of the form a^n b^n c^n S, where S is the “end of string” symbol. The lexicon therefore consists of 4 symbols, each encoded as a 4-bit vector with exactly one nonzero bit (one-hot encoding). One training set of 1000 strings is generated from this grammar, with n chosen in the interval [0, 30]. Performance is evaluated on 4 test sets of 1000 strings each, generated from the same grammar with n chosen in the intervals [0, 30], [31, 80], [81, 400] and [401, 1000], respectively.
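The data generation described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the symbol order in SYMBOLS, the helper names (make_string, one_hot, make_set) and the uniform sampling of n are not specified in the task description and are chosen here for concreteness.

```python
import random

# Assumed ordering of the 4-symbol lexicon; the description does not fix one.
SYMBOLS = ["a", "b", "c", "S"]

# Intervals for n taken from the task description.
TRAIN_RANGE = (0, 30)
TEST_RANGES = [(0, 30), (31, 80), (81, 400), (401, 1000)]


def make_string(n):
    """Return the string a^n b^n c^n S as a list of symbols."""
    return ["a"] * n + ["b"] * n + ["c"] * n + ["S"]


def one_hot(symbol):
    """Encode a symbol in 4 bits with exactly one nonzero bit."""
    return [1 if s == symbol else 0 for s in SYMBOLS]


def make_set(lo, hi, size=1000, seed=0):
    """Generate `size` strings with n drawn from [lo, hi].

    Uniform sampling is an assumption; the description only says
    that n is chosen in the interval.
    """
    rng = random.Random(seed)
    return [make_string(rng.randint(lo, hi)) for _ in range(size)]


train_set = make_set(*TRAIN_RANGE)
```

The test sets could be built the same way, for example with `make_set(lo, hi, seed=k)` for each pair in TEST_RANGES.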
We use the percentage of correctly predicted strings per set as the performance metric. A string counts as correctly predicted if, for each symbol of the string, the system under test predicts all possible next symbols and none of the others.
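One way to read this metric is sketched below. It rests on assumptions that the description does not settle: the grammar is taken to allow any n >= 0 when deciding which symbols are possible, a prediction is compared at every position from the empty prefix up to the symbol before S, and `predict` is a hypothetical callable that maps a prefix to a set of symbols. The helper names are illustrative only.

```python
def possible_next(prefix):
    """Symbols that may follow `prefix`, a valid prefix of a^n b^n c^n S.

    Assumes any n >= 0 is allowed by the grammar.
    """
    if prefix and prefix[-1] == "S":
        return set()  # nothing follows the end-of-string symbol
    na, nb, nc = prefix.count("a"), prefix.count("b"), prefix.count("c")
    if nb == 0 and nc == 0:
        # Still in the a-block: n is not yet known.
        return {"a", "S"} if na == 0 else {"a", "b"}
    if nc == 0:
        # In the b-block: the count of a's fixes what comes next.
        return {"b"} if nb < na else {"c"}
    # In the c-block.
    return {"c"} if nc < na else {"S"}


def string_correct(string, predict):
    """True if the predicted set matches the target set at every position."""
    return all(
        predict(string[:i]) == possible_next(string[:i])
        for i in range(len(string))
    )


def percent_correct(strings, predict):
    """Percentage of strings predicted correctly at every position."""
    hits = sum(string_correct(s, predict) for s in strings)
    return 100.0 * hits / len(strings)
```

A predictor that returns exactly `possible_next(prefix)` scores 100 on any set, while a single wrong or missing symbol at any position makes the whole string count as incorrect. The prefix slicing makes this quadratic in the string length; it is written for clarity rather than speed.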