Bitwise optimizations [of cellular automata]

The C program above stores each cell value in a separate element of an integer array. But since every value must be either 0 or 1, it can in fact be encoded by just a single bit. And since integer variables in practical computers typically involve 32 or 64 bits, the values of many cells can be packed into a single integer variable. The main point of this is that typical machine instructions operate in parallel on all the bits in such a variable. And thus for example the values of all cells represented by an integer variable a can be updated in parallel according to rule 30 by the single C statement

a = a>>1 ^ (a | a<<1);

This statement, however, will only update the specific block of cells encoded in a. Gluing together updates to a sequence of such blocks requires slightly intricate code. (It is much easier to implement in Mathematica—as discussed above—since there functions like BitXor can operate on integers of any length.) In general, bitwise optimizations require representing cellular automaton rules not by simple look-up tables but rather by Boolean expressions, which must be derived for each rule and can be quite complicated (see page 869). Applying the rules can however be made faster by using bitslicing to avoid shift operations. The idea is to store the cellular automaton configuration in, say, m variables w[i] whose bits correspond respectively to the cell values {a_{1}, a_{m+1}, a_{2m+1}, …}, {a_{2}, a_{m+2}, a_{2m+2}, …}, {a_{3}, …}, etc. This then makes the left and right neighbors of the j^{th} bit in w[i] be the j^{th} bits in w[i-1] and w[i+1]—so that for example a step of rule 30 evolution can be achieved just by w[i] = w[i-1] ^ (w[i] | w[i+1]) with no shift operations needed (except in boundary conditions on w[0] and w[m-1]). If many steps of evolution are required, it is sufficient just to pack all cell values at the beginning, and unpack them at the end.