My neighbor obsesses over rare candies. He'll travel to small towns with 5 and Dimes to buy the the wax lips and Mallo Cups he remembers growing up with. White people famously obsess over accumulating pirated music. I think it represents modern man's search for identity in a post-industrial society where we now have the luxury to contemplate more than just feeding our families and surviving the winter. It's intrinsically a play activity normally abandoned in young adulthood, but the affluence of our society has afforded us to become a culture of overgrown man-childs still tinkering with childhood past-times. So what do you obsessively accumulate? Oh, you're above it all, I understand. You're content with adult, intellectual pursuits... like spending Summer plowing through an endless list of fiction and memoirs (an accumulation of vague insights and good party conversation?). Perhaps you have a spiritual escape from it, and have melded a Christian upbringing with Kabbalah/ Scientology/ Pastafarianism) into a compassionate and pious life. Just be careful to not fall into Spirtual Materialism, Chögyam Trungpa's phrase describing the accumulation of spiritual detritus that simply serves to build up the ego.If this sounds negative then you've gotten the wrong idea... I LOVE accumulating crap. Coins, comics, and crummy implementations of toy algorithms. Collecting useless junk is my birthright as a child of the modern age. My favorite: The Sieve of Eratosthenes. There's so much joy in retreading a useless relic from the past. Without further ado, here goes!
Maybe you remember Eratosthenes, the 3rd Century Greek who gave us the first accurate estimate of the Earth's circumference in the 3rd century bc. Perhaps not. He also gave us a method for finding prime numbers which, up until the 1970s, was the best way to find them. The algorithm from Wikipedia is a good starting point. They have you start with a finite list of numbers from 2 (2..100, say). The first element of the list is always prime, so 2 is prime. Now remove all multiples of 2 from the list, leaving only odd numbers. Again, the first element of the remaining list is always prime, so 3 is prime. Now remove all multiples of 3 from the list. First element (5) is prime, remove multiples of 5 from list, etc. etc. Implementing this algorithm will leave you with a list where the is always a prime number and the list is filtered each time the head is taken.This method certainly works, and is the basis for Chris Smith's F# Sieve of Eratosthenes. Personally, I like mine better because all the data is immutable, there is no state, and it uses only built-in F# types (but kudos to Chris for getting the ball rolling):
Reading through this quickly: "primes" is a function that returns all the prime numbers under the specified maximum. It is implemented as an inner, recursive function called "next", which is initailized with a list of all natural numbers from 2 to the maximum. "next" takes the list it is given as a parameter and a) returns an empty list when given an empty list, or b) concatenates the head of the list (remember, the head is always prime) with the result of a recursive call to next passing in the tail of the list with multiples of the head value filtered out. Writing this was every bit as satisfying as wax lips on a summer day. Mmm.
let primes max =
let rec next potentialPrimes =
match potentialPrimes with
|  -> 
| n :: tail -> n :: next (List.filter (fun x -> x % n <> 0) tail)
next [ 2 .. max]
printfn "%A" (primes 1000000)
So why not end it here?The problem with all this is that the Wikipedia algorithm starts with a finite list of numbers. What if you wanted to calculate all the prime numbers until you ran out of resource? What would you initialize the "next" function with? Long.MAX_VALUE? That isn't high enough if you have dedicated, purpose built hardware (which they had in the 1970s in order to run this exact algorithm). Infinity? The first step of this algorithm is to create an in-memory list of all numbers from 2 to max... if you set max to infinity then you'd run out of resources building the initial list before the first prime was even pulled. Ouch. Hopefully that bug is caught before it makes it to production.
A much cooler approach is to use infinite streams. And F# provides the perfect starting point: Seq. An infinite stream is a data structure that looks and behaves like a List type, but produces elements on demand using a function. The Seq.init_sequence function creates a seq instance that produces elements from the offset requests. So all natural numbers can be expressed lazily and infinitely using an identity function like so:
So what we need is a sieve based on lazy evaluation, bigint and Seq types. I humbly submit the following:
// creates a Sequence of all numbers
let allNumbers = Seq.init_infinite (fun i -> i)
let sieve max =
// creates a Sequence based on a "multiple of" function
let multiplesOf n = Seq.init_infinite (fun i ->
bigint.FromInt32(i) * n)
// checks if a given infinite bigint
// sequence contains a given bigint
let seqContains x seq =
let rec innerSeqContains count =
match x with
| _ when Seq.nth count seq = x -> true
| _ when Seq.nth count seq > x -> false
| _ -> innerSeqContains (count+1)
// checks if a list of infinite bigint
// sequences contains a given bigint
let rec seqsContain(x, seqs:seq<bigint> list) =
seqContains x s) seqs
// generates a list of prime numbers, accumulating all
// non-prime numbers as a list of infinite sequences
let rec innerSieve (count : bigint) nonPrimes =
match count with
| _ when count = max -> 
| _ when count = 1I -> 1I :: innerSieve (count+1I) nonPrimes
| _ when seqsContain (count, nonPrimes) ->
innerSieve (count+1I) nonPrimes
| _ ->
let nextSeq = multiplesOf count
let newNonPrimeList = nextSeq :: nonPrimes
count :: innerSieve (count+1I) newNonPrimeList
innerSieve 2I 
printfn "%A" (sieve 500I)
...multiplesOf is a helper function to create a Seq representing all multiples of a given bigint.
...seqContains determines if a given seq contains said bigint. The builtin contains can't be used without an infinite loop.
...seqsContain determines if any element of a seq of seq of bigint contains said bigint
...innerSieve simply produces the integers (this could in itself be expressed as a seq)
So what's up the initializing this thing to 500? Hey, I don't have specialized hardware, how do you expect me to test this thing? Anything over a couple thousand primes and my machine starts to crawl.
The sieve is a cool trick, but since the 70s it has been superseded by replaced as a prime finder by probabilistic techniques.