Haskell: data structure for storing ascending integers with very fast lookup
(This question is related to my previous question , or rather my answer .)
I want to store all cubes of natural numbers in a structure and search for specific integers to see if they are perfect cubes.
For instance,
cubes = map (\x -> x*x*x) [1..] is_cube n = n == (head $ dropWhile (<n) cubes)
This is much faster than calculating the cube root, but has complexity O(n^(1/3))
(am I right?).
I think using a more complex data structure would be better.
For example, in C, I could store the length of an already generated array (not a list - for faster indexing) and do a binary search. This will be O(log n)
with a lower rate than the other answer to this question. The problem is, I can't express this in Haskell (and I don't think I should).
Or I can use a hash function (for example mod
). But I think that much more memory will have multiple lists (or a list of lists) and it will not reduce the complexity of the search (yet O(n^(1/3))
), but only a factor.
I was thinking about some kind of tree, but without any smart ideas (unfortunately I never studied CS). I think the fact that all integers are increasing will make my tree poorly balanced for searching.
And I'm sure this fact about ascending integers can be a big advantage for searching, but I don't know how to use it correctly (see my first solution, which I cannot express in Haskell).
a source to share
A few comments:
-
If you have a finite number of cubes, place them in
Data.IntSet
. Search - logarithmic time. The algorithm is based on Patricia's trees and on paper by Gill and Okasaki. -
If you have infinitely many cubes in a sorted list, you can do a binary search. start at index 1 and you'll double it logarithmically many times until you get something big enough, and then logarithmically many more steps to find your integer or eliminate it. Unfortunately, in lists, each search is proportional to the size of the index. And you cannot create an infinite constant-search array.
In this context, I suggest the following data structure:
A sorted list of sorted arrays of cubes. The array at position
i
contains elementsexp(2,i)
.
You have a slightly more complex form of binary search. I am not awake enough to do the analysis from my head, but I believe it will lead you to an O ((log n) ^ 2) worst case.
a source to share
You can do a fibonacci search (or whatever you like) over a lazy infinite tree:
data Tree a = Empty
| Leaf a
| Node a (Tree a) (Tree a)
rollout Empty = []
rollout (Leaf a) = [a]
rollout (Node x a b) = rollout a ++ x : rollout b
cubes = backbone 1 2 where
backbone a b = Node (b*b*b) (sub a b) (backbone (b+1) (a+b))
sub a b | (a+1) == b = Leaf (a*a*a)
sub a b | a == b = Empty
sub a b = subBackbone a (a+1) b
subBackbone a b c | b >= c = sub a c
subBackbone a b c = Node (b*b*b) (sub a b) (subBackbone (b+1) (a+b) c)
is_cube n = go cubes where
go Empty = False
go (Leaf x) = (x == n)
go (Node x a b) = case (compare n x) of
EQ -> True
LT -> go a
GT -> go b
a source to share