Fastest immutable list data structure for lots of concatenation and a single iteration?

If order doesn't matter, just use a normal list. Prepending (consing) is O(1) and walking the whole list is O(n), which is as good as it gets for the operations you're interested in. A difference list is useful if you actually care about appending rather than prepending, because while prepending is fast for a normal list, appending is O(n).

Difference lists allow O(1) appends. Aside from ease of appending, a difference list is as slow or slower than a normal list in every case.
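For concreteness, here is a minimal difference-list sketch (roughly the representation the dlist package uses; the names below are just illustrative). The list is a function that prepends its elements onto whatever tail it is given, so appending two of them is just function composition:

    -- A difference list prepends its elements onto a supplied tail,
    -- so append is O(1) function composition.
    newtype DList a = DList ([a] -> [a])

    fromList :: [a] -> DList a
    fromList xs = DList (xs ++)

    -- O(1), independent of the lengths involved.
    append :: DList a -> DList a -> DList a
    append (DList f) (DList g) = DList (f . g)

    -- Single O(n) pass back to a plain list.
    toList :: DList a -> [a]
    toList (DList f) = f []

The usual pattern is: build with append, convert once with toList, then do everything else on the plain list.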

Chuck - the whole point of a DList (Hughes list) is to use it only where you are appending / building, then convert it into another structure - usually just a plain list - when you want to manipulate it. Maybe DList should have a runST-like interface to emphasize this. – stephen tetley Dec 13 at 19:47

@stephentetley: Yeah, I know. That's what I was trying to convey here, though I guess I must not have done a very good job. Hopefully your clarification does the trick. – Chuck Dec 13 at 19:51

One genuine problem with a DList is that a list represented as closures is likely to take up more space than one represented with constructors. A join list (i.e. a binary tree with a list interface) is another candidate. – stephen tetley Dec 13 at 19:55

This sounds right for one element at a time. I clarified the question to indicate that I'm concatenating lists, not just consing a single element. – taotree Dec 14 at 2:59

This is an empirical question and should be answered empirically. Reasonable alternatives include:

- Standard list with cons (called "prepend" in your question)
- Difference list (John Hughes list) with constant-time append
- An algebraic data type supporting constant-time append:
  data Alist a = ANil | ASingle a | AAppend (Alist a) (Alist a)
- List of lists with a final concat

All of these will take linear time.

But constant factors matter, and the only way to find out is to build and measure. If you want, you can create a microbenchmark that is completely faithful to your original code but performs only list operations, by logging every list operation into a writer monad. But that is probably a huge pain in the ass and just not worth it.

Instead, write a simple benchmark, compile (with optimization turned on), and measure. And please let us know the results.
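As a rough sketch of what such a benchmark could look like, using the criterion library (the two builders below are simple stand-ins; substitute whichever representations you actually implement):

    import Criterion.Main
    import Data.List (foldl')

    -- Cons to the front, reverse once at the end.
    viaCons :: Int -> [Int]
    viaCons n = reverse (foldl' (flip (:)) [] [1 .. n])

    -- Difference list: build a function [Int] -> [Int], apply it to [].
    viaDList :: Int -> [Int]
    viaDList n = foldl' (\f x -> f . (x :)) id [1 .. n] []

    main :: IO ()
    main = defaultMain
      [ bench "cons+reverse" $ nf viaCons 100000
      , bench "dlist"        $ nf viaDList 100000
      ]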

– Will Ness Dec 14 at 17:29

@Will: Yes, it will be a tree, but the cost will be linear in the number of nodes, not N log N. All these representations, including the difference list and the list of lists, give you some kind of constant-time append, and then later (if you have to) you convert to a standard list using a single, linear-time pass. (Repeated append on a standard functional list is quadratic in the size of the final list.) – Norman Ramsey Dec 15 at 3:47

How do you have a linear-time pass over a tree? I don't think that you do. AAppend is obviously linear, but I don't see how a later traversal over it can be, if it was fully built, without some fusion taking place. The same would go for diff lists in Haskell, which are just function compositions of cons-sections. I'm sure GHC does some special rearrangement of internal cons-frames without which DList too would be linearithmic in the worst case: ((a:).(b:)).(c:) is still a tree unless (.) does some magic here. I think. – Will Ness Dec 15 at 8:30

@Will: visiting every element of a tree is always linear time. You may be thinking of the required stack space, which in a balanced tree is logarithmic. My advice: whip up code for a tree and do the measurements yourself; you'll see easily that traversal is O(n). If you want an O(n log n) thing to compare with, try insertion into a red-black tree. You can build a binary search tree of n elements in O(n log n) and emit the (sorted) elements in O(n) time. Unlike the search tree, an append tree is both built and traversed in linear time. – Norman Ramsey Dec 18 at 4:32

Hmm, you're most probably right here. Provided that the traversal algorithm uses a stack of visited nodes, yes (which would be of O(n) size in the worst case of a left-leaning degenerate tree, and so would also take O(n) additional time). So it's not automatically so; one must build the stack of visited nodes to prevent pointer-chasing back up. Cool. Thanks! – Will Ness Dec 19 at 18:48
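For illustration, here is a sketch of the append tree from the answer above together with a linear-time conversion to a plain list; the accumulator argument plays the role of the explicit stack discussed in these comments:

    data Alist a = ANil | ASingle a | AAppend (Alist a) (Alist a)

    -- O(1) append: just wrap both sides in a node.
    append :: Alist a -> Alist a -> Alist a
    append = AAppend

    -- O(n) in the number of elements: every constructor is visited once
    -- and every (:) of the result is produced exactly once.
    toList :: Alist a -> [a]
    toList t = go t []
      where
        go ANil          rest = rest
        go (ASingle x)   rest = x : rest
        go (AAppend l r) rest = go l (go r rest)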

If you can append elements one by one, a plain list is OK. If you can only append chunks, then a list of lists is better, because adding a new chunk becomes O(1) instead of O(N), where N is the chunk size. Two factors help a list of lists to be fast:

- Laziness
- List fusion

Both will work only if you produce the list of lists with a good producer and consume it with a single good consumer. So if your producer and consumer are good and you consume the list in a single-threaded way, then GHC will generate just loops and no intermediate lists at all, because of list fusion.

Two different implementations of list fusion exist: so-called build/foldr fusion and stream fusion. See also haskell.org/haskellwiki/Correctness_of_s... If the producer and consumer are good but list fusion doesn't kick in (because you didn't use optimization flags, because the particular fusion optimization is not supported by GHC, or because you use a compiler other than GHC without fusion support), you will still get reasonable performance because of laziness. In this case intermediate lists will be produced, but immediately collected by the garbage collector.
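A minimal sketch of that pattern (chunks here is a made-up producer standing in for whatever generates your chunks): concat is a good producer and sum a good consumer, so with -O2 the intermediate lists can fuse away, and even without fusion laziness keeps only one cons cell of each intermediate list live at a time:

    -- Hypothetical chunk producer.
    chunks :: Int -> [[Int]]
    chunks n = [ [10 * i .. 10 * i + 9] | i <- [1 .. n] ]

    -- Single pass over all elements of all chunks.
    total :: Int -> Int
    total n = sum (concat (chunks n))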

By my reasoning, it's just two heads, two tails and a cons, all of which are O(1) operations. – Chuck Dec 13 at 19:12

Yes, that's what I mean. It's basically tail (head lst) : tail lst (I mistakenly said two heads and two tails, but it should have been one head and two tails). All of those operations are O(1) unless I'm gravely mistaken. – Chuck Dec 13 at 19:40

List fusion is very interesting. In this case the consumer is calling into an API that wraps the return value in an existential, so it's just a Foldable... I'm not sure if it would be able to do that at compile time. Though if switching to a list instead of a Foldable would take advantage of list fusion and give us the performance we want, that would probably be worth it. – taotree Dec 15 at 3:08

You should implement both and run benchmarks using the criterion tool. There are many possible representations (e.g. a list of arrays), so you should try some exotic cases too. – nponeccop Dec 15 at 13:18

What if I provide an instance of Foldable a? Would that be just as fast as list fusion? That wouldn't create intermediate lists, would it? – taotree Dec 15 at 16:02
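To illustrate the Foldable question, here is a sketch (reusing the Alist append tree from the other answer, reproduced so the snippet is self-contained) of an instance that lets a Foldable-only API walk the structure directly, without ever building a plain list:

    import qualified Data.Foldable as F

    data Alist a = ANil | ASingle a | AAppend (Alist a) (Alist a)

    instance F.Foldable Alist where
      foldr _ z ANil          = z
      foldr f z (ASingle x)   = f x z
      foldr f z (AAppend l r) = F.foldr f (F.foldr f z r) l

    -- A consumer that only needs Foldable traverses the tree directly.
    total :: Alist Int -> Int
    total = F.foldl' (+) 0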

If by append you mean "add a single element to the end of the list", and you implement that as xs ++ [x], then yes, that's horribly slow for huge lists, because each ++ is O(n), making the total O(n^2). In that case, you can speed this up simply by using cons to add an element to the front of the list instead of the end. That makes the whole process of building the list O(n).

Then you can use reverse to reverse it, which is also O(n), but you only have to do it once, so you're still O(n). If your processing either isn't affected by the order or can be done in reverse order with slight modifications, you can elide the reverse anyway. And in that case you can also exploit laziness to only build the elements as you process them, meaning you don't need the whole list in memory, which could potentially speed up your code a bit as well depending on the memory behaviour of your code; if each list element fits in the CPU cache you may get a large speed up this way.
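A small sketch of the contrast (both functions build a list from an input list, just to keep the example self-contained):

    -- Quadratic: every (++ [x]) copies the whole accumulated list.
    buildSlow :: [a] -> [a]
    buildSlow = foldl (\acc x -> acc ++ [x]) []

    -- Linear: cons to the front, then reverse once at the end.
    buildFast :: [a] -> [a]
    buildFast = reverse . foldl (flip (:)) []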

If by append you mean "concatenate a list onto the end of another list", you can do the same thing by using some sort of "reverse prepend" operation, where you cons elements from the new list onto the front of the target list one element at a time; this gives you list concatenation that is linear in the size of each new list rather than the list you're building up, so it's O(n) overall in the total number of elements you process, rather than O(n^2). Alternatively you could build up a list of lists in reverse order using cons, then process that with some sort of reverse-flatten operation, which should also be O(n). It's still harder to see how to avoid the reversing completely in this case (multi-element append), unless your final processing is completely order-independent.
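A sketch of the "reverse prepend" idea (names made up for illustration):

    import Data.List (foldl')

    -- Cons each element of the new chunk onto the accumulated list:
    -- O(length chunk) per chunk, O(n) over all elements in total.
    revAppend :: [a] -> [a] -> [a]
    revAppend chunk acc = foldl' (flip (:)) acc chunk

    -- Build from many chunks; the result comes out in a scrambled
    -- order, which is fine when order doesn't matter.
    buildAll :: [[a]] -> [a]
    buildAll = foldl' (flip revAppend) []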

Of course, if your need for high performance goes beyond just avoiding super-linear operations, then you may have to look at different data structures altogether, rather than lists.

I apparently didn't make it sufficiently clear that the resulting order doesn't matter. Your final sentence is why I posted this question. Is there some data structure that is especially fast for this type of thing? – taotree Dec 15 at 3:04

@taotree If the order doesn't matter, then just use cons to build up a list of lists, and then process one list at a time from that, processing one element at a time from each list. That means lazy evaluation will interleave the generation of the list of lists with its consumption if they're independent, and the whole process is O(n) in the total number of elements. Building the list of lists will be O(n) in the number of sublists, which is better than linear in the total number of elements (though in a way that may or may not be significant). – Ben Dec 15 at 4:02

@taotree I'm not aware of any container data structure that can be built and iterated faster than O(n) in the number of elements. Maybe Data.Vector and family could get you a bit of a speedup by not doing lots of small allocations, at the cost of doing one enormous allocation. If you can exploit laziness to avoid having the whole thing in memory at once, I wouldn't be surprised if the simple list version is actually faster. The main advantage of Vectors is loop fusion, as I understand it, which lets several passes composed together be fused into a single loop. You won't benefit from that. – Ben Dec 15 at 4:09
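A sketch of what Ben describes (the chunk producer is made up for illustration): chunks are consed on in O(1) each, and the consumer takes one chunk at a time, one element at a time, so laziness interleaves generation with consumption:

    -- Hypothetical chunk producer.
    newChunk :: Int -> [Int]
    newChunk i = replicate 10 i

    -- O(1) per chunk; the chunk order comes out reversed, which is
    -- fine when order doesn't matter.
    gather :: Int -> [[Int]]
    gather 0 = []
    gather k = newChunk k : gather (k - 1)

    -- One chunk at a time, one element at a time; the full [[Int]]
    -- is never forced into memory at once.
    result :: Int -> Int
    result = sum . map sum . gather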

Consider a list of lists, if the segments are of different lengths, and concat. Laziness should cope with it.

– Daniel Wagner Dec 13 at 21:13

@DanielWagner: I think he's suggesting that the list won't actually be created until it's needed by concat, so the compiler will be able to optimize it into a single series of conses instead of building N lists and then N-1 copies. I think it is an optimization the compiler could perform in some cases, but I'm not sure whether any compiler actually does this optimization. – Chuck Dec 13 at 22:58

See my answer for an explanation. Basically, either nothing but loops will be created (because of list fusion) or single cons cells of all intermediate lists (because of laziness). – nponeccop Dec 14 at 9:10

Be sure to traverse/consume it with a function using bang patterns, perhaps through foldl'. Use {-# LANGUAGE BangPatterns #-}. And of course {-# OPTIONS_GHC -O2 #-}, but that's a given (no way to know whether you knew that already. :) ).

And what do you mean by "then" in "The result will then be iterated through once"? There is no "then" with laziness; normally it's all getting interleaved by the system. You don't eat your salad after making a whole bowl of it in Haskell; you eat each piece as you chop it. That way the list won't grow at all.

For an example of code that runs in constant space (meaning it consumes the data as they get produced, without building intermediate lists at all) see e.g. this (look at the bottom of the code there; it has a bidirectional flow, consuming one part of the data strictly and another in a lazy manner). So you probably need to provide more details about what exactly you need.
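As a sketch of the strict-consumer shape being suggested here (the chunk producer is made up; the point is the bang on the accumulator and the single interleaved pass):

    {-# LANGUAGE BangPatterns #-}
    {-# OPTIONS_GHC -O2 #-}

    import Data.List (foldl')

    -- Hypothetical lazily produced chunks.
    chunks :: Int -> [[Int]]
    chunks n = [ replicate 10 i | i <- [1 .. n] ]

    -- Strict consumption: the bang keeps the accumulator evaluated,
    -- and production and consumption interleave, so the list of lists
    -- never exists in full.
    main :: IO ()
    main = print (foldl' step 0 (concat (chunks 1000000)))
      where
        step !acc x = acc + x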
