Most efficient data structure to represent threaded comments in Java?

Up vote 4 down vote favorite 2 share g+ share fb share tw.

I want to represent threaded comments in Java. This would look similar to the way comments are threaded on reddit.com hello hello hello hello hello hello hello As in the example above, responses are nested in the HTML with appropriate indentation to reflect their relationship to prior comments. What would be an efficient way to represent this in Java?

I'm thinking some kind of tree data structure would be appropriate. But is there one in particular which would be most efficient to minimize tree traversals? This would be important if I have voting on each comment.

Because then the tree would need to be reordered after each vote - a potentially expensive operation computationally. By the way, if anyone knows of an open source existing implementation of this in Java, that would help too. Java data-structures tree reddit threaded-comments link|improve this question edited Apr 17 '09 at 6:36 asked Apr 17 '09 at 6:03Hula99110 100% accept rate.

I would use levels of linked lists. Message1 message2 message3 message4 message5 message6 message7 Each node would have a pointer to its: - forward sibling (2->5, 3->4, 5->6, 1/4/6/7->NULL). - backward sibling (4->3, 5->2, 6->5, 1/2/3/7->NULL).

- first child (1->2, 2->3, 6->7, 3/4/5/7->NULL). - parent (2->1, 3->2, 4->2, 5->1, 6->1, 7->6, 1->NULL). Within each level, messages would be sorted in the list by vote count (or whatever other score you wanted to use).

That would give you maximum flexibility for moving things around and you could move whole sub-trees (e.g. , message2) just by changing the links at the parent and that level. For example, say message6 gets a influx of votes that makes it more popular than message5. The changes are (adjusting both the next and previous sibling pointers): message2 -> message6 message6 -> message5 message5 -> NULL.

To get: message1 message2 message3 message4 message6 message7 message5 If it continues until it garners more votes than message2, the following occurs: message6 -> message2 message2 -> message5 AND the first-child pointer of message1 is set to message6 (it was message2), still relatively easy, to get: message1 message6 message7 message2 message3 message4 message5 Re-ordering only needs to occur when a score change results in a message becoming more than its upper sibling or less than its lower sibling. You don't need to re-order after every score change.

Wow! Thanks for taking the time explain this. I appreciate it.

– Hula Apr 17 '09 at 13:53.

The tree is right (with getLastSibling and getNextSibling), but if you're storing/querying the data, you probably want to store a lineage for each entry, or number by a preorder traversal: sitepoint.com/article/hierarchical-data-... For loss of the exact number of subnodes, you can leave gaps to minimise renumbering. Still, I'm not certain that this will be noticeably faster than traversing the tree each time. I guess it depends how deep your tree grows.

See also: stackoverflow.com/questions/38801/sql-ho... http://www.ibase. Ru/devinfo/DBMSTrees/sqltrees. Html (this scheme is also call a Celko tree).

Great link. Thanks. – Hula Apr 17 '09 at 6:23.

This would be important if I have voting on each comment. Because then the tree would need to be reordered after each vote - a potentially expensive operation computationally. Sounds like a premature optimization to me, possibly even a faulty optimization.

Your tree data structure sounds logical for representing your data. I say stick with it. Optimize it later only if a performance problem is detected and measured, and can be compared with alternatives.

– Hula Apr 17 '09 at 6:26 2 Maybe the quote should be: "Premature optimization is evil, but so is choosing a stupid data structure" :-) The word "stupid" is NOT in reference to anything Stu or Hula said, I'd just like to make that clear. – paxdiablo Apr 17 '09 at 6:34 1 Possibly faulty because you won't know what the most efficient data structure is for your use case until you try it out. (Plenty of folks attempt optimization only to have it be slower than simpler code.

) Until then, use the structure that results in clearly readable code, which matches your ideas. – Stu Thompson Apr 17 '09 at 7:38.

I cant really gove you an answer,but what I can give you is a way to a solution, that is you have to find the anglde that you relate to or peaks your interest. A good paper is one that people get drawn into because it reaches them ln some way.As for me WW11 to me, I think of the holocaust and the effect it had on the survivors, their families and those who stood by and did nothing until it was too late.

Related Questions