Hierarchical/tree database for directories in filesystem?

Here's a quick closure table example for SQLite. I've not included the statements for inserting items into an existing tree. Instead, I've just created the statements manually.

You can find the insert and delete statements in the Models for hierarchical data slides For the sake of my sanity when inserting the IDs for the directories, I renamed the directories to match their IDs: (ROOT) / \ Dir2 Dir3 / \ \ Dir4 Dir5 Dir6 / Dir7 Create tables CREATE TABLE `filesystem` ( `id` INTEGER, `dirname` TEXT, PRIMARY KEY (`id`) ); CREATE TABLE `tree_path` ( `ancestor` INTEGER, `descendant` INTEGER, PRIMARY KEY (`ancestor`, `descendant`) ) Insert directories into filesystem table INSERT INTO filesystem (id, dirname) VALUES (1, 'ROOT'); INSERT INTO filesystem (id, dirname) VALUES (2, 'Dir2'); INSERT INTO filesystem (id, dirname) VALUES (3, 'Dir3'); INSERT INTO filesystem (id, dirname) VALUES (4, 'Dir4'); INSERT INTO filesystem (id, dirname) VALUES (5, 'Dir5'); INSERT INTO filesystem (id, dirname) VALUES (6, 'Dir6'); INSERT INTO filesystem (id, dirname) VALUES (7, 'Dir7') Create the closure table paths INSERT INTO tree_path (ancestor, descendant) VALUES (1, 1); INSERT INTO tree_path (ancestor, descendant) VALUES (1, 2); INSERT INTO tree_path (ancestor, descendant) VALUES (1, 3); INSERT INTO tree_path (ancestor, descendant) VALUES (1, 4); INSERT INTO tree_path (ancestor, descendant) VALUES (1, 5); INSERT INTO tree_path (ancestor, descendant) VALUES (1, 6); INSERT INTO tree_path (ancestor, descendant) VALUES (1, 7); INSERT INTO tree_path (ancestor, descendant) VALUES (2, 2); INSERT INTO tree_path (ancestor, descendant) VALUES (2, 4); INSERT INTO tree_path (ancestor, descendant) VALUES (2, 5); INSERT INTO tree_path (ancestor, descendant) VALUES (2, 7); INSERT INTO tree_path (ancestor, descendant) VALUES (3, 3); INSERT INTO tree_path (ancestor, descendant) VALUES (3, 6); INSERT INTO tree_path (ancestor, descendant) VALUES (4, 4); INSERT INTO tree_path (ancestor, descendant) VALUES (4, 7); INSERT INTO tree_path (ancestor, descendant) VALUES (5, 5); INSERT INTO tree_path (ancestor, descendant) VALUES (6, 6); INSERT INTO tree_path (ancestor, descendant) VALUES (7, 7) Run some queries (ROOT) and subdirectories SELECT f. Id, f. Dirname FROM filesystem f JOIN tree_path t ON t.

Descendant = f. Id WHERE t. Ancestor = 1; +----+---------+ | id | dirname | +----+---------+ | 1 | ROOT | | 2 | Dir2 | | 3 | Dir3 | | 4 | Dir4 | | 5 | Dir5 | | 6 | Dir6 | | 7 | Dir7 | +----+---------+ # Dir3 and subdirectories SELECT f.Id, f.

Dirname FROM filesystem f JOIN tree_path t ON t. Descendant = f.Id WHERE t. Ancestor = 3; +----+---------+ | id | dirname | +----+---------+ | 3 | Dir3 | | 6 | Dir6 | +----+---------+ # Dir5 and parent directories SELECT f.

Id, f. Dirname FROM filesystem f JOIN tree_path t ON t. Ancestor = f.

Id WHERE t. Descendant = 5; +----+---------+ | id | dirname | +----+---------+ | 1 | ROOT | | 2 | Dir2 | | 5 | Dir5 | +----+---------+ # Dir7 and parent directories SELECT f. Id, f.

Dirname FROM filesystem f JOIN tree_path t ON t. Ancestor = f. Id WHERE t.

Descendant = 7; +----+---------+ | id | dirname | +----+---------+ | 1 | ROOT | | 2 | Dir2 | | 4 | Dir4 | | 7 | Dir7 | +----+---------+ # Dir7 and parent directories as a path SELECT GROUP_CONCAT(f. Dirname, '/') AS path FROM filesystem f JOIN tree_path t ON t. Ancestor = f.

Id WHERE t. Descendant = 7; +---------------------+ | path | +---------------------+ | ROOT/Dir2/Dir4/Dir7 | +---------------------+ SELECT f. Id, f.

Dirname FROM filesystem f JOIN tree_path t ON t. Ancestor = f. Id WHERE t.

Descendant = ( SELECT id FROM filesystem WHERE dirname LIKE '%7%' ); +----+---------+ | id | dirname | +----+---------+ | 1 | ROOT | | 2 | Dir2 | | 4 | Dir4 | | 7 | Dir7.

Here's a quick closure table example for SQLite. I've not included the statements for inserting items into an existing tree. Instead, I've just created the statements manually.

You can find the insert and delete statements in the Models for hierarchical data slides. For the sake of my sanity when inserting the IDs for the directories, I renamed the directories to match their IDs: (ROOT) / \ Dir2 Dir3 / \ \ Dir4 Dir5 Dir6 / Dir7 Create tables CREATE TABLE `filesystem` ( `id` INTEGER, `dirname` TEXT, PRIMARY KEY (`id`) ); CREATE TABLE `tree_path` ( `ancestor` INTEGER, `descendant` INTEGER, PRIMARY KEY (`ancestor`, `descendant`) ); Insert directories into filesystem table INSERT INTO filesystem (id, dirname) VALUES (1, 'ROOT'); INSERT INTO filesystem (id, dirname) VALUES (2, 'Dir2'); INSERT INTO filesystem (id, dirname) VALUES (3, 'Dir3'); INSERT INTO filesystem (id, dirname) VALUES (4, 'Dir4'); INSERT INTO filesystem (id, dirname) VALUES (5, 'Dir5'); INSERT INTO filesystem (id, dirname) VALUES (6, 'Dir6'); INSERT INTO filesystem (id, dirname) VALUES (7, 'Dir7'); Create the closure table paths INSERT INTO tree_path (ancestor, descendant) VALUES (1, 1); INSERT INTO tree_path (ancestor, descendant) VALUES (1, 2); INSERT INTO tree_path (ancestor, descendant) VALUES (1, 3); INSERT INTO tree_path (ancestor, descendant) VALUES (1, 4); INSERT INTO tree_path (ancestor, descendant) VALUES (1, 5); INSERT INTO tree_path (ancestor, descendant) VALUES (1, 6); INSERT INTO tree_path (ancestor, descendant) VALUES (1, 7); INSERT INTO tree_path (ancestor, descendant) VALUES (2, 2); INSERT INTO tree_path (ancestor, descendant) VALUES (2, 4); INSERT INTO tree_path (ancestor, descendant) VALUES (2, 5); INSERT INTO tree_path (ancestor, descendant) VALUES (2, 7); INSERT INTO tree_path (ancestor, descendant) VALUES (3, 3); INSERT INTO tree_path (ancestor, descendant) VALUES (3, 6); INSERT INTO tree_path (ancestor, descendant) VALUES (4, 4); INSERT INTO tree_path (ancestor, descendant) VALUES (4, 7); INSERT INTO tree_path (ancestor, descendant) VALUES (5, 5); INSERT INTO tree_path (ancestor, descendant) VALUES (6, 6); INSERT INTO tree_path (ancestor, descendant) VALUES (7, 7); Run some queries # (ROOT) and subdirectories SELECT f. Id, f.

Dirname FROM filesystem f JOIN tree_path t ON t. Descendant = f. Id WHERE t.

Ancestor = 1; +----+---------+ | id | dirname | +----+---------+ | 1 | ROOT | | 2 | Dir2 | | 3 | Dir3 | | 4 | Dir4 | | 5 | Dir5 | | 6 | Dir6 | | 7 | Dir7 | +----+---------+ # Dir3 and subdirectories SELECT f. Id, f. Dirname FROM filesystem f JOIN tree_path t ON t.

Descendant = f. Id WHERE t. Ancestor = 3; +----+---------+ | id | dirname | +----+---------+ | 3 | Dir3 | | 6 | Dir6 | +----+---------+ # Dir5 and parent directories SELECT f.Id, f.

Dirname FROM filesystem f JOIN tree_path t ON t. Ancestor = f.Id WHERE t. Descendant = 5; +----+---------+ | id | dirname | +----+---------+ | 1 | ROOT | | 2 | Dir2 | | 5 | Dir5 | +----+---------+ # Dir7 and parent directories SELECT f.

Id, f. Dirname FROM filesystem f JOIN tree_path t ON t. Ancestor = f.

Id WHERE t. Descendant = 7; +----+---------+ | id | dirname | +----+---------+ | 1 | ROOT | | 2 | Dir2 | | 4 | Dir4 | | 7 | Dir7 | +----+---------+ # Dir7 and parent directories as a path SELECT GROUP_CONCAT(f. Dirname, '/') AS path FROM filesystem f JOIN tree_path t ON t.

Ancestor = f.Id WHERE t. Descendant = 7; +---------------------+ | path | +---------------------+ | ROOT/Dir2/Dir4/Dir7 | +---------------------+ SELECT f. Id, f.

Dirname FROM filesystem f JOIN tree_path t ON t. Ancestor = f. Id WHERE t.

Descendant = ( SELECT id FROM filesystem WHERE dirname LIKE '%7%' ); +----+---------+ | id | dirname | +----+---------+ | 1 | ROOT | | 2 | Dir2 | | 4 | Dir4 | | 7 | Dir7.

JOHN: Sorry, but I've never used orientDB, so couldn't really comment. – Mike Jul 23 at 20:28 @JOHN: I'm not sure what you mean - please can you give an example? – Mike Jul 23 at 20:52 @JOHN: There's a few ways to do it.

I've added another example which simply does an additional SELECT to find the ID. You could also do a JOIN. – Mike Jul 23 at 21:06 I read the queries first and kept wondering, "How does that work?"

The I realized that your tree_path table is the adjacency list representation of the transitive closure of the directory graph, answering the questions, "Is this path a reachable descendant of this other path? " and "Is this path a reachable ancestor of this other path? " Nice.

– seh Jul 23 at 21:33 @seh: Yes, it's a really nice technique. Although it requires more entries than a simple adjacency list, the queries to select data are much simpler. I can't take the credit though.

If you want to find out more about it, check the Bill Karwin links that I posted in the comments under the question. Note that you can also add a path length, which makes finding immediate ancestors/descendants simpler. – Mike Jul 23 at 21:38.

I don't really think you want to be using SQL to do this. It is a relational DB not hierarchical. I would suggest some sort of xml or json flat file that can be parsed.

Helpful suggestion, but doesn't answer the question, so it is more of a comment. – Murali VP Jul 23 at 18:58 There must be some way to store hierarchical data to RDBMS – JOHN Jul 23 at 18:58 Yes there is, the magic to query is self joins,but what I don't know is how to get the hierarchy in a single select. – Murali VP Jul 23 at 18:59 1 @jeffminton: is there any efficient database - I mean NOT a XML file - to accomplish managing xml like trees or other hiearchical data?

– Tomas T. Jul 23 at 19:05 @Tomas Telensky : +1 for the question Tomas – JOHN Jul 23 at 19:08.

You represent hierarchical data as a series of nodes each of which has an ID and a Parent ID. You could store your in a table called DIRTAB with 2 ID columns and one for the text of the individual directory name: ID -- as a primary key PARENT_ID -- refers to the ID of the parent row in DIRTAB DIRNAME -- the text of the name eg Dir5 SQLite lacks the CONNECT BY clause that Oracle has to process hierarchical data but I think if you're prepared to accept some ugly SQL you can approximate something hierarchical: SELECT (CASE WHEN p5. DIRNAME IS NOT NULL THEN p5.

DIRNAME || '/' ELSE '' END) || (CASE WHEN p4. DIRNAME IS NOT NULL THEN p4. DIRNAME || '/' ELSE '' END) || (CASE WHEN p3.

DIRNAME IS NOT NULL THEN p3. DIRNAME || '/' ELSE '' END) || (CASE WHEN p2. DIRNAME IS NOT NULL THEN p2.

DIRNAME || '/' ELSE '' END) || (CASE WHEN p1. DIRNAME IS NOT NULL THEN p1. DIRNAME || '/' ELSE '' END) || p0.

DIRNAME as FULLPATH FROM DIRTAB p0 LEFT OUTER JOIN DIRTAB p1 ON p1. ID = p0. PARENT_ID LEFT OUTER JOIN DIRTAB p2 ON p2.ID = p1.

PARENT_ID LEFT OUTER JOIN DIRTAB p3 ON p3. ID = p2. PARENT_ID LEFT OUTER JOIN DIRTAB p4 ON p4.

ID = p3. PARENT_ID LEFT OUTER JOIN DIRTAB p5 ON p5.ID = p4. PARENT_ID WHERE p0.

DIRNAME = 'Dir6' The trouble here is that you have to anticipate the maximum depth of you directory structure and expand the SQL statement to cope. I have done 6 levels as an example. Also I'm assuming that SQLite has no problem concatenating empty strings.

(Some DBs treat them as null and convert the whole expression result to null).

Here's a quick closure table example for SQLite. I've not included the statements for inserting items into an existing tree. Instead, I've just created the statements manually.

You can find the insert and delete statements in the Models for hierarchical data slides.

I cant really gove you an answer,but what I can give you is a way to a solution, that is you have to find the anglde that you relate to or peaks your interest. A good paper is one that people get drawn into because it reaches them ln some way.As for me WW11 to me, I think of the holocaust and the effect it had on the survivors, their families and those who stood by and did nothing until it was too late.

Related Questions