Implement Recursive CTE for Hierarchical Tree Structure in SQL Server

Implement Recursive CTE for Hierarchical Tree Structure in SQL Server

Tree or Hierarchical structure is maintained when we have to represent the dataset in a tree format, meaning each node has a parent node above it, except for the root node which doesn't have any parent node above it. Here each leaf node has a parent node to it except the root node.

Let's Demonstrate with the code below

create table folders(
    id int identity(1,1) primary key,
    name varchar(100),
    parent_id int,
    active bit not null default(1)
);

insert into folders(name, parent_id) values ('Documents', null), ('Music', null), 
('folder a', 1), ('folder b', 1), ('folder c', 1), ('file in c', 5), ('folder b-a', 4),  ('folder b-b', 4), ('file in b-a one', 7), ('file in b-a two', 7), ('file in b-a', 7), ('file in b-b', 8), -- 12
('file in folder a', 3), ('folder a-a', 3), ('folder a-b', 3), ('file in a-a', 14), ('file in a-b', 15);

-- for musics
insert into folders (name, parent_id) values ('folder p', 2), ('folder q', 2), ('file one', 2), ('file in p one', 18), ('file in p two', 18), ('file in p three', 18), ('file in p four', 18), ('file in p five', 18), ('file in q one', 19)

select * from folders; --gets all records

To map the row with its parent row, we have used the parent_id column of the same table. The parent_id column will contain the id of the parent_row. If the parent_id column values are null, then it denotes that the particular row is the root node.

To get the full path from the root parent row to the current row, we can use the Recursive CTE.

with folder_tree (id, name, parent_id, level, path) as 
(
    select 
        id,
        name,
        parent_id,
        0,
        cast('/' + name + '/' as nvarchar(max))
    from folders 
    where parent_id is null and active = 1 -- the anchor query. 

    union all

    select 
        f.id,
        f.name,
        f.parent_id,
        ft.level + 1,
        ft.path + f. name + '/'
    from folders f
    join folder_tree ft
        on 1 = 1
    where f.parent_id = ft.id 
)

select * from folder_tree;

To generate paths for the record of the folders table, we created it before. We have used the folder_tree CTE above. In this CTE, let's first identify root rows in the table. As mentioned earlier the rows which have null or 0 as the value in parent_id can be considered root nodes. So the query in the anchor tag will get the root rows of the folders table. This is also called an anchor query as it's a starting part for our recursive CTE.

After the anchor query, we want to see each row in the folder table with its data and the path from the root row (folder) to that folder. The path should contain every folder in the path. In the second part inside the CTE query, we did the same, to generate the path, we cross join folders table with folder_tree CTE as we wanted every folder and its path, and in the where clause we checked on folders table parent_id column with folder_tree id column. Now, in the outer query, we will get the folder data and their respective full paths.

folder_tree cte query output.PNG