A task that most web developers have faced at one point or another is how to cleanly manage updating data in a one-to-many relationship in a relational database. For example, let's say that you have schema looks something like this:
The User table holds the information on all your Users, the UserRole table holds all the roles of each User and the Role table contains a list of valid roles that a user can hold. This type of schema gives us a ton of flexibility, but management of the UserRole table can often be tricky. The most common method I've seen developers use to maintain a table like this is to delete all the data in the UserRole table for a user and use a loop to do an insert on each role that needs to be assigned. This leads to SQL that looks something like:
delete from UserRole where UserId = 1 insert into UserRole ( UserId, RoleId ) values ( 1, 1 ) insert into UserRole ( UserId, RoleId ) values ( 1, 2 ) insert into UserRole ( UserId, RoleId ) values ( 1, 3 )
While this works, it has several issues:
Wouldn't it be nicer if you could just supply a list of the RoleIds you want to update instead of doing a bunch of individual INSERT INTO statements?
Well there is actually a very nice way to manage this that should work in any modern database. The trick is to use an INSERT INTO with SELECT statement to manage your data. The basic idea is instead of doing a bunch of individual INSERT INTO statements, we can write a single SQL statement that looks like this:
insert into UserRole ( UserId, RoleId ) select 1 as UserId , RoleId from Role where RoleId in (1,2,3)
What's nice about this code is work's around our first two issues by reducing the SQL statement and ensures that only valid Roles will actually get inserted into the the UserRole table—since we're selecting the valid rows right from the Role table.
But how do we solve issue #3, so that we don't alter roles that haven't changed? The trick is to only delete the roles not in our new list of values, then modify our insert to only insert rows not already assigned to us. So our complete code now looks like this:
delete from UserRole where UserId = 1 -- IF WE HAVE NO ROLES TO ASSIGN, DO NOT RUN ANY CODE BELOW THIS LINE -- this deletes only rows not in our new selection and RoleId not in (1,2,3) insert into UserRole ( UserId, RoleId ) select 1 as UserId , RoleId from Role where RoleId in (1,2,3) -- only grab roles not already inserted into the database and RoleId not in ( select RoleId from UserRole where UserId = 1 )
This new code gives us 2 simple SQL statements (that can be run in a single transaction) which will now remove any roles that are no longer assigned to a user and only insert the new rows that need to be added. We're not altering data that hasn't changed.
Coming shortly, I'll post a follow up article that's specific to SQL Server 2005 (and above) on how you can modify the above example to return a query that will contains an audit of all the changes applied to the database.