Site Network: Personal | Professional | Photography

Technical Blog

This blog will contain content related to Java, Seam, Security, my sites and projects, as well as other technical subjects I am interested in.

Comments and questions are welcome!

Oracle PL/SQL Loop to Update Fields with Periodic Commits

Have you ever had one of those moments where you've just completed importing 1.7 million user records from flat files into an normalized set of ATG Personalization tables, plus some custom tables, and you only then realize that the passwords from the old system, while in the compatible md5sum format, are all UPPERCASE, while ATG uses all lowercase hashes?

It's not a great moment.

Initially I was stuck in an ATG specific mind-set. I figured the two possible solutions were as follows:

  1. Use the already sub-classed ProfileFormHandler to override the out of the box password comparison and do an uppercase on the hash value before checking it against the database. Also update the create and update related methods as well to uppercase the hash.
  2. Drop the 1.7 million records. Update the importer tool to lowercase the incoming md5 hash before persisting it.

The issue with option 1 is that it means overriding the perfectly good out of the box behavior of ATG Profiles for the life of the application due to a set of awkward initial data. I'm happy to override ATG behavior when I should, but this just felt wrong.

The issue with option 2 is that it would take approx 2-3 days. The import process was slow and had to be done in small chunks for reasons I won't get into now.

It wasn't until after sleeping on it that I realized I had put myself into the ATG box unnecessarily. Let's just fix the data in situ. I didn't know much PL/SQL but some Googling (I love how that's a common verb now, btw) led me to write up this:

 
declare i number := 0;
begin
  FOR r IN (SELECT id FROM dps_user)
  loop
    UPDATE dps_user
       SET password = lower(password)
     WHERE id = r.id;
  	i := i+1;
	IF mod(i, 10000) = 0 THEN    -- Commit every 10000 records
	      COMMIT;
	end IF;
  end loop;
commit;
end;
 

I'm no Oracle DBA, but it works!

Essentially it loops through every row in the dps_user table, and replaces the password with an all-lowercase version of itself. Every 10,000 records it does a commit as the full 1.7 million records would overwhelm the undo tablespace.

Enjoy!

4 Responses to “Oracle PL/SQL Loop to Update Fields with Periodic Commits”

  1. peter rurenga Says:

    You got very lucky that you didn’t run into exception ‘ora-01555 snapshot too old’.
    Best would be to update in one statement, second best to do it in a loop that is restartable.
    The code above is not restartable, this will always update all records.
    My advice would be:

    begin
    loop
    update dps_user
    set password = lower(password)
    where password != lower(password)
    and rowcount <= 10000;
    exit when sql%notfound; — this will be true after last records have been updated
    commit;
    end loop
    end;

    Might anything crash during execution, than you happily submit this code again and it will continue from wherever it crashed.

    Cheers

  2. Devon Says:

    Peter:

    Thanks, although I’m sort of confused (not being very good with PL/SQL that shouldn’t be a surprise), the snapshot too old error seems to be related to running out of undo space, isn’t it? If I’m committing every 10k records, why would I run into that?

    Also you suggest doing it one statement, but wouldn’t I definitely run out of undo space given the large size of the records?

    Thanks!

    Devon

  3. peter rurenga Says:

    Hi Devon,

    When an update starts, Oracle ‘queries’ what data it will update. This is the starting point and needs to be consistent untill the update has finished. This is often referred to as read consistent view of the data.
    Now, what is the use of that?
    Say, for example, that another user would start inserting data while you’re still busy updating, then this user wouldn’t be happy to find that his new inserted data would be included in the update statement.
    So, Oracle needs to keep track of the original state of the data as it was when the update started.

    Now, how does commiting affect this read consistency?
    While updating, undo information is saved. After a commit, this undo information is no longer protected.
    This doesn’t have to be a problem, becaust Oracle can still access the information in the undo segment to create a read consistent view of the data.
    But, the moment a another transaction reuses this undo space, the undo is overwritten by other undo and there is no longer read consistency: you’ll get the ‘ORA-01555: snapshot too old’ error.

    Be aware that it doesn’t need to be another user to reuse undo: it might be yourself updating the next set.
    Large and many undo segments will make it less likely to hit that error, but are no garantee.
    You can have lots of undo space, but still run into this error, because after a commit a rollback segment will not grow but reuse the freed up space.
    If your next transaction happens to use the same rollback segment, you’re hit the ORA-01555

    To do it in one large update, how big should your rollback segment be?
    That is hard to predict how much you need, I would suggest to just go for it.
    If it fails, no damage is done. You can enlarge your undo tablespace and/or rollback segment and retry it.
    If it doesn’t, fail, you’re home free.

    I tried this update in my own test environment:
    update dps_user set password = lower(password);
    It took about 5 minutes, updated 3.7 million records and used 560 MB (70K undo blocks * 8K (db_block_size)).

    I’m happy enough with that.

    Cheers,

    Peter

  4. Devon Says:

    Peter:

    That is the best comment ever! Thank you very much for taking the time to explain that, I appreciate it.

    Thanks!

    Devon

Leave a Reply