Old Fields Left in Place
This is one in a series of articles on technical debt.
In a previous article, I wrote about the dangers of leaving old code (and forms and reports) in place after they have been replaced with new code or Access objects. Leaving old fields in place after a database schema change is worse for a few reasons.
- Keeping the old field may result in denormalization of the table.
- Unlike old routines, most people don't change the names of old fields.
- It's harder to identify references to table fields than to code routines.
- Field comments are less likely to be written or read than code comments.
- Schema changes are less likely to be kept under version control.
Safe schema changes
Database schemas evolve over time. Adding a field or table to capture new requirements is a straightforward change. These are the most common changes one would make to a production database. They're also the safest, because they normally do not require any changes to existing functionality.
Changes That Affect Existing Functionality
What I'm talking about in this article are changes that do affect existing functionality. As an example, imagine a table with a Yes/No field called IsActive. After several years, new requirements dictate that you need to know when the record became inactive. To support the new requirement, an InactivatedOn datetime field is added to the table. Under the new schema, a record is considered active if and only if the InactivatedOn field is null.
Old fields can lead to denormalization
In the example above, the IsActive field has a direct dependency on the new InactivatedOn field. This functional dependency violates second normal form. One big problem with functional dependencies within a record is the confusion they cause when the data is inconsistent with the dependency.
Imagine a record where the InactivatedOn field is null (which indicates the record is active), but the IsActive field is set to False. Is the record active or not? A report that relies on the InactivatedOn field will display the record as active, while a report based on the IsActive field will display the record as inactive. This is even more likely to happen if--as is likely--the IsActive field's name has not been changed.
Old fields tend to keep their original names
When one leaves old code, forms, or reports in place, one usually renames the old version. This makes it nearly impossible to refer to the old version accidentally. However, altering existing field names in production databases is a real pain. Thus, many developers choose to leave the old fields in place--original names and all--and simply ignore them.
That always seems like a reasonable plan at the time. But then, several months or years later when you return to that code, you will see the old field and wonder, "Hey, why is that there?" Even if you think it's no longer needed, you will still have to spend several hours combing through your application to double-check.
It's far better to find all the places where the field is in use when you make the change in the first place. Then, once you've updated all the references, remove the field completely as part of your schema change so that you won't have to go through the same labor-intensive process again in the future.
Field references are harder to identify than code references
Finding references to table fields without third-party tools like FMS's Total Access Analyzer or Philipp Stiefel's Access Dev Tools - Find and Replace is very difficult. It's not as simple as pressing Ctl+F in the VBIDE and entering the name of the field.
Try as you might to find every query, report, form, and control that references the field, there's still a good chance you will overlook one or more references. If you do miss a reference, you could have an object referring to old data that is no longer being maintained.
Removing the old field won't make it any more likely that you will find these overlooked references. However, it does mean that anything that relies on the old field will break quite visibly if you forget to update it. While users may be annoyed seeing error messages for missing fields, this is better than users relying on a "working" report that's actually based on incorrect data.
Field comments are more cumbersome than code comments
It's easy to add comments to code. That's no guarantee that we as developers will do it, of course, but it does make it more likely. Adding comments to table fields is less convenient. For linked tables, the comments should be added on the back-end database, otherwise they will be lost when the tables are relinked. This makes it more difficult to add comments in the first place.
Also, when working with tables and fields, one is less likely to see the comment when it really matters. For example, when adding fields to a query using the query by example (QBE) tool, only the field name is shown in the interface. So, if we were to add the following comment to our IsActive field, it won't be seen in the places where it would be most useful:
IsActive: DEPRECATED! use InactivatedOn field instead
Schema changes are less likely to be under version control
If you are relying on the most popular StackOverflow answer for version control in Access, then you are not capturing any changes that you make to the underlying database schema. My expanded version of the StackOverflow script does export field definitions, but I would imagine many/most Access version control solutions do not do this.
If those changes do not make it into version control, then they won't appear as the source of potential bugs when you are reviewing your application's development history.
Moral of the Story
Removing deprecated fields from a production database can be a royal pain. It can add significant time in the short-term when making schema changes, but that investment dwarfs the long-term pain you will feel if you put it off. By removing an old field, you guarantee that it can't be accidentally referenced moving forward.
Failing loudly is always better than misleading silently.