VIDEO: "Codifying" Long Names
Former Access MVP Jack Stockton translates my SQL Server "Codify()" function to work with Microsoft Access backend databases.
Several months ago, I wrote an article about creating shortened codes from long descriptions:
A SQL Server Solution
To make it easier for others to implement, I posted the T-SQL to create a custom function in SQL Server named "Codify". The function followed these rules to convert a long description into a short, 2-5 character code (the length is customizable):
- Use the first letter from each word
- Replace instances of 'and' with '&'
- Don't allow a trailing '&'
- If only one word, return the first 2-5 letters in the word
An MS Access Solution
The big problem with the solution I posted is that it only worked on SQL Server data.
Former longtime Access MVP, Jack Stockton, has rectified that problem by writing an equivalent function that runs in VBA. What's more, Jack presented his solution to a meeting of the Pacific Access User Group meeting, so now there is video of the solution in action.
Point of Clarification: Uniqueness Not Guaranteed
During the presentation, one of the viewers asked a question about whether the function is guaranteed to create unique codes.
The short answer to that question is No.
First off, it's not possible to guarantee unique codes that are created from longer strings due to the pigeonhole principle. To make matters worse, the function has no memory. Like a spin of the roulette wheel, it is not influenced by previous calls.
One could try to account for previously generated codes, but that would be a monumental effort. The costs would far outweigh the benefits.
My preferred approach is to tailor the character code length based on the number of source items. With a half-dozen items, a single-character code may suffice. With several hundred categories, you will want a four- or five-character code to minimize collisions.
Ultimately, you should use the Codify function only as a starting point. I mentioned this explicitly in the original article:
To be clear, this function only provides a starting point for creating lookup table codes. It won't do anything to prevent duplicate abbreviations. Not every abbreviation it creates will be the best fit for the description it goes with. It's main purpose is to jump start the process.
BONUS MATERIAL: DocTests()
Jack also covered one of my favorite concepts: Documentation Tests. He covers the concept at the 12:21 mark of his presentation.
I love the doc test concept for three reasons:
- They let you learn by example.
- The documentation doesn't go stale.
- Writing tests is simple and easy.