Exporting Queries for Version Control

SaveAsText will export queries to text files, but calling those files "human-readable" is a stretch.

Exporting Queries for Version Control

I've written several posts expanding on this StackOverflow answer to the question, "How do you use version control with Access development?"  Today I want to talk about exporting queries.

The original decompose.vbs script exports the QueryDef objects using the undocumented SaveAsText procedure.  While this creates a text file, calling it "human-readable" is probably a stretch.  The main purpose of the exported text file is to be able to re-create the QueryDef object upon running the LoadFromText procedure.  That's a critical role.  I'm not going to suggest we should replace this portion of the decompose.vbs script.  Rather, we should expand on it.

The SaveAsText output

Here are two sample outputs from the SaveAsText procedure:

HistoryForm.qry:

Operation =1
Option =0
Begin InputTables
    Name ="CMRLoan"
    Name ="HistorySubQuery"
End
Begin OutputColumns
    Expression ="CMRLoan.LoanType"
    Expression ="CMRLoan.Inv_Code"
    Expression ="CMRLoan.Account"
    Expression ="CMRLoan.CUSIP"
    Expression ="HistorySubQuery.Date"
    Expression ="HistorySubQuery.Price"
    Expression ="HistorySubQuery.Fraction"
    Expression ="CMRLoan.Active"
    Expression ="CMRLoan.LoanID"
    Alias ="HasHistory"
    Expression ="Not IsNull(HistorySubquery.LoanID)"
End
Begin Joins
    LeftTable ="CMRLoan"
    RightTable ="HistorySubQuery"
    Expression ="CMRLoan.LoanID=HistorySubQuery.LoanID"
    Flag =2
End
Begin OrderBy
    Expression ="CMRLoan.Account"
    Flag =0
End
dbBoolean "ReturnsRecords" ="-1"
dbInteger "ODBCTimeout" ="60"
dbByte "RecordsetType" ="0"
dbBoolean "OrderByOn" ="0"
dbByte "Orientation" ="0"
dbByte "DefaultView" ="2"
Begin
    Begin
        dbText "Name" ="HasHistory"
    End
End
Begin
    State =0
    Left =73
    Top =105
    Right =999
    Bottom =578
    Left =-1
    Top =-1
    Right =915
    Bottom =244
    Left =0
    Top =0
    ColumnsShown =539
    Begin
        Left =172
        Top =6
        Right =270
        Bottom =180
        Top =0
        Name ="CMRLoan"
        Name =""
    End
    Begin
        Left =308
        Top =6
        Right =507
        Bottom =120
        Top =0
        Name ="HistorySubQuery"
        Name =""
    End
End

PostingsReport.qry:

dbMemo "SQL" ="SELECT Receipts.NP_Date, Receipts.Accr_Date, CMRLoan.Account, Receipts.Type, Rec"
    "eipts.Principal, Receipts.Interest, Receipts.Service, CMRLoan.Category, [Balance"
    "]-[Principal] AS RemBalance, Particip.Name AS PartName, LoanType.Desc AS LnTypeD"
    "esc, [Batch_Date] Is Not Null AS Posted, Receipts.IsPayoff, CMRLoan.LoanID, Rece"
    "ipts.Apply_Date, Receipts.ReceiptID, CMRLoan.Balance\015\012FROM ((Receipts INNE"
    "R JOIN CMRLoan ON Receipts.LoanID = CMRLoan.LoanID) INNER JOIN LoanType ON CMRLo"
    "an.LoanType = LoanType.ID) INNER JOIN Particip ON CMRLoan.Inv_Code = Particip.In"
    "v_Code\015\012WHERE (((Receipts.Apply_Date)=[Forms]![InputReceipts]![DatePicker]"
    ") AND ((CMRLoan.LoanType)=[Forms]![InputReceipts]![LoanType]) AND ((CMRLoan.Inv_"
    "Code)=IIf([Forms]![InputReceipts]![Inv_Code]=\" - ALL -\",[CMRLoan].[Inv_Code],["
    "Forms]![InputReceipts]![Inv_Code])));\015\012"
dbMemo "Connect" =""
dbBoolean "ReturnsRecords" ="-1"
dbInteger "ODBCTimeout" ="60"
dbByte "RecordsetType" ="0"
dbText "Description" ="Used exclusively to populate the Pre-Posting Report"
dbBoolean "OrderByOn" ="0"
dbByte "Orientation" ="0"
dbByte "DefaultView" ="2"
Begin
    Begin
        dbText "Name" ="PartName"
        dbInteger "ColumnWidth" ="3180"
        dbBoolean "ColumnHidden" ="0"
    End
    Begin
        dbText "Name" ="RemBalance"
    End
    Begin
        dbText "Name" ="Posted"
    End
End

I want to point out a few things about the above files.

Window Layout Information

Many of the lines in the files are dedicated to where the query object should appear within the Access window.  This is just noise.  It conveys no information about the functionality of the query.

Inconsistent File Format

Access has a drag-and-drop interface for building queries, but not all joins support this feature.  The first example above does not export any SQL.  Instead it lists the tables, joins, columns, sorts (ORDER BY), and filters (WHERE clauses) that comprise the query.  The second example does not support drag-and-drop, so the entire SQL is exported as a "dbMemo" field.  

Consider what would happen if we changed the equal sign (=) in the join clause of the first query to a greater-than sign (>).  Access would no longer be able to represent the join in the drag-and-drop interface.  Thus, the exported query would switch to using the "dbMemo" approach.  In other words, changing a single character in the SQL string would appear to version control to be an entirely different query.

Hard line breaks

The SaveAsText formats all enforce a strict maximum line length in their exported files.  To comply with this constraint, any strings that are too long to fit in a single line are unceremoniously chopped into pieces, without regard for natural word breaks or the English rules of hyphenation.

For example, in the second example, "Receipts.Principal" is split into "Rec" and "eipts.Principal".  As I've discussed in the past, one of the advantages to exporting Access objects to text files for version control is the enhanced searchability.  I often use TortoiseHg's regex search to find all instances of a table field throughout my entire application.  Of course, if I searched for "Receipts" in the PostingsReport.qry file I would not get a result back for the line where "Receipts" is split in the middle.

Exporting raw SQL

The obvious way to improve on the above problems is to export the raw SQL for each query.  The QueryDef object has a .SQL property.  We could simply save that property to a text file.  Here's what that looks like:

PostingReport.sql:

SELECT Receipts.NP_Date, Receipts.Accr_Date, CMRLoan.Account, Receipts.Type, Receipts.Principal, Receipts.Interest, Receipts.Service, CMRLoan.Category, [Balance]-[Principal] AS RemBalance, Particip.Name AS PartName, LoanType.Desc AS LnTypeDesc, [Batch_Date] Is Not Null AS Posted, Receipts.IsPayoff, CMRLoan.LoanID, Receipts.Apply_Date, Receipts.ReceiptID, CMRLoan.Balance
FROM ((Receipts INNER JOIN CMRLoan ON Receipts.LoanID = CMRLoan.LoanID) INNER JOIN LoanType ON CMRLoan.LoanType = LoanType.ID) INNER JOIN Particip ON CMRLoan.Inv_Code = Particip.Inv_Code
WHERE (((Receipts.Apply_Date)=[Forms]![InputReceipts]![DatePicker]) AND ((CMRLoan.LoanType)=[Forms]![InputReceipts]![LoanType]) AND ((CMRLoan.Inv_Code)=IIf([Forms]![InputReceipts]![Inv_Code]=" - ALL -",[CMRLoan].[Inv_Code],[Forms]![InputReceipts]![Inv_Code])));

HistoryForm.sql:

SELECT CMRLoan.LoanType, CMRLoan.Inv_Code, CMRLoan.Account, CMRLoan.CUSIP, HistorySubQuery.Date, HistorySubQuery.Price, HistorySubQuery.Fraction, CMRLoan.Active, CMRLoan.LoanID, Not IsNull(HistorySubquery.LoanID) AS HasHistory
FROM CMRLoan LEFT JOIN HistorySubQuery ON CMRLoan.LoanID=HistorySubQuery.LoanID
ORDER BY CMRLoan.Account;

This is definitely an enhancement over only exporting the query object via SaveAsText.  But there is still a problem with this approach.

Version control systems use the patch file format.  This format shows changes as additions and deletions of entire lines.  The problem with the raw SQL above should be obvious now.  Any change to a field in the SELECT portion of the query will trigger an add and delete of the line that contains every field.  So what can we do about that?

Adding line breaks

I addressed this issue by re-formatting the SQL string to add line breaks to my exports.  Here are the improved versions:

PostingReport.sql (with line breaks added)

SELECT Receipts.NP_Date,
 Receipts.Accr_Date,
 CMRLoan.Account,
 Receipts.Type,
 Receipts.Principal,
 Receipts.Interest,
 Receipts.Service,
 CMRLoan.Category,
 [Balance]-[Principal] AS RemBalance,
 Particip.Name AS PartName,
 LoanType.Desc AS LnTypeDesc,
 [Batch_Date] Is Not Null AS Posted,
 Receipts.IsPayoff,
 CMRLoan.LoanID,
 Receipts.Apply_Date,
 Receipts.ReceiptID,
 CMRLoan.Balance
FROM ((Receipts
 INNER JOIN CMRLoan
 ON Receipts.LoanID = CMRLoan.LoanID)
 INNER JOIN LoanType
 ON CMRLoan.LoanType = LoanType.ID)
 INNER JOIN Particip
 ON CMRLoan.Inv_Code = Particip.Inv_Code
WHERE (((Receipts.Apply_Date)=[Forms]![InputReceipts]![DatePicker])
  AND ((CMRLoan.LoanType)=[Forms]![InputReceipts]![LoanType])
  AND ((CMRLoan.Inv_Code)=IIf([Forms]![InputReceipts]![Inv_Code]=" - ALL -",[CMRLoan].[Inv_Code],[Forms]![InputReceipts]![Inv_Code])));

HistoryForm.sql (with line breaks added)

SELECT CMRLoan.LoanType,
 CMRLoan.Inv_Code,
 CMRLoan.Account,
 CMRLoan.CUSIP,
 HistorySubQuery.Date,
 HistorySubQuery.Price,
 HistorySubQuery.Fraction,
 CMRLoan.Active,
 CMRLoan.LoanID,
 Not IsNull(HistorySubquery.LoanID) AS HasHistory
FROM CMRLoan
 LEFT JOIN HistorySubQuery
 ON CMRLoan.LoanID=HistorySubQuery.LoanID
ORDER BY CMRLoan.Account;

Now if we add, remove, or change one of the fields, sorts, filtering clauses, or join clauses, then that single change will appear by itself in the file diff.

Sample Code

'Txt: the .SQL property of an Access QueryDef object
Public Function AddLineBreaks(Txt) 'As String
    Dim s
    s = Txt
    s = Replace(s, " INNER JOIN ",  vbCrLf & " INNER JOIN ")
    s = Replace(s, " LEFT JOIN ",  vbCrLf & " LEFT JOIN ")
    s = Replace(s, " ON ", vbCrLf & " ON ")
    s = Replace(s, " AND ", vbCrLf & "  AND ")
    s = Replace(s, " OR ", vbCrLf & " OR ")
    s = Replace(s, ", ", "," & vbCrLf & " ")
    AddLineBreaks = s
End Function

Image by Gerd Altmann from Pixabay

All original code samples by Mike Wolfe are licensed under CC BY 4.0