twinBASIC Update: July 30, 2023

Highlights include experimental code optimization via LLVM (boosting performance and reducing executable size) and a new Discord LLVM channel.

On April 23, 2021, I helped Wayne Phillips introduce the world to twinBASIC at the Access DevCon Vienna conference.  I boldly predicted that twinBASIC (along with the Monaco editor) would replace VBA and its outdated development environment by 2025.  With that goal in mind, this weekly update is my attempt to keep the project fresh in the minds of the VBA development community.

Every Sunday, I will be providing updates on the status of the project, linking to new articles discussing twinBASIC, and generally trying to increase engagement with the project.  If you come across items that should be included here, tweet me @NoLongerSet or email me at mike at nolongerset dot com.

Here are some links to get involved with the project:


Highlights

Debut of LLVM Optimization

On July 22, 2023, Wayne posted the following in the Discord chat:

Updated the LLVM build to use LLVM 16 (from the ancient LLVM 3.4 I was previously using for testing).  A lot of the API has changed since v3.4, so it took some time to get it back up and running.   Same performance numbers using LLVM 16 for this test, but the code size produced now is reduced further to 825 bytes (from 1225 bytes when using LLVM 3.4).   That's a win

I can now test using SSE/AVX etc, but there was no meaningful change to the performance numbers for this particular test when all CPU features enabled.

Just FYI, the X86 build of the LLVM DLL for tB currently stands at 23MB.  This will likely be provided via a runtime download feature so that we don't have to provide these DLLs in the github releases, and the DLL can be cached locally so we don't need to download it for each new release.

Those of you with Ultimate edition licences will also be able to view the generated assembly via hovering over the procedure name as seen here

After Wayne posted the above screenshot, sokinkeso asked a perceptive follow-up question: "Is this also a 'promise' for future inline assembly for tB?"

To the delight of the hard-core performance crowd, Wayne confirmed it was very possible, though he stopped short of guaranteeing its future inclusion:

yes, LLVM supports that, so it can be offered.

And for those of you not particularly excited about LLVM integration, fear not.  Wayne is back to his regularly scheduled programming tomorrow (i.e., 2023-07-31):

[Wayne]: Normal (non-llvm) service to resume tomorrow.  The slight break to get the initial LLVM integration done will pay dividends in the end.

Discord Chat Summary

* Auto-generated via Claude-2-100k on poe.com

Here are some key highlights from the Discord conversation about the development of the twinBASIC programming language:

  • Wayne implemented an initial version of LLVM integration, allowing twinBASIC code to be compiled to optimized machine code. This resulted in big performance gains and smaller code size compared to the interpreter.
  • There was discussion around adding inline assembly support to twinBASIC, which would allow inserting assembly code directly in twinBASIC functions. This could be useful for things like optimization and low-level access in rare cases.
  • Wayne announced the LLVM integration will soon be available in a public test release, initially supporting basic data types like integers and floats. Strings, error handling and other features will come later.
  • There was talk about potential uses for twinBASIC like in schools/universities and for advanced programming, given its combination of an easy to use language with low level access features. An educational license was suggested.
  • People shared excitement about the LLVM integration and progress overall. The integration was called a big win that will pay dividends down the line.
  • Aside from LLVM work, Wayne mentioned he will shift focus to bug fixes for a while before getting back to new features.

Overall, the conversation highlighted ongoing progress in developing twinBASIC into a modern, multi-platform language with both high-level ease of use and low-level power.

[Follow-up prompt]: What kind of low-level access features does twinBASIC offer?

Based on the conversation, some of the key low-level access features that twinBASIC offers or plans to offer are:

  • Inline assembly support - This will allow directly inserting assembly code into twinBASIC functions, which could be useful for things like optimization, controlling hardware directly, and other cases needing low-level access.
  • LLVM integration - Compiling to optimized machine code via LLVM gives much better performance than a pure interpreter, and allows leveraging CPU features like SSE/AVX.
  • Manipulating memory directly via pointers - Features like PEEK and POKE allow directly accessing hardware memory from twinBASIC code. [Slight hallucination; see Editor's note below]
  • Calling native APIs directly - twinBASIC allows declaring and calling Windows and other native APIs, without intermediate layers.
  • User-defined types (UDTs) - For declaring structures and interacting with low-level APIs.
  • Support for data types like longlong and decimal - For precision and access to low-level data formats.
  • Multithreading support - For performance and low-level control.
  • Lack of intermediate bytecode - Compiles directly to machine code for performance.

So in summary, twinBASIC is designed to offer a combination of high-level features with low-level power through direct access to things like assembly, memory, and hardware. This distinguishes it from other basic variants.


Editor's note: The reference to PEEK and POKE in the AI-generated summary above appears to be incorrect.  In the Discord chat, fafalone mentioned that BASIC on the Commodore 64 had PEEK and POKE, as did Altair BASIC, an even earlier implementation of BASIC.

fafalone used those examples to make the following point, "BASIC was never about keeping the programmer entirely isolated from the lower level features closer to the hardware. Just a historical perspective for thoughts that some of the low level features now or in the future with tB 'aren't BASIC-y.'"

Around the Web

New LLVM Discord Channel

In an effort to focus discussions moving forward, Wayne created an LLVM channel within the twinBASIC Discord Server.

Since I know many readers do not actively participate in the twinBASIC Discord Server, I'm going to reproduce Wayne's first several posts in the new channel that go into detail about the new LLVM features.

The text below (between the two horizontal rules) is all Wayne's.  To preserve Wayne's formatting and maximize readability, I'm not using my typical quote formatting.


As of BETA 363, professional/ultimate edition users can now use the experimental LLVM backend compiler for optimizing code.  Currently, support is limited to just the basic datatypes of Byte, Integer, Long, LongLong, Single and Double.  We are currently only shipping the 32-bit version of the LLVM compiler, so Win64 compilation is not yet possible. To get started, there is a new [ CompilerOptions("") ] attribute that you can apply to procedures in a standard module.  The passed string takes a series of flags separated by commas and prefixed with either +/-.   To enable LLVM optimized compilation of a procedure, use the following syntax:

    [ CompilerOptions ("+llvm,+optimize,+optimizesize") ]
    [ ArrayBoundsChecks (False) ]
    [ IntegerOverflowChecks (False) ]
    Public Function Test() As Long
        Return 123
    End Function

Please note: in future you'll be able to control these flags at the project level, and will be able to define them independently depending on whether you're building for release or for within the IDE.  There are lots of features not yet supported in our LLVM compiler backend.  Please monitor the DEBUG CONSOLE for error messages when you hit upon something that is not yet implemented.  You'll either get a nice message in the debug console, or a hard crash

Note that in the above example, we've turned off the array-bounds checks and integer-overflow checks, since error handling is not yet supported by our LLVM backend.

What's definitely not yet supported:  anything to do with classes, interfaces, variants, strings, or anything similar.
What is supported:  the basic integral and floating point types as noted above, arrays, global variables, UDTs, standard module function calls, and all control-flow statements

The CompilerOptions attribute allows for passing other flags to the LLVM backend.  In particular it allows us to refine the target CPU options.  These are the current available flags that control target CPU instruction sets: +aes,+avx,+avx2,+bmi2,+fma,+fxsr,+lzcnt,+popcnt,+rdseed,+sha,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+xsave,+xsavec,+xsaveopt,+xsaves
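
Editor's note: The following is an illustrative sketch and is not taken from Wayne's posts.  It assumes that the instruction-set flags can share a single CompilerOptions string with +llvm and +optimize (consistent with the comma-separated format described above) and that module-level fixed-size arrays of Double are handled by the BETA 363 backend.  The names VecA, VecB, and DotProduct are hypothetical.

    ' Hypothetical module-level arrays (arrays and global variables are listed as supported)
    Private VecA(0 To 1023) As Double
    Private VecB(0 To 1023) As Double

    ' Assumption: CPU feature flags are combined with +llvm/+optimize in one string
    [ CompilerOptions ("+llvm,+optimize,+avx2,+fma") ]
    [ ArrayBoundsChecks (False) ]
    [ IntegerOverflowChecks (False) ]
    Public Function DotProduct() As Double
        Dim Total As Double
        Dim i As Long
        ' Fixed literal bounds keep the loop within the basic supported feature set
        For i = 0 To 1023
            Total = Total + VecA(i) * VecB(i)
        Next
        Return Total
    End Function

Whether the backend actually emits AVX2 or FMA instructions for a loop like this is not guaranteed.  As Wayne noted above, enabling all CPU features made no meaningful difference in his own test.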


Changelog

Here are the updates from the past week.  You can also find this information by visiting the GitHub twinBASIC Releases page.

Releases · WaynePhillipsEA/twinbasic

WARNING: The following issue is present in BETA builds 202 - 365 (the latest build as of publication):

  • there are known memory leaks in these versions, so memory usage will be higher than normal

BETA 363

  • improved: [HiddenModule].Int3Breakpoint now injects the breakpoint directly into the generated code, rather than via a call
  • added: very early experimental implementation of LLVM codegen for Professional/Ultimate licence holders [ see discord for info ]

BETA 364

  • improved: LLVM backend compiler now supports the Currency data type

BETA 365

  • improved: LLVM backend compiler now supports the Boolean data type
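
Editor's note: As a hedged illustration of the BETA 364 and BETA 365 additions, a procedure like the one below would presumably be eligible for the LLVM backend as of BETA 365.  This is an illustrative sketch, not code from Wayne's posts or the release notes, and it has not been verified which Currency operations are covered.  The name NetPrice and its parameters are hypothetical.

    ' Hypothetical example; assumes Currency subtraction and Boolean tests are supported
    [ CompilerOptions ("+llvm,+optimize") ]
    [ ArrayBoundsChecks (False) ]
    [ IntegerOverflowChecks (False) ]
    Public Function NetPrice(ByVal Price As Currency, ByVal Discount As Currency, ByVal IsMember As Boolean) As Currency
        ' Members get the discount; everyone else pays the full price
        If IsMember Then
            Return Price - Discount
        End If
        Return Price
    End Function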

All original code samples by Mike Wolfe are licensed under CC BY 4.0