Implement compiled mode for Perlang
In https://github.com/perlang-org/perlang/discussions/396, I described the recent events leading up to me trying out what LLVM can do for us, in terms of making it possible to run Perlang programs completely independent of the .NET platform.
After that comment was written, and some discussions I had with an old friend of mine (@diwic - thanks a lot to you!
I'm setting the milestone for this to 0.4.0, but naturally, given the sheer size of this task, the compiler will in no way be complete in 0.4.0. But it'll probably work to the point where I feel comfortable about pushing it out to the public.
Rough steps
- Implement a compiler which translates the syntax tree for all/most valid Perlang programs into valid C++ code, and compiles and runs the result: https://github.com/perlang-org/perlang/pull/409
- Implement a C/C++-based stdlib to support the above: !407 (merged).
  - Make it possible to write unit and/or integration tests for this. We probably have to write these in C or C++ for now. cmocka is a useful unit test library for C that I have used elsewhere.
- Add support for BigInt: #415 (closed)
- Distribute the (compiled) `stdlib` along with snapshot builds. This involves a bit of complexity, since native C++ code has historically only been able to compile on the same platform as the CI job is running on. We'll need to investigate if `clang` makes this easier for us.
  - In line with the next point, I think it's fine if we are Linux and `amd64`-only at this point (in compiled mode). In other words, we'll provide a Linux `amd64` binary of the `stdlib` for now and emit an error message on other platforms stating that experimental compilation is not yet supported.
  - We will keep things simple in the 0.4.0 milestone and only support compiled mode on Linux. This makes the above easier. Going forward, we'll need to start building releases separately on each platform (i.e. build macOS on a macOS CI runner, build Linux binaries on Linux and so forth). I'll create a separate issue for this at some point and add a link to it here.
  - Implemented as of !445 (merged), with the above limitation (Linux-only).
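The Linux/`amd64`-only rule above could be expressed roughly as follows. This is only a sketch: the function name `compiled_mode_error`, the `"linux"`/`"amd64"` spellings and the exact message text are assumptions made for illustration, and the real check lives wherever the compiler driver decides whether to compile.

```cpp
#include <string>

// Hypothetical helper (not part of Perlang): returns an empty string when
// compiled mode is supported for the given target, or an error message
// otherwise. The "linux"/"amd64" spellings are assumptions for illustration.
std::string compiled_mode_error(const std::string& os, const std::string& arch)
{
    if (os == "linux" && arch == "amd64")
        return "";

    return "error: experimental compilation is not yet supported on " + os + "/" + arch;
}
```

Keeping the check in one place should make it easier to relax later, once releases are built separately per platform.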
- Make sure `PerlangCompiler` uses the `stdlib` artifacts (`.so`/`.a` files and `.h`/`.hpp` header files), when being executed from a snapshot build.
  - The only thing that will prevent this from happening is if `$PERLANG_ROOT` is set. `$PERLANG_ROOT` is still used when running Perlang from source, so let's leave this as-is for now.
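The lookup order described above (prefer `$PERLANG_ROOT` when set, otherwise use the artifacts bundled with the snapshot build) might look something like this. The function name `resolve_stdlib_dir`, the `bundled_dir` parameter and the `/lib/stdlib` suffix are all hypothetical and do not reflect Perlang's actual directory layout.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical sketch: choose the directory holding the stdlib artifacts.
// `bundled_dir` stands for wherever a snapshot build ships its .so/.a and
// header files; the "/lib/stdlib" suffix is an assumption, not Perlang's layout.
std::string resolve_stdlib_dir(const char* perlang_root, const std::string& bundled_dir)
{
    // A set, non-empty PERLANG_ROOT wins (running Perlang from source).
    if (perlang_root != nullptr && perlang_root[0] != '\0')
        return std::string(perlang_root) + "/lib/stdlib";

    // Otherwise fall back to the artifacts shipped with the snapshot build.
    return bundled_dir;
}

// Convenience overload that reads the real environment variable.
std::string resolve_stdlib_dir(const std::string& bundled_dir)
{
    return resolve_stdlib_dir(std::getenv("PERLANG_ROOT"), bundled_dir);
}
```

Taking the environment value as a parameter keeps the decision testable without touching the process environment.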
- Once this is stable enough, consider dropping interpreted mode (to avoid having to always make "two implementations" for all new functionality going into the library). Challenge: this will make it hard/impossible to support the REPL though, so ideally we would keep this until we can reimplement the REPL on top of LLVM instead.
  - I am currently (2023-11-03) leaning towards dropping (parts of) the REPL soon, perhaps in the 0.5.0 or 0.6.0 release. This will make things simpler and free us from having to keep it working all the time, since it won't be working in compiled mode anyway (for quite a long time, realistically speaking). Once the Perlang compiler is mature enough to be able to interface with LLVM to generate machine code for an arbitrary Perlang expression tree, we can reimplement the REPL on top of this.

    Suggested approach: make some "glue tooling" for interfacing between Perlang and C++ (and perhaps between Perlang and C# in the intermediate stage), so that we can expose the Perlang AST types to a little C++ helper library. The helper library will then consume the LLVM headers and emit machine code for the Perlang AST.
    - REPL and `-e` option dropped in !446 (merged) and !447 (merged).
- Figure out how to answer hard questions, like how to cast an `ASCIIString` to `String` (https://github.com/perlang-org/perlang/pull/451/files#r1548516040)
  - Fixed (or worked around) by !453 (merged), which should be "good enough" for now. As the compiler matures (and we can eventually move away from relying too much on C++), we can rework this to use more stack-based `ASCIIString` instances where possible, to reduce the number of heap allocations.
- Implement some of the obvious missing string-related operations
  - Concatenation between `AsciiString` and `int`: !472 (merged), !473 (merged)
  - Concatenation between `AsciiString` and `AsciiString`: !470 (merged)
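To illustrate what the two concatenation operations above amount to on the C++ side, here is a minimal sketch. `AsciiStringSketch` is a hypothetical stand-in backed by `std::string`; the real stdlib class has its own layout and API, which this does not attempt to reproduce.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for the stdlib's ASCII string type; the real class
// in the Perlang stdlib has a different layout and API.
class AsciiStringSketch
{
public:
    explicit AsciiStringSketch(std::string bytes) : bytes_(std::move(bytes)) {}

    const std::string& bytes() const { return bytes_; }

    // String + string: produces a new value holding both parts.
    AsciiStringSketch operator+(const AsciiStringSketch& rhs) const
    {
        return AsciiStringSketch(bytes_ + rhs.bytes_);
    }

    // String + int: the integer is formatted in base 10 and appended.
    AsciiStringSketch operator+(int rhs) const
    {
        return AsciiStringSketch(bytes_ + std::to_string(rhs));
    }

private:
    std::string bytes_;
};
```

Each operator returns a new value rather than mutating the left-hand side, which is one possible design and not necessarily the one used in the merged changes.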
- Make it possible to call methods from Perlang code
  - This is a limitation we currently have. You cannot call `length()` on an array, for example, which is quite an important limitation that we need to address fairly soon.
- Implement some mechanism for multi-file projects (like a "build system" of some form, like MSBuild or cargo)
  - TODO: Definitely deserves an issue of its own. A quick-and-dirty approach could be to support `perlang .` or `perlang <some-directory>`, i.e. compile all files in a given directory; this seems to be similar to how https://vlang.io/ does it. The easy way here would be to just emit a single C++ file; if we do it like this, I think we can postpone the "build system" question for (perhaps much) later.
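One small piece of the "compile all files in a given directory" idea is deciding which directory entries count as sources and in what order they are fed into the single emitted C++ file. The sketch below covers only that step, on a list of file names supplied by the caller; the `.per` extension check and the alphabetical ordering are assumptions, and both function names are hypothetical.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Assumption: Perlang sources use the ".per" extension. A bare ".per" with
// no base name is not treated as a source file.
bool is_perlang_source(const std::string& name)
{
    const std::string ext = ".per";
    return name.size() > ext.size() &&
           name.compare(name.size() - ext.size(), ext.size(), ext) == 0;
}

// Keeps only Perlang sources and sorts them, so that the emitted C++ file
// does not depend on the order in which the directory was listed.
std::vector<std::string> select_sources(std::vector<std::string> names)
{
    names.erase(std::remove_if(names.begin(), names.end(),
                               [](const std::string& n) { return !is_perlang_source(n); }),
                names.end());
    std::sort(names.begin(), names.end());
    return names;
}
```

A deterministic order is mostly a convenience for reproducible output; whether declaration order across files matters is a separate question that the "build system" issue would need to settle.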
- Implement a way to call Perlang code from C#, by compiling the Perlang code to one or more `.so` (subsequently `.dll` on Windows) files.
  - Has been started: !506 (merged), which builds on top of the functionality delivered in !462 (merged).
- Implement a way to do "reverse P/Invoke", i.e. expose Perlang code as native functions for calling them from managed C# code.
  - This approach should work, i.e. relying on callbacks which can be converted into function pointers on the Perlang/C++ side: https://stackoverflow.com/questions/7970128/passing-a-c-sharp-callback-function-through-interop-pinvoke
  - Covered by the above: !506 (merged) and !462 (merged).
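The callback-based approach mentioned above boils down to the native side accepting a plain function pointer, which managed code can supply by marshalling a delegate. A minimal sketch of the native half follows; `int_callback` and `apply_and_sum` are hypothetical names chosen for illustration and are not taken from the merged changes.

```cpp
#include <cstdint>

// Callback type: on the C# side this would correspond to a delegate that is
// marshalled to a native function pointer. Hypothetical name.
typedef int32_t (*int_callback)(int32_t);

// Hypothetical exported function with C linkage, so that it can be located by
// name in the resulting shared library. It calls back into the supplied
// function for each value and sums the results.
extern "C" int32_t apply_and_sum(const int32_t* values, int32_t count, int_callback callback)
{
    int32_t sum = 0;

    for (int32_t i = 0; i < count; ++i)
        sum += callback(values[i]);

    return sum;
}
```

Sticking to fixed-width integer types and C linkage keeps the signature straightforward to declare on the managed side.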
- Once the compiler is in place and we have the required mechanics for creating native libraries with Perlang, start planning on gradually rewriting the Perlang compiler in Perlang. The "easiest" way is probably to start rewriting some isolated part of it, and call into the Perlang (native) code from C#.
  - The bootstrapping can be done using a "stable" version of the "compile-via-C++" compiler.
  - Once we have that bootstrapped, we can then subsequently move to depend on the first "stable" version which can compile to native code without any dependency on C++; our only dependency will be on the LLVM libraries at this point. (Challenge: consuming LLVM from non-C++ languages can be impractical. We might need to write some C++-based glue code in the Perlang compiler to make this happen, as described in one of the previous points.)
  - Should also have an issue of its own: #454.