Python VM written in Swift
Violet
Violet is one of those Swift <-> Python interop thingies, except that this time we implement the whole language from scratch. Name comes from Violet Evergarden.
Many unwatched k-drama hours were put into this, so any ⭐ would be appreciated.
If something is not working, you have an interesting idea or maybe just a question, then you can open an issue or start a discussion. You can also contact us on Twitter @itBrokeAgain (optimism, yay!).
Requirements
- 64 bit – for BigInt and (probably, maybe, I think) hash
- Platform:
  - macOS – tested on 10.15.6 (Catalina) + Xcode 12.0 (Swift 5.3)
  - Ubuntu – tested on 21.04 + Swift 5.4.2
  - Docker – tested on swift:latest (5.4.2) on Ubuntu 21.04
Features
We aim for compatibility with the Python 3.7 feature set.
We are only interested in the language itself without additional modules. This means that importing anything except for most basic modules (sys, builtins and a few others) is not supported (although you can import other Python files).
See Documentation directory for a list of known unimplemented features. There is no list of unknown unimplemented features though…
Future plans
Our goal so far has been to ramp up the Python functionality coverage, which mostly meant passing as many Python tests (PyTests) as possible. This gives us a safety net against future regressions.
Next we will try to improve the code base by revisiting the shortcuts we took:
- New object model (representation of a single Python object in memory) – currently we are using Swift objects to represent Python instances, for example a Swift PyInt object represents a Python int instance. There are better ways to do this, but this is a somewhat longer conversation in Swift. For details see this issue.
- New method representation – currently we just wrap a Swift method in a PyBuiltinFunction and put it inside the type __dict__. For example: int.add (implemented in Swift as PyInt.add(_:) with the following signature: (PyInt) -> (PyObject) -> PyResult<PyObject>) is put inside int.__dict__. This can be simplified a bit, but it depends on the object model, so it has to wait.
- Garbage collection and memory management – as we said: we currently use Swift class instances to represent Python objects, which means that we are forced to use Swift ARC to manage object lifetime. Unfortunately this does not solve reference cycles (which we have, for example: the object type has type as its type and type is a subclass of object, not to mention that type has type as its own type), but for now we will ignore this… (how convenient!).
- V8-style isolates – currently the Python context is represented as a global static Py (something like: Py.newInt(2) or Py.add(lhs, rhs)). This prevents us from having multiple VM instances running on the same thread (without using thread local storage), which in turn makes unit testing difficult.
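The method representation described above works because an unapplied Swift instance method is a curried function. Below is a minimal sketch of that idea. All of the types here are simplified, hypothetical stand-ins (the real PyObject, PyInt, PyResult and PyBuiltinFunction are more involved), and the "__add__" dictionary key follows the standard Python name rather than anything confirmed about Violet internals.

```swift
// Simplified, hypothetical stand-ins for Violet's types.
enum PyResult<Wrapped> {
  case value(Wrapped)
  case error(String) // Assumption: the real type carries a PyBaseException.
}

class PyObject {}

final class PyInt: PyObject {
  let value: Int

  init(_ value: Int) {
    self.value = value
  }

  // As an unapplied reference 'PyInt.add' has the type:
  // (PyInt) -> (PyObject) -> PyResult<PyObject>
  func add(_ other: PyObject) -> PyResult<PyObject> {
    guard let other = other as? PyInt else {
      return .error("unsupported operand type")
    }
    return .value(PyInt(self.value + other.value))
  }
}

// Type-erased wrapper that can be stored in a type '__dict__'.
struct BuiltinFunction {
  let name: String
  let call: (PyObject, PyObject) -> PyResult<PyObject>

  init<Zelf: PyObject>(
    name: String,
    method: @escaping (Zelf) -> (PyObject) -> PyResult<PyObject>
  ) {
    self.name = name
    self.call = { zelf, arg in
      // Recover the concrete 'self' type before applying the method.
      guard let zelf = zelf as? Zelf else {
        return .error("'\(name)' requires a different 'self' type")
      }
      return method(zelf)(arg)
    }
  }
}

// Hypothetical 'int.__dict__'.
let intDict: [String: BuiltinFunction] = [
  "__add__": BuiltinFunction(name: "__add__", method: PyInt.add)
]

let result = intDict["__add__"]!.call(PyInt(2), PyInt(3))
if case .value(let object) = result, let sum = object as? PyInt {
  print(sum.value) // 5
}
```

The cast inside the wrapper is the price of type erasure: every call has to re-check the type of self at runtime, which is one of the things a different object model could avoid.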
Sources
Core modules
- VioletCore – shared module imported by all of the other modules.
  - Contains things like NonEmptyArray, SourceLocation, SipHash, trap and unreachable.
- BigInt – our implementation of unlimited (arbitrary-precision) integers
  - While it implements all of the operations expected of a BigInt type, in reality it mostly focuses on the performance of small integers – Python has only one int type and small numbers are most common.
  - Under the hood it is a union (via tagged pointer) of Int32 (called Smi, after V8) and a heap allocation (magnitude + sign representation) with ARC for garbage collection.
  - While the whole of Violet tries to be as easy-to-read/accessible as possible, this does not apply to the BigInt module. Numbers are hard, and for some reason humanity decided that “division” is a thing.
- FileSystem – our version of Foundation.FileManager.
  - Main reason why we do not support other platforms (Windows etc.).
- UnicodeData – apparently we also bundle our own Unicode database, because why not…
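The small-integer fast path mentioned for BigInt can be sketched roughly as below. This is only an illustration under loud assumptions: it uses a plain Swift enum instead of a tagged pointer, the names SketchBigInt and Heap are made up, and only the Smi + Smi addition is shown (the heap arithmetic, where the actual difficulty lives, is left out).

```swift
// Hypothetical heap representation: sign + magnitude.
final class Heap {
  let isNegative: Bool
  let magnitude: [UInt32] // Least significant word first.

  init(isNegative: Bool, magnitude: [UInt32]) {
    self.isNegative = isNegative
    self.magnitude = magnitude
  }
}

// Assumption: an enum stands in for the tagged pointer.
enum SketchBigInt {
  case smi(Int32)
  case heap(Heap)

  // Picks the cheapest representation that can hold 'value'.
  init(_ value: Int64) {
    if let small = Int32(exactly: value) {
      self = .smi(small)
      return
    }

    let magnitude = value.magnitude // UInt64
    let low = UInt32(truncatingIfNeeded: magnitude)
    let high = UInt32(truncatingIfNeeded: magnitude >> 32)
    let words = high == 0 ? [low] : [low, high]
    self = .heap(Heap(isNegative: value < 0, magnitude: words))
  }

  // Fast path: add two small integers, promote to heap only on overflow.
  static func addSmi(_ lhs: Int32, _ rhs: Int32) -> SketchBigInt {
    let (result, overflow) = lhs.addingReportingOverflow(rhs)
    if !overflow {
      return .smi(result)
    }

    // The sum of two Int32 values always fits in Int64.
    return SketchBigInt(Int64(lhs) + Int64(rhs))
  }
}
```

In the common case (no overflow) the addition allocates nothing and never touches the heap, which is the point of keeping a separate small-integer representation.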
Violet
- VioletLexer – transforms Python source code into a stream of tokens.
- VioletParser – transforms a stream of tokens (from Lexer) into an abstract syntax tree (AST).
  - Yet Another Recursive Descent Parser with minor hacks for ambiguous grammar.
  - AST type definitions are generated by Elsa module from Elsa definitions/ast.letitgo.
- VioletBytecode – instruction set of our VM.
  - 2 bytes per instruction.
  - No relative jumps, only absolute (via additional labels array).
  - Instruction enum is generated by Elsa module from Elsa definitions/opcodes.letitgo.
  - Use CodeObjectBuilder to create CodeObjects (whoa… what a surprise!).
  - Includes a tiny peephole optimizer, because sometimes the semantics depends on it (for example for short-circuit evaluation).
- VioletCompiler – responsible for transforming AST (from Parser) into CodeObjects (from Bytecode).
- VioletObjects – contains all of the Python objects and modules.
  - Py represents a Python context. Common usage: Py.newInt(2) or Py.add(lhs, rhs).
  - Contains int, str, list and 100+ other Python types. Python object is represented as a Swift class instance (this will probably change in the future, but for now it is “ok”, since the whole subject is a bit complicated in Swift). Read the docs in the Documentation directory!
  - Contains modules required to bootstrap Python: builtins, sys, _imp, _os and _warnings.
  - Does not contain importlib and importlib_external modules, because those are written in Python. They are a little bit different than CPython versions (we have 80% of the code, but only 20% of the functionality <great-success-meme.gif>).
  - PyResult<Wrapped> = Wrapped | PyBaseException is used for error handling.
- VioletVM – manipulates Python objects according to the instructions from Bytecode.CodeObject, so that the output vaguely resembles what CPython does.
  - Mainly a massive switch over each possible Instruction (branch prediction?).
- Violet – main executable (duh…).
- PyTests – runs tests written in Python from the PyTests directory.
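To make the bytecode and VM descriptions above a bit more concrete, here is a toy interpreter with 2-byte instructions (1-byte opcode + 1-byte argument), absolute jumps resolved through a labels array, and a single switch over the opcode. The opcodes, the Instruction struct and the run function are all hypothetical; Violet's real Instruction enum is generated by Elsa and looks different. There is no error handling (an empty stack simply traps).

```swift
// Hypothetical opcodes, not Violet's real instruction set.
enum Opcode: UInt8 {
  case pushSmall     // arg: value to push
  case dup           // duplicates top of the stack
  case subtract      // pops rhs, pops lhs, pushes lhs - rhs
  case popJumpIfZero // arg: index into 'labels'
  case jumpAbsolute  // arg: index into 'labels'
  case halt
}

// 1 byte of opcode + 1 byte of argument.
struct Instruction {
  let opcode: Opcode
  let arg: UInt8

  init(_ opcode: Opcode, _ arg: UInt8 = 0) {
    self.opcode = opcode
    self.arg = arg
  }
}

// Runs the code and returns the top of the stack (0 if the stack is empty).
func run(_ code: [Instruction], labels: [Int]) -> Int {
  var stack = [Int]()
  var pc = 0

  while pc < code.count {
    let instruction = code[pc]
    pc += 1

    switch instruction.opcode {
    case .pushSmall:
      stack.append(Int(instruction.arg))
    case .dup:
      stack.append(stack[stack.count - 1])
    case .subtract:
      let rhs = stack.removeLast()
      let lhs = stack.removeLast()
      stack.append(lhs - rhs)
    case .popJumpIfZero:
      let isZero = stack.removeLast() == 0
      if isZero {
        // The argument is not an offset, it is an index of a label.
        pc = labels[Int(instruction.arg)]
      }
    case .jumpAbsolute:
      pc = labels[Int(instruction.arg)]
    case .halt:
      return stack.last ?? 0
    }
  }

  return stack.last ?? 0
}

// Counts down from 3 to 0: 'labels[0]' is the loop start, 'labels[1]' the exit.
let countdown = [
  Instruction(.pushSmall, 3),     // 0
  Instruction(.dup),              // 1 <- labels[0]
  Instruction(.popJumpIfZero, 1), // 2
  Instruction(.pushSmall, 1),     // 3
  Instruction(.subtract),         // 4
  Instruction(.jumpAbsolute, 0),  // 5
  Instruction(.halt)              // 6 <- labels[1]
]
let countdownLabels = [1, 6]
print(run(countdown, labels: countdownLabels)) // 0
```

Going through a labels array means that a 1-byte argument can still address a jump target anywhere in the code, at the cost of one extra array lookup per jump.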
Tools/support
- Elsa – tiny DSL for code generation.
  - Uses .letitgo files from Elsa definitions directory.
  - Used for Parser.AST and Bytecode.Instruction types.
- Rapunzel – pretty printer based on “A prettier printer” by Philip Wadler.
  - Used to print AST in a digestible manner.
- Ariel – prints module interface – all of the open/public declarations.
Tests
There are 2 types of tests in Violet:
- Swift tests – standard Swift unit tests stored inside the ./Tests directory. You can run them by typing make test in repository root.
  You may want to disable unit tests for BigInt and UnicodeData if you are not touching those modules:
  - BigInt – we went with property based testing, which means that we test millions of inputs to check if the general rule holds (for example: a+b=c -> c-a=b etc.). This takes time, but pays for itself by finding weird overflows in bit operations (we store “sign + magnitude”, so bit operations are a bit difficult to implement).
  - UnicodeData
    - In one of our tests we go through all of the Unicode code points and try to access various properties (crash -> fail). There are 0x11_0000 values to test, so… it is not fast.
    - We also have a few thousand tests generated by Python. Things like: “is the COMBINING VERTICAL LINE ABOVE (U+030d) alpha-numeric?” (Answer: no, it is not. But you have to watch out because HANGUL CHOSEONG THIEUTH (U+1110) is).
- Python tests – tests written in Python stored inside the ./PyTests directory. You can run them by typing make pytest in repository root (there is also make pytest-r for release mode).
  - Violet – tests written specially for “Violet”.
  - RustPython – tests taken from github.com/RustPython.
  Those tests are executed when you run the PyTests module.
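The property based approach used for BigInt (a+b=c -> c-a=b) could look more or less like the sketch below. This is not Violet's test code: Int64 stands in for BigInt, the inputs are limited to the Int32 range so that the sum cannot overflow, and the seeded SplitMix64 generator and the checkAddSubRoundTrip function are made up for this example.

```swift
// Small deterministic generator, so that a failing run can be reproduced.
struct SplitMix64: RandomNumberGenerator {
  var state: UInt64

  mutating func next() -> UInt64 {
    self.state &+= 0x9E3779B97F4A7C15
    var z = self.state
    z = (z ^ (z >> 30)) &* 0xBF58476D1CE4E5B9
    z = (z ^ (z >> 27)) &* 0x94D049BB133111EB
    return z ^ (z >> 31)
  }
}

// Checks the rule 'a + b = c -> c - a = b (and c - b = a)' on random inputs.
// Assumption: Int64 stands in for BigInt.
func checkAddSubRoundTrip(iterations: Int, seed: UInt64) -> Bool {
  var rng = SplitMix64(state: seed)
  let range = Int64(Int32.min)...Int64(Int32.max)

  for _ in 0..<iterations {
    let a = Int64.random(in: range, using: &rng)
    let b = Int64.random(in: range, using: &rng)
    let c = a + b

    let isRoundTrip = c - a == b && c - b == a
    if !isRoundTrip {
      print("Counterexample: a = \(a), b = \(b)")
      return false
    }
  }

  return true
}
```

Instead of hand-picking a few inputs, the test states a rule that has to hold for every input and lets the generator look for a counterexample. The fixed seed keeps failures reproducible.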
Code style
- 2-space indents and no tabs at all
- 80 characters per line
- Required self in methods and computed properties
  - All of the other method arguments are named, so we will require it for this one.
  - Self/type name for static methods is recommended, but not required.
  - I’m sure that they will deprecate the implicit self in the next major Swift version. All of that source breakage is completely justified.
- No whitespace at the end of the line
  - Some editors may remove it as a matter of routine and we don’t want weird git diffs.
- (pet peeve) Try to introduce a named variable for every if condition.
  - You can use a single logical operator – something like if !isPrincess or if isDisneyCharacter && isPrincess is allowed.
  - Do not use && and || in the same expression, create a variable for one of them.
  - If you need parens then it is already too complicated.
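A short illustration of the self and if-condition rules above. The CastMember type and its properties are invented just for this example.

```swift
// Hypothetical type, used only to illustrate the style rules.
struct CastMember {
  let isDisneyCharacter: Bool
  let isPrincess: Bool
  let hasAnimalSidekick: Bool

  func describe() -> String {
    // Avoid mixing '&&' and '||' (and parens) in a single condition:
    //   if (isDisneyCharacter && isPrincess) || hasAnimalSidekick
    // Instead give one of the sub-expressions a name (and use 'self'):
    let isDisneyPrincess = self.isDisneyCharacter && self.isPrincess
    if isDisneyPrincess || self.hasAnimalSidekick {
      return "probably sings"
    }

    return "probably does not sing"
  }
}
```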
Anyway, just use SwiftLint and SwiftFormat with the provided presets (see .swiftlint.yml and .swiftformat files).
License
“Violet” is licensed under the MIT License.
See LICENSE file for more information.