Translators: How simplified?

In my interpreter, the code looks like this

x=(y+4)*z
echo x

      

analyzes and "optimizes" up to four separate operations performed by the interpreter, roughly assembled:

add 4 to y
multiply <last operation result> with z
set x to <last operation result>
echo x

      

  • In modern interpreters (eg CPython, Ruby, PHP), how simplified are the "opcodes" for which the final effect is performed by the interpreter?

  • Can I get better performance when trying to keep structures and commands for the interpreter more complex and high-level? It would be much more difficult or?

+2


a source to share


4 answers


In the case of Python, you can tell it the bytecode for a given function with dis .

from dis import dis
def foo():
    x=(y+4)*z
    print x

dis(foo)

      

gives you:



2           0 LOAD_GLOBAL              0 (y)
            3 LOAD_CONST               1 (4)
            6 BINARY_ADD          
            7 LOAD_GLOBAL              1 (z)
           10 BINARY_MULTIPLY     
           11 STORE_FAST               0 (x)

3          14 LOAD_FAST                0 (x)
           17 PRINT_ITEM          
           18 PRINT_NEWLINE       
           19 LOAD_CONST               0 (None)
           22 RETURN_VALUE        

      

Some are extraneous (like LOAD_CONST and RETURN_VALUE at the end for implicit return None

in foo()

), but Python seems to push y and 4 onto the stack, add, push z, multiply and write x. Then he presses x and prints

+1


a source


  • About the last result: effectively, you created a logger with one register: "last result of operation". It is a blocker for parallelism.
  • Eval / assign kind of opcodes is usually a layer lower than yours. Look at Python opcodes .
  • Higher level commands can provide better performance as they can allow you to spend more time inside a (hopefully) fast interpreter. But they can also be a pain, because you also need a different high-level opcode to do so.

Give it a try and see :) it really depends on a lot of factors that you didn't provide (and the scope and challenge is so big that if you provide sufficient factors, it will contain some obvious answers). One such major factor is what (and how) you are going to implement some language functionality (or, in other words, if you are going to make these things first class), for example:



  • structured programming. I mean, opcodes contain the concept of "function call", "function call with N arguments" or it will be a blind push, push, push, call, ret sequence.
  • lambdas (especially - function bodies defined inside other functions). Which will put a closure decision: whether functions should "capture" the outer scope variables and, if so, how.
+1


a source


Try to simulate your opcodes as if they mimic the inner workings of your interpreter. This page contains an article on how .NET generates an interpreted language from regular expressions. In .NET, a regular expression is first compiled into an intermediate language. This intermediate code will then be interpreted. The middleware is very similar to the internal data structures of a specific, uhh, regex engine.

+1


a source


Rule of thumb: If you have duplicate patterns in your bytecode (for example, a common pattern for each GC-controlled heap allocation), there must be a special high-level operation for each pattern.

Anyway, now, with all the features of .NET, JVM, LLVM, it's very cheap and easy to plug in a proper JIT compiler if you're really interested in your interpreter's performance.

0


a source







All Articles