The Typed Racket Optimizer vs. Transient

:: typed racket, offsite, by Ben Greenman

What type-directed optimizations does Typed Racket perform and do any require full types?

This post is based on a short talk. Slides from the talk are here: http://ccs.neu.edu/home/types/resources/talks/prl-offsite-2019.pdf

Standard Typed Racket guarantees full type soundness and uses higher-order contracts to make sure that interactions between Typed Racket and untyped Racket obey the types. These contracts can be very expensive [JFP 2019]. And so, the standard types are very strong but (possibly) slow.

Lately, I’ve been working on a transient back-end for Typed Racket. Transient Typed Racket provides a weaker guarantee — only that typed code cannot get “stuck” — via simpler run-time checks. Early data shows that these simple checks are often faster than the standard boundary checks [ICFP 2018], hence we want both options for Typed Racket programmers: slow/correct and fast/wrong.

The implementation of Transient needs to re-use some parts of Standard Typed Racket and modify others. Typed Racket comes with three major components:

  1. a static type checker,
  2. a compiler from types to contracts, and
  3. a type-driven optimizer [PADL 2012, OOPSLA 2012].

Transient Typed Racket can re-use all of the type checker and parts of the type-to-contract compiler. The question for this post is: can Transient re-use the optimizer?

Q. Can Transient re-use the Typed Racket optimizer?

The answer requires some thought because Standard Typed Racket and Transient Typed Racket preserve different amounts of type information.

  • In Standard Typed Racket, if an expression e has type T and reduces to a value v (for short, e : T —>* v), then the result v definitely matches the full type T.
  • In Transient Typed Racket, if e : T —>* v then the result v matches the toplevel “shape” of T but (maybe) nothing more.

The idea of a “shape” is that it corresponds to the outermost constructor of a type. A shape check must be decidable, but otherwise finding the best shape for a type is an engineering challenge. On one hand, deeper checks give stronger guarantees. On the other hand, shallower checks are quicker to validate.

Here are a few shapes according to the current Transient prototype:

  Shape(Natural)                = Natural
  Shape(Listof String)          = Listof Any
  Shape(Symbol -> Boolean)      = Any -> Any
  Shape(Vector Void Void)       = Vector Any Any
  Shape(U Void (Listof Symbol)) = U Void (Listof Any)

For the current shapes, can we re-use the Typed Racket optimizer?

Optimization Topics

Typed Racket implements 15 kinds of type-directed transformation. Below, each gets: a short description, an example, and a verdict of “safe” or “unsafe” for Transient.

To be clear: some optimization topics perform many kinds of transformations, but this post picks only one example transformation for each.


Topic 1: apply

apply.rkt “inlines” expressions of the form (apply f (map g xs)) to map and fold in one pass over the list (xs). Currently, the pass only triggers when f is + or *.

Example

  ;; Type Assumptions
  (: xs (Listof Integer))

  ;; --------------------------------------------------
  ;; Before Optimization
  (apply + (map abs xs))

  ;; --------------------------------------------------
  ;; After Optimization
  (let loop ((v 0)
             (lst xs))
    (if (null? lst)
      v
      (loop (+ v (abs (unsafe-car lst)))
            (unsafe-cdr lst))))

Verdict: safe, but risky.

Technically, this transformation is unsound for Transient because of how it uses unsafe-car. The expansion of (apply * (map g xs)) applies (g (unsafe-car xs)) without confirming that the first element of xs matches its expected type. This unsoundness is no problem, though, as long as every Transient-typed function checks the shape of its input. (Typed functions that flow to untyped code already need to check inputs.)


Topic 2: box

box.rkt safely applies unsafe box operations to expressions with Box type.

Example

  ;; Type Assumptions
  (: b (Boxof Symbol))

  ;; --------------------------------------------------
  ;; Before Optimization
  (unbox b)

  ;; --------------------------------------------------
  ;; After Optimization
  (unsafe-unbox b)

Verdict: safe


Topic 3: dead-code

dead-code.rkt uses type information to identify code that cannot run. Once identified, the TR optimizer makes the dead code obvious for the Racket bytecode compiler. The pass deals with if expressions, lambda expressions, and case-lambda; the latter is the most interesting for Transient.

Example

  ;; Type Assumptions
  (: f (-> Symbol Symbol)

  ;; --------------------------------------------------
  ;; Before Optimization
  (define f
    (case-lambda
      ((s) s)
      ((s i)
       (for/list ((_i (in-range i))) s))))

  ;; --------------------------------------------------
  ;; After Optimization
  (define f
    (case-lambda
      ((s) s)
      ((s i)
       ; dead code, replace with no-op
       (void))))

Verdict: unsafe, can change behavior

The pass infers that some branches of a case-lambda can never run because the type says they do not exist. In Standard Typed Racket, this inference is correct because a run-time contract seals off the “untyped” branches. In Transient, though, there is no need to add a contract and therefore no guarantee these branches are inaccessible. An application in untyped code can enter the dead branch; if it does, then adding Transient types to part of a program can change its result to (void) and thereby violate the graduality design goal [SNAPL 2015, ICFP 2018] — that is, that adding types should only change behavior by introducing runtime type mismatches.


Topic 4: extflonum

extflonum.rkt safely applies unsafe extflonum operations to expressions with Extflonum type.

Example

  ;; Type Assumptions
  (: e Extflonum)

  ;; --------------------------------------------------
  ;; Before Optimization
  (extflabs e)

  ;; --------------------------------------------------
  ;; After Optimization
  (unsafe-extflabs e)

Verdict: safe


Topic 5: fixnum

fixnum.rkt safely applies unsafe fixnum operations to expressions with Fixnum type.

Example

  ;; Type Assumptions
  (: f Fixnum)

  ;; --------------------------------------------------
  ;; Before Optimization
  (exact->inexact f)

  ;; --------------------------------------------------
  ;; After Optimization
  (unsafe-fx->fl f)

Verdict: safe


Topic 6: float-complex

float-complex.rkt unboxes complex numbers (into one real-part variable and one imaginary-part variable) and rewrites operations to handle the unboxed numbers.

Example

  ;; Type Assumptions
  (: f (-> Float-Complex Float-Complex Float-Complex))

  ;; --------------------------------------------------
  ;; Before Optimization
  (define (f n0 n1)
    (+ n0 n1))

  ;; --------------------------------------------------
  ;; After Optimization
  (define (f n0 n1)
    (let* ((unboxed-real-0 (unsafe-flreal-part n0))
           (unboxed-imag-0 (unsafe-flimag-part n0))
           (unboxed-real-1 (unsafe-flreal-part n1))
           (unboxed-imag-1 (unsafe-flimag-part n1))
           (unboxed-real-2 (unsafe-fl+ (real->double-flonum unboxed-real-0)
                                       unboxed-real-1))
           (unboxed-imag-2 (unsafe-fl+ (real->double-flonum unboxed-imag-0)
                                       unboxed-imag-1)))
      (unsafe-make-flrectangular unboxed-real-2 unboxed-imag-2)))

Verdict: safe, with caution

The body of a Transient-typed function (that can flow to untyped code) must first check that its inputs have the correct shape. Currently, the float-complex pass creates functions that apply unsafe-flreal-part before anything else; to be safe, the pass needs to make sure that Transient checks come first.


Topic 7: float

float.rkt safely applies unsafe flonum operations to expressions with Flonum type and also transforms some random calls to use unsafe-flrandom.

Example

  ;; --------------------------------------------------
  ;; Before Optimization
  (random)

  ;; --------------------------------------------------
  ;; After Optimization
  (unsafe-flrandom (current-pseudo-random-generator))

Verdict: safe, but a close call

Accessing a parameter, as in (current-pseudo-random-generator), is an elimination form that may require a shape check. This particular parameter, however, is protected by a contract that enforces the precondition of unsafe-flrandom.


Topic 8: list

list.rkt safely applies unsafe list operations to list expressions.

Example

  ;; Type Assumptions
  (: lst (List Symbol Symbol))

  ;; --------------------------------------------------
  ;; Before Optimization
  (list-ref lst 0)

  ;; --------------------------------------------------
  ;; After Optimization
  (unsafe-list-ref lst 0)

Verdict: safe, with strong-enough shape checks

The shape check for a (Listof T) must check for proper lists (via list?); note that the cost of this check depends on the size of incoming values. The shape check for a (List T ...) type must validate the length of incoming values.


Topic 9: number

number.rkt performs simple transformations on Real-valued expressions.

Example

  ;; Type Assumptions
  (: r Real)

  ;; --------------------------------------------------
  ;; Before Optimization
  (+ r)

  ;; --------------------------------------------------
  ;; After Optimization
  r

Verdict: safe


Topic 10: pair

pair.rkt safely applies pair-access operations to (possibly-nested) pairs.

Example

  ;; Type Assumptions
  (: p (Pairof (Pairof Symbol Void) String))

  ;; --------------------------------------------------
  ;; Before Optimization
  (cdar p)

  ;; --------------------------------------------------
  ;; After Optimization
  (unsafe-cdr (unsafe-car p))

Verdict: unsafe

Transient guarantees the first level of a type, but nothing more. Concretely, Shape(Pairof (Pairof Symbol Void) String) = Pairof Any Any and so the unsafe-cdr above is not safe.


Topic 11: sequence

sequence.rkt safely applies unsafe sequence operations to expressions with (Sequenceof T) type.

Example

  ;; Type Assumptions
  (: s String)

  ;; --------------------------------------------------
  ;; Before Optimization
  (for ((c s))
    (void))

  ;; --------------------------------------------------
  ;; After Optimization (simplified)
  (for ((c (in-string s)))
    (void))

Verdict: safe, with strong enough shape checks (see list and vector)


Topic 12: string

string.rkt safely applies unsafe string operations to expressions with String type. (Note that unsafe-string-ref is only safe when the result is sure to be a Latin–1 character.)

Example

  ;; Type Assumptions
  (: b Bytes)

  ;; --------------------------------------------------
  ;; Before Optimization
  (bytes-length b)

  ;; --------------------------------------------------
  ;; After Optimization
  (unsafe-bytes-length b)

Verdict: safe


Topic 13: struct

struct.rkt safely applies unsafe struct operations to struct expressions, using Typed Racket’s internal registry of struct info.

Example

  ;; Type Assumptions
  (struct open-interval ([lo : Real] [hi : Real]))
  (: ivl open-interval)

  ;; --------------------------------------------------
  ;; Before Optimization
  (open-interval-lo ivl)

  ;; --------------------------------------------------
  ;; After Optimization
  (unsafe-struct-ref ivl 0)

Verdict: safe


Topic 14: unboxed-let

unboxed-let.rkt cooperates with the float-complex pass by transforming the binding-site of some complex numbers. This pass may change a let-expression into a let-values that expects a real-part and imag-part, and may change a function to expect twice as many arguments — provided the optimizer can find all calls to the function.

Example

  ;; Type Assumptions
  (: k Float-Complex)

  ;; --------------------------------------------------
  ;; Before Optimization
  (let ((f (lambda ((n : Float-Complex)) (+ n n))))
    (f k))

  ;; --------------------------------------------------
  ;; After Optimization
  (let ((f (lambda (real-part-n imag-part-n) ....)))
    (f (unsafe-flreal-part k) (unsafe-flimag-part k)))

Verdict: safe, thanks to the (conservative) escape analysis


Topic 15: vector

vector.rkt safely applies vector operations to vector expressions.

Example

  ;; Type Assumptions
  (: v (Vector (Listof Symbol) String))
  (: lst (Listof Symbol))

  ;; --------------------------------------------------
  ;; Before Optimization
  (vector-set! v lst 0)

  ;; --------------------------------------------------
  ;; After Optimization
  (unsafe-vector-set! v lst 0)

Verdict: safe, with strong-enough shape checks

The check for (Vector T ...) must check the length of incoming values.


Summary

The Typed Racket optimizer implements 15 kinds of transformations. Two are definitely unsafe for Transient as-is (dead-code, pair). One must take care when rewriting a Transient function (float-complex). One may limit our ability to reduce the number of run-time checks in a program (apply). Two others require transient checks whose cost depends on the size of the input values (list, sequence).

There may be other issues that I missed while reading the optimizer code. If so, I’ll try to remember to update this post.