Tutorial: Racket FFI, part 3

:: Racket, FFI, tutorial, by Asumu Takikawa

This is part 3 of my tutorial for using the Racket FFI. You can find part 1 here and part 2 here.

In this post, we will experiment with some low-level operations with pointers, union types, and custom C types. The main takeaway will be the custom C types, which let you define abstractions that hide the details of the C representation when manipulating data in Racket.

As in the second post, let’s start with some prologue code that establishes the definitions from the previous two posts. But first, I’m getting tired of writing the #:c-id identifier notation for the underscored C function names.

Instead, let’s use a third-party package that I wrote that lets you avoid the boilerplate. To install the package, you can either run the following at the command line:

$ raco pkg install ffi-definer-convention

or you can just execute the following snippet in Racket:

> (require pkg)
> (pkg-install-command #:skip-installed #t "ffi-definer-convention")

This will install the package and compile its contents. If you’re curious, the docs for the package are available here.

Note: if you’ve never installed a package before, you may want to glance at the package system docs. A tl;dr of packages is that they bundle Racket collections, which are sets of modules that you can refer to in a location-independent fashion such as pict or racket/list.

Anyhow, here’s the prologue code:

#lang racket
(require racket/draw
         ffi/unsafe
         ; avoid conflict with below
         (except-in ffi/unsafe/define
                    define-ffi-definer)
         ; the new 3rd-party pkg
         ffi-definer-convention
         pict)
 
; C types
(define-cpointer-type _cairo_t)
(define-cpointer-type _cairo_surface_t)
(define _cairo_line_cap_t
  (_enum '(butt round square)))
 
(define cairo-lib (ffi-lib #f))
(define-ffi-definer define-cairo cairo-lib
  ; describes how to transform from
  ; Racket to C ids
  #:make-c-id convention:hyphen->underscore)
 
; the foreign functions
; note lack of #:c-id keyword arguments
(define-cairo cairo-create
  (_fun _cairo_surface_t -> _cairo_t))
(define-cairo cairo-move-to
  (_fun _cairo_t _double _double -> _void))
(define-cairo cairo-line-to
  (_fun _cairo_t _double _double -> _void))
(define-cairo cairo-set-line-width
  (_fun _cairo_t _double -> _void))
(define-cairo cairo-stroke
  (_fun _cairo_t -> _void))
(define-cairo cairo-set-line-cap
  (_fun _cairo_t _cairo_line_cap_t -> _void))
 
; (_cairo_t -> Void) -> Pict
; do some drawing and give us the pict
(define (do-cairo f)
  (define bt (make-bitmap 256 256))
  (define bt-surface (send bt get-handle))
  (f (cairo-create bt-surface))
  (linewidth 2 (frame (bitmap bt))))

Notice that the define-cairo forms don’t have any #:c-id keywords anymore. Instead, the prologue code uses an overridden define-ffi-definer from my package that supports a #:make-c-id keyword that lets you specify a naming convention to follow.
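To make the effect of the naming convention concrete, here is a rough sketch of the name translation it performs. This is not the package’s implementation; the hyphen->underscore-name helper is a made-up stand-in that works on plain strings, just to show which C name each Racket name maps to.

```racket
#lang racket/base
(require racket/string)

;; Hypothetical stand-in for the convention: swap hyphens for underscores
(define (hyphen->underscore-name racket-name)
  (string-replace racket-name "-" "_"))

;; The Racket-side name used in the prologue...
(define c-name (hyphen->underscore-name "cairo-set-line-width"))

;; ...and the C-side name that the binding looks up
c-name ; "cairo_set_line_width"
```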

Also, instead of creating a single bitmap and drawing into it, we now have a do-cairo function that takes a drawing function. When called, do-cairo creates a fresh bitmap, calls the given function with a Cairo context for that bitmap, and returns the bitmap as a framed pict.

Now let’s get to the main point of this blog post. Let’s say that we want to play with Cairo path objects this time. A path is defined as a struct with the following structure:

typedef struct {
    cairo_status_t status;
    cairo_path_data_t *data;
    int num_data;
} cairo_path_t;

To manipulate paths, we want to define an FFI C type that corresponds to this struct definition. But before that, it’s useful to define C types for the types of values in the path struct’s fields. First, let’s specify that a cairo_status_t is an integer type:

> (define _cairo_status_t _int)

It’s actually an enum, but for the examples in this post we don’t care about distinguishing different statuses. Next, the data field of a path struct is an array of path data objects. Each path data object is a cairo_path_data_t, which is specified with a C union:

union _cairo_path_data_t {
    struct {
        cairo_path_data_type_t type;
        int length;
    } header;
    struct {
        double x, y;
    } point;
};

Helpfully, the FFI library comes with support for unions with the _union type constructor. The constructor takes arbitrarily many arguments, one for each sub-case in the union. It’s pretty straightforward to specify this type too:

; the path data type is just an enum
> (define _cairo_path_data_type_t
    (_enum '(move-to line-to curve-to close-path)))
> (define _cairo_path_data_t
    (_union ; the header case
            (_list-struct _cairo_path_data_type_t
                          _int)
            ; the point case
            (_list-struct _double _double)))

There’s a new type constructor here so let me explain that first. The _list-struct constructor translates between a C struct and a fixed-length list of C objects on the Racket side. Unlike define-cstruct, this constructor doesn’t define any selectors or anything like that. Instead, you can manipulate the struct as an ordinary list.
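To see the list translation in isolation, here is a minimal sketch that doesn’t involve Cairo at all. The _int-pair type and the variable names are made up for this illustration; the sketch writes a Racket list into freshly allocated memory and reads it back as a list.

```racket
#lang racket/base
(require ffi/unsafe)

;; Hypothetical struct type for illustration: two C ints
(define _int-pair (_list-struct _int _int))

;; Allocate enough space for one such struct
(define p (malloc _int-pair))

;; Racket list -> C struct fields
(ptr-set! p _int-pair (list 3 4))

;; C struct fields -> Racket list
(define round-tripped (ptr-ref p _int-pair))
round-tripped
```

There are no generated selectors here; you would pick the fields apart with first, second, or pattern matching.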

Each of the path data structs in the path data array will be manipulated with the _cairo_path_data_t type. Unfortunately, union types are a bit cumbersome because the programmer has to distinguish the cases in the union manually on the Racket side. Let me illustrate this with some code:

; create a union from a list of doubles
> (define a-union-val
    (cast (list 1.3 5.8)
          ; source type
          (_list-struct _double _double)
          ; target type
          _cairo_path_data_t))
> a-union-val

#<union>

This snippet first constructs a union object (via the _cairo_path_data_t type) using a cast. A cast is an operation that lets you coerce from one C type to another. We use it in this example since it’s an easy way to generate a union object.

The second line shows that a union prints as an opaque object. You can’t do anything with a union in Racket unless you project it to one of the sub-cases with the union-ref function. This projection is unsafe, in the sense that if you don’t know which of the sub-cases in the union is the correct one, you may get nonsensical data out of the union.

More concretely, let’s see what happens if we try to extract a value out of the union both correctly and incorrectly:

; correct (matches construction)
; cases are zero-indexed and ordered as written
> (union-ref a-union-val 1)

'(1.3 5.8)

; incorrect, error
> (union-ref a-union-val 0)

enum:int->_cairo_path_data_type_t: expected a known #<ctype:ufixint>, got: 3435973837

Note that in the incorrect case we get an error saying that the FFI failed to convert the C value to a Racket value following the given C type. We were lucky in this case, but in general you can have silent failures where the data is nonsense.
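Here is a small sketch of the silent kind of failure. The _pair-union type is made up for this illustration: both of its cases are 16 bytes wide and neither involves an enum, so reading the wrong case raises no error and simply reinterprets the bytes.

```racket
#lang racket/base
(require ffi/unsafe)

;; Hypothetical union for illustration: both cases occupy 16 bytes
(define _pair-union
  (_union (_list-struct _int64 _int64)
          (_list-struct _double _double)))

;; Build the union from a pair of doubles, as in the example above
(define u
  (cast (list 1.0 2.0)
        (_list-struct _double _double)
        _pair-union))

;; Correct case: the doubles we stored
(define as-doubles (union-ref u 1))

;; Wrong case: no error, just the IEEE 754 bit patterns of 1.0 and 2.0
;; read back as integers
(define as-ints (union-ref u 0))

as-doubles ; '(1.0 2.0)
as-ints    ; '(4607182418800017408 4611686018427387904)
```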

With union types like these, there is usually some way to figure out which case of the union you are in. This may be accomplished in C using an extra struct field or a variable that indicates the variant. Alternatively, there may be some set order that cases appear in data structures.

With this Cairo API in particular, the position of an element in the array tells you which of the union cases it is in. The array always starts with a header element, followed by some number of data elements (the exact number is determined by the type indicated in the header). We can therefore reference the appropriate element of the union based on this ordering.

So before moving on, let’s recap: so far we have made C types that describe the data elements in a cairo path with unions. Next we’ll figure out how to deal with the array itself.

Some low-level operations

Since we still don’t have a C type for cairo_path_t, let’s go ahead and make a simple one where we punt on the work of specifying the array type:

> (define _simple_cairo_path_t
    (_list-struct _cairo_status_t
                  _pointer
                  _int))

In this type, we have specified the array as a bare _pointer. For some added safety, we could also use something like (_cpointer 'cairo_path_data_t), which sets up a tagged pointer type like we saw in the first blog post with define-cpointer-type.
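As a rough sketch of what a tagged pointer type looks like in practice, the snippet below tags a pointer by casting it. The tag symbol, the _path-data-pointer name, and the stand-in memory are all assumptions for illustration; a real pointer would come from Cairo rather than from malloc.

```racket
#lang racket/base
(require ffi/unsafe)

;; Hypothetical tagged pointer type for the path data array
(define _path-data-pointer (_cpointer 'cairo_path_data_t))

;; Stand-in memory; 'raw memory is not managed by the GC
(define raw (malloc 16 'raw))

;; Converting through the tagged type attaches the tag
(define tagged (cast raw _pointer _path-data-pointer))
(define has-tag? (cpointer-has-tag? tagged 'cairo_path_data_t))
has-tag?

;; Raw memory has to be released by hand
(free raw)
```

A function whose argument type is _path-data-pointer would then reject pointers that don’t carry the tag, which catches some mix-ups that a bare _pointer lets through.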

We’ve seen the _pointer type before, but haven’t actually done anything with values of those types except pass them around as arguments. It turns out it is possible to do a bit more with pointers.

Before we get to that, let’s go ahead and set up an FFI binding for cairo_copy_path so that we can obtain a path struct to manipulate:

> (define-cairo cairo-copy-path
    (_fun _cairo_t -> _pointer))
> (define a-path #f)
> (do-cairo (λ (ctx)
              ; Do stuff to make the current
              ; path non-empty
              (cairo-move-to ctx 50.0 50.0)
              (cairo-line-to ctx 206.0 206.0)
              (cairo-move-to ctx 50.0 206.0)
              (cairo-line-to ctx 115.0 115.0)
              ; Get the current path
              (set! a-path (cairo-copy-path ctx))
              ; Stroke clears the path
              ; so do it last
              (cairo-stroke ctx)))

image

> a-path

#<cpointer>

Note that cairo-copy-path gives us a pointer to a path struct rather than a path struct directly. Because of that, we need to know how to manipulate pointers. The most useful function for pointers is ptr-ref, which lets you dereference a pointer and access it at some concrete C type.

Note: the ptr-ref function also takes an optional offset argument, which we will use in an example later.
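To get a feel for ptr-ref and its offset argument before applying them to Cairo data, here is a minimal sketch on memory that we allocate ourselves. The buffer and its contents are made up for illustration. The offset counts elements of the given type, not bytes.

```racket
#lang racket/base
(require ffi/unsafe)

;; Space for three C doubles
(define buf (malloc 3 _double))

;; Fill the buffer; the third argument to ptr-set! is the offset
(for ([i (in-naturals)]
      [x (in-list '(10.0 20.0 30.0))])
  (ptr-set! buf _double i x))

;; No offset: dereference the start of the buffer
(define first-elem (ptr-ref buf _double))

;; Offset 2: skip two doubles and read the third
(define third-elem (ptr-ref buf _double 2))

first-elem ; 10.0
third-elem ; 30.0
```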

For example, we can use a-path as a _simple_cairo_path_t:

> (define simple-path
    (ptr-ref a-path _simple_cairo_path_t))
> simple-path

'(0 #<cpointer> 8)

And now we have a Racket representation of the struct that the pointer points to. Now notice that the data array field of the struct is also a pointer as we specified earlier. To convert this to a more useful form, we can use ptr-ref again with an array type:

> (define array
    (ptr-ref ; the pointer
             (second simple-path)
             (_array/list _cairo_path_data_t
                          ; length field
                          (third simple-path))))
> array

'(#<union> #<union> #<union> #<union> #<union> #<union> #<union> #<union>)

The elements of the array are all unions, as we would expect. This is a bit annoying to use though. We have to know the structure of the array and reference the correct variant appropriately:

> (union-ref (first array) 0)

'(move-to 2)

> (union-ref (second array) 1)

'(50.0 50.0)

; nonsense data here, wrong union case
> (union-ref (third array) 1)

'(4.2439915824246e-314 0.0)

One thing we could do is write a helper function that converts this array into a more useful format. It would look at each header and then consume the number of data elements that the header specifies (e.g., 1 in the example above, because the length of 2 includes the header itself) and convert them appropriately.
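Such a helper might look roughly like the sketch below. To keep it runnable without Cairo, it defines its own copy of the union type (named _path-data here) and builds stand-in elements by hand with a made-up make-elem function; with the real bindings you would pass it the list produced by _array/list instead.

```racket
#lang racket/base
(require ffi/unsafe racket/list)

;; Local copies of the types from this post, renamed for the sketch
(define _data-type
  (_enum '(move-to line-to curve-to close-path)))
(define _header (_list-struct _data-type _int))
(define _point  (_list-struct _double _double))
(define _path-data (_union _header _point))

;; (Listof Union) -> (Listof (cons Symbol (Listof Point)))
;; Groups a flat list of unions into one command per header
(define (unions->commands elems)
  (let loop ([elems elems] [acc '()])
    (cond
      [(null? elems) (reverse acc)]
      [else
       ; A header comes first; its length counts the header itself
       (define header (union-ref (first elems) 0))
       (define-values (chunk remaining)
         (split-at elems (second header)))
       ; Everything after the header in this chunk is a point
       (define points
         (for/list ([u (in-list (rest chunk))])
           (union-ref u 1)))
       (loop remaining
             (cons (cons (first header) points) acc))])))

;; Hypothetical helper: write one case into fresh memory and
;; view that memory as a union
(define (make-elem case-type v)
  (define p (malloc _path-data))
  (ptr-set! p case-type v)
  (ptr-ref p _path-data))

(define commands
  (unions->commands
   (list (make-elem _header '(move-to 2))
         (make-elem _point  '(50.0 50.0))
         (make-elem _header '(line-to 2))
         (make-elem _point  '(206.0 206.0)))))

commands ; '((move-to (50.0 50.0)) (line-to (206.0 206.0)))
```

The downside is that every user of the bindings has to remember to call the helper.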

An alternative is to define a custom C type that handles all of this conversion automatically for us, so that as a user of the Cairo FFI bindings we don’t need to think about applying helper functions and dereferencing pointers.

Custom C types

I briefly remarked on how to create custom C types in the first blog post, but let me go over that again in more detail. A custom C type is constructed by providing a base C type to use along with two conversion functions. The first function converts from a Racket value to a value that fits the base C type. The second converts in the other direction from a value of the base C type to a Racket value.

In this way, it’s possible to conduct interesting conversions, such as dereferencing union objects automatically.
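As a warm-up that is unrelated to Cairo, here is a sketch of a custom C type built with make-ctype. The _fixed16 type is invented for this example: it stores an exact Racket rational as a 16.16 fixed-point number in a C int32.

```racket
#lang racket/base
(require ffi/unsafe)

;; Hypothetical custom type: 16.16 fixed-point over _int32
(define _fixed16
  (make-ctype _int32
              ; Racket -> C: scale up and round to an integer
              (λ (r) (inexact->exact (round (* r 65536))))
              ; C -> Racket: scale back down to an exact rational
              (λ (i) (/ i 65536))))

(define p (malloc _fixed16))
(ptr-set! p _fixed16 3/2)

;; Reading at the custom type runs the C -> Racket conversion
(define as-rational (ptr-ref p _fixed16))

;; Reading at the base type shows the underlying representation
(define as-int (ptr-ref p _int32))

as-rational ; 3/2
as-int      ; 98304
```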

Now let’s make a custom C type for Cairo paths that will represent the data elements as a sequence in which each item in the sequence is a list with an action symbol followed by the data elements for that action.

First, we’ll start by defining a struct type for the Racket representation of Cairo paths:

> (struct cairo-path (ptr)
    #:property prop:sequence
    (λ (p) (in-cairo-path p)))

The representation will store one field ptr which, as the name suggests, will store a pointer value. We’ll see what to do with this pointer later.

This definition uses a structure type property to make instances of cairo-path automatically work as sequences. This means that you can iterate over them with a for loop or apply sequence-ref on them. The property takes a function that takes an instance of the struct type itself (here p) and that returns a sequence.
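If you haven’t used prop:sequence before, here is a tiny sketch with a made-up countdown struct. It has nothing to do with Cairo, but it uses the property in the same way as the definition above.

```racket
#lang racket/base

;; Hypothetical struct for illustration: counts down from a number
(struct countdown (from)
  #:property prop:sequence
  ; Given an instance, return the sequence to iterate over
  (λ (c) (in-range (countdown-from c) 0 -1)))

;; Instances can now be used directly in for loops
(define counted
  (for/list ([i (countdown 3)])
    i))

counted ; '(3 2 1)
```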

We’ll later define the in-cairo-path function that will actually construct the relevant sequence for us. For now, let’s see how to construct the C type given this struct type:

> (define _cairo_path_t
    (let ()
      ; Extract pointer out of representation
      (define (racket->c rkt)
        (cairo-path-ptr rkt))
      ; Just apply the Racket constructor
      (define (c->racket cobj)
        (cairo-path cobj))
      (make-ctype _pointer
                  racket->c
                  c->racket)))

The base type for this _cairo_path_t is a _pointer type. Since the Cairo API returns pointers to new path values, it’s hard to avoid using some kind of pointer type as the base type here.

The right-hand side of this definition defines the two conversion functions between Racket and C. Both are very simple because of how we’ve set up the representation. In the Racket to C case, we simply extract the pointer field of the struct. In the other direction, we just stuff the pointer into a struct.

The real work is done by the helper function that makes a cairo-path instance work as a sequence.

Starting top-down, let’s look at the definition of in-cairo-path:

; Cairo-Path -> Sequence
> (define (in-cairo-path path)
    (define pp (cairo-path-ptr path))
    (match-define
      (list _ array-ptr len)
      (ptr-ref pp _simple_cairo_path_t))
    (make-do-sequence
      (λ ()
        (values (pos->element array-ptr)
                (next-pos array-ptr)
                0
                (λ (pos) (< pos len))
                #f #f))))

The first thing the function does is extract the pointer out of the representation and then immediately call ptr-ref on it. This lets us manipulate the C path struct using the simple representation we defined in the first part of the blog post.

Note: in case you’re not very familiar with Racket pattern matching, the match-define form lets you define potentially multiple variables using a pattern, similar to a let binding in Haskell or OCaml. The first clause is a pattern and the second is an expression to match against. Check it out in the docs.
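For instance, the following sketch uses the same pattern as in-cairo-path on a plain list. The list contents are stand-ins, with a symbol in place of a real pointer.

```racket
#lang racket/base
(require racket/match)

;; Stand-in for the list that ptr-ref returns for the path struct
(define fake-path-struct (list 0 'some-pointer 8))

;; _ ignores the status field; the other two fields get names
(match-define (list _ array-ptr len) fake-path-struct)

array-ptr ; 'some-pointer
len       ; 8
```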

After extracting the array pointer and the array length from the path value, we pass them onto some helper functions that define the sequence. The usual way to define a new kind of sequence is to use the make-do-sequence function. Essentially, make-do-sequence takes a bunch of arguments that specify how to get an element of a sequence, how to advance a sequence, how to start, and how to end the sequence.

Note: technically make-do-sequence actually takes a thunk which produces a number of values. These values are effectively like arguments though. The reason why it’s a thunk is that you may wish to run some initialization code that runs when the sequence is started (e.g., imagine opening a network connection), and your sequence functions (like advancing the sequence) may depend on the result of that initialization code.
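Here is a minimal sketch of make-do-sequence on its own, using a made-up in-evens-below sequence. The six values line up with the ones in in-cairo-path: an element getter, a position advancer, an initial position, a continuation test on the position, and two further optional tests that we leave as #f.

```racket
#lang racket/base

;; Hypothetical sequence for illustration: 0, 2, 4, ... below n
(define (in-evens-below n)
  (make-do-sequence
   (λ ()
     (values (λ (pos) pos)        ; position -> element
             (λ (pos) (+ pos 2))  ; position -> next position
             0                    ; initial position
             (λ (pos) (< pos n))  ; keep going at this position?
             #f #f))))            ; no element-based stop conditions

(define evens
  (for/list ([x (in-evens-below 7)])
    x))

evens ; '(0 2 4 6)
```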

In our case, we supply some curried functions that can extract elements out of the underlying C array. Here is the pos->element function and its helpers:

; CPointer -> Integer -> Element
> (define ((pos->element ptr) pos)
    ; Extract the data path header
    (define header
      (union-ref
       (ptr-ref ptr _cairo_path_data_t pos)
       0))
    (define type   (first header))
    ; Length includes header, so subtract 1
    (define len    (sub1 (second header)))
    (define pos*   (add1 pos))
    (define points (get-points ptr pos* len))
    (cons type points))
; CPointer Integer Integer -> (Listof Data)
> (define (get-points ptr pos num-points)
    (for/list ([i (in-range num-points)])
      (union-ref (ptr-ref ptr
                          _cairo_path_data_t
                          ; offset argument
                          (+ pos i))
                 1)))

This code encodes the API usage protocol that Cairo specifies, where each header element in the path is followed by some number of data elements. Each header specifies the length, so we can loop in get-points from the position after the header until we reach the given length. At each point, we dereference the appropriate union element.

Advancing the sequence is simpler, since all we need to do is some arithmetic on the length given by header elements:

> (define ((next-pos ptr) pos)
    (define header
      (union-ref
       (ptr-ref ptr _cairo_path_data_t pos)
       0))
    (define len (second header))
    (+ len pos))

Note that determining the end of the sequence is very easy. It’s just a matter of comparing the current position to the total length given in the path struct, encoded in the expression (λ (pos) (< pos len)).

Now we can try using a path as a sequence:

> (define-cairo cairo-copy-path
    (_fun _cairo_t -> _cairo_path_t))
> (do-cairo (λ (ctx)
              (cairo-move-to ctx 50.0 50.0)
              (cairo-line-to ctx 206.0 206.0)
              (cairo-move-to ctx 50.0 206.0)
              (cairo-line-to ctx 115.0 115.0)
              (define path (cairo-copy-path ctx))
              ; Using path as a sequence
              (for ([elem path])
                (displayln elem))
              (cairo-stroke ctx)))

(move-to (50.0 50.0))

(line-to (206.0 206.0))

(move-to (50.0 206.0))

(line-to (115.0 115.0))

image

Notice how the sequence prints out as an intuitive list of commands instead of a bunch of opaque union values as we saw before when using the _array/list type.

That concludes part 3 of the FFI tutorial. Hopefully you’re now equipped to deal with union types and custom C types. If not, see the FFI reference for more details on unions and custom C types.

Thanks to Ben Greenman for suggestions/feedback and to Sam Tobin-Hochstadt for suggesting to cover union types!