Geoffroy Couprie - @gcouprie - Rust as a support language

Devoxx 2016: Rust as a support language

Who

Rust as main language for nearly a year now

we're pushing some Rust in production now: ssh jail, proxy, git subsystems

why do we write code in C?

- speed: not looking for a fight here, some things are faster depending on the language or the amount of work, whatever floats your boat
- memory usage: avoiding GC? garbage collection is cool, but sometimes, you work a lot to avoid it (off heap arrays, etc)
- portability: Objective C, Android, desktop?

why should we avoid C?

- a pain to compile
- managing memory manually is hard
- crashes and vulnerabilities
- handling FFI is annoying

Hello Rust!

- looking for a better language to write browsers in
- high level tools for low level programming
- what is system programming?
- great community, nice people

Language design

code examples: functions

fn add_one(x: i32) -> i32 {
  x + 1
}

code examples: structs

struct Point {
    x: i32,
    y: i32,
}

let origin = Point { x: 0, y: 0 };

code examples: enums

pub enum Option<T> {
    None,
    Some(T),
}

pub enum Result<T, E> {
    Ok(T),
    Err(E),
}

match str::from_utf8(data) {
  Ok(s)  => println!("got a valid string: {}", s),
  Err(e) => println!("got an error: {:?}", e)
}

code examples: traits

struct Circle {
    x: f64,
    y: f64,
    radius: f64,
}

trait HasArea {
    fn area(&self) -> f64;
}

impl HasArea for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * (self.radius * self.radius)
    }
}

code examples: closures

let plus_one = |x: i32| x + 1;
let plus_two = |x| {
    let mut result: i32 = x;
    result += 2;

    result
};

code examples: borrow checker

fn foo(v: &Vec<i32>) {
     v.push(5);
}

let v = vec![];

foo(&v);
// error: cannot borrow immutable borrowed content `*v` as mutable
// v.push(5);
// ^
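
The error above goes away once the function asks for a mutable borrow and the caller agrees to hand one out. A minimal sketch of that fix (the `main` wrapper is only there to make the example self-contained):

```rust
// Taking `&mut Vec<i32>` tells the compiler that `foo` needs exclusive,
// mutable access to the vector for the duration of the call.
fn foo(v: &mut Vec<i32>) {
    v.push(5);
}

fn main() {
    // The binding itself must be declared `mut` before it can be
    // borrowed mutably.
    let mut v = vec![];
    foo(&mut v);
    println!("{:?}", v);
}
```

Both sides have to opt in: `&mut` in the signature and `&mut v` at the call site, so mutation is always visible where it happens.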

Memory safe

- no GC but memory safe
- borrow checking
- memory vulns: don't fix bugs, fix bug classes
- zero cost abstractions

a young language

we use it at Clever Cloud!

- multiple systems rewritten (everybody writes Rust here)
- other larger components
- can deploy a Rust app!

Rust support in beta!

$ npm install -g clever-tools
$ clever login
$ clever create --type rust iron-test
$ cd iron-app && clever link iron-test
$ clever deploy

why Rust over C for native extensions?

- memory safe: fewer possibilities for vulns
- fewer memory leaks (still possible, but hard to trigger). A memleak is not a memory safety issue
- fewer crashes
- nice build system, easy to add libs

easy to embed in other languages

they want GC aware bindings for Ruby

lower barrier to entry to write production code

for messing with stuff, C will be easier at first, but hard to get right

why Rust over Java?


let's build something!

One library for iOS and Android

build in Objective C, Java (Python for server) VS build once in Rust

- example idea extracted from a client project
- needed to write a new implementation of a communication protocol
- code was in Java
- building 2 libs, iOS and Android, means lots of debug
- potentially server side as well
- make one implementation, consistent with itself, ship it to every platform

Write once, build everywhere

Let's build an inverted index!

- basic tool to build a text search
- let's do a naive implementation
- get the data from the Devoxx CFP app (yay for Nicolas Martignole)
- index the title and summary

what is an inverted index?

"Hello, how are you?", "Fine, and you?", "Great!"

=> {
    "hello" => [0], "how" => [0], "are" => [0], "you" => [0,1],
    "fine" => [1], "and" => [1], "great" => [2]
  }

what is an inverted index?

normalize("Hello,") => "hello"

search("hello")     => [0]
search("you")       => [0,1]
search("hello you") => [0]

in java

HashMap<String, HashSet<Integer>> index;

public void             insert(Integer id, String data)
public HashSet<Integer> searchWord(String word)
public Set<Integer>     searchString(String s)

Inserting

public void insert(Integer id, String data) {
     String[] words   = data.split(" ");
     for (String _word : words) {
         String w = _word.toLowerCase().replaceAll("\\p{Punct}", "");
         HashSet<Integer> s = index.get(w);
         if(s == null) {
             s = new HashSet<Integer>();
             index.put(w, s);
         }
         s.add(id);
    }
}

Search one word

public HashSet<Integer> searchWord(String word) {
    HashSet<Integer> res = index.get(word);

    if(res == null) {
        return new HashSet<Integer>();
    }
    return res;
}

Search a string

public Set<Integer> searchString(String s) {
    String[] words   = s.split(" ");
    String w0 = words[0].toLowerCase().replaceAll("\\p{Punct}", "");
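    // Copy the first set: retainAll() below mutates its receiver, and
    // searchWord() returns the set stored in the index itself.
    HashSet<Integer> res = new HashSet<>(searchWord(w0));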

    for(int i = 1; i < words.length; i++) {
        String wi = words[i].toLowerCase().replaceAll("\\p{Punct}", "");
        res.retainAll(searchWord(wi));
    }

   return res;
}

Obtaining the data

curl -H "Accept: application/json" \
  http://cfp.devoxx.be/api/conferences/DV16/schedules/monday > monday.json
{
  "slots": [
    {
      "roomId": "room4",
      "notAllocated": false,
      "fromTimeMillis": 1478507400000,
      "break": null,
      "roomSetup": "theatre",
      "talk": {
        "trackId": "mobile",
        "talkType": "University",
        "track": "Mobile & Embedded",
        "summaryAsHtml": "<p>JavaFX 8 offers a rich set of visually appealing UI controls, a convenient property binding mechanism for event handling, and a powerful tool for layout and design. This session shows you how to develop mobile applications with Java and JavaFX and deploy them on IOS and Android devices, all from the same code base. The session begins with an overview of JavaFX UI controls and property binding techniques. After demonstrating an example JavaFX application with FXML, Scene Builder,  and Gluon Maps, you’ll learn how to deploy the app on a mobile device (IOS or Android). You’ll also learn how to keep connected mobile devices updated when data in the cloud changes. This session includes several useful examples and shows how Java and JavaFX makes it easy to develop applications that target different mobile devices. This write once, run anywhere (WORA) platform uses an IDE plug in and library support from Gluon to create native applications.</p>\n",
        "id": "WSB-5493",
        "speakers": [
          {
     [...]

Fast forward to an Android app

Let's do some Rust!

The plan

The Rust test code

pub fn add(a: i32, b: i32) -> i32 {
  a + b
}

JNI/JNA

the JNI way

#[allow(unused_variables,non_snake_case)]
#[no_mangle]
pub extern fn Java_com_inrustwetrust_Rust_add(jre: *mut JNIEnv,
  class: *const c_void, a: c_int, b: c_int) -> c_int {
  a+b
}

The JNA way

#[no_mangle]
pub extern fn add(a: i32, b: i32) -> i32 {
  a + b
}

The Java side, JNI

package com.inrustwetrust;

public class Rust {
    static {
        System.loadLibrary("index");
    }
    public static native int add(int v1, int v2);
}

The Java side, JNA

package com.inrustwetrust;

import com.sun.jna.Library;
import com.sun.jna.Native;
import com.sun.jna.NativeLibrary;
import com.sun.jna.Pointer;

public interface Rust extends Library {
    String JNA_LIBRARY_NAME = "index";
    NativeLibrary JNA_NATIVE_LIB = NativeLibrary.getInstance(JNA_LIBRARY_NAME);
    Rust INSTANCE = (Rust) Native.loadLibrary(JNA_LIBRARY_NAME, Rust.class);

    int add(int a, int b);
}

Build system

the inverted index in Rust

pub struct Index {
  pub index: HashMap<String, HashSet<i32>>,
}
impl Index {
  pub fn insert(&mut self, id: i32, data: &str);
  pub fn search_word(&self, word: &str) -> Option<&HashSet<i32>>;
  pub fn search(&self, text: &str) -> HashSet<i32>;
}

Insert

lazy_static! {
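  // `[[:punct:]]` is the ASCII punctuation class; a bare `[:punct:]` would
  // be read as a set of the literal characters ':', 'p', 'u', 'n', 'c', 't'.
  static ref RE: Regex = Regex::new(r"[[:punct:]]").unwrap();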
}

impl Index {
  pub fn insert(&mut self, id: i32, data: &str) {
    for word in data.split_whitespace() {
      let w = RE.replace_all(word, "").to_lowercase();

      // Fetch the set of ids for this word, creating it on first sight.
      self.index.entry(w).or_insert_with(HashSet::new).insert(id);
    }
  }
}

Let's make a C interface

#[no_mangle]
pub extern fn index_create() -> Box<Index> {
  Box::new(Index::new())
}
#[no_mangle]
pub extern fn index_free(_: Box<Index>) {
}

#[no_mangle]
pub extern fn index_insert(index: &mut Index, id: i32, raw_text: *const c_char) {
  let slice = unsafe { CStr::from_ptr(raw_text).to_bytes() };
  if let Ok(text) = str::from_utf8(slice) {
    index.insert(id, text);
  }
}

#[no_mangle]
pub extern fn index_count(index: &Index) -> i32 {
  index.index.keys().count() as i32
}
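
The functions above pass `Box<Index>` and references straight through the C boundary, which trusts the caller never to send a null or stale pointer. A more defensive variant could use raw pointers with explicit null checks. This is only a sketch under assumptions: the `*_raw` function names are hypothetical, and the `Index` here is a cut-down stand-in (no punctuation stripping) so the example compiles on its own. To export these symbols, each function would also carry `#[no_mangle]` as in the code above.

```rust
use std::collections::{HashMap, HashSet};
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

// Cut-down stand-in for the talk's `Index` (assumption: only lowercasing,
// no punctuation stripping), so that this sketch is self-contained.
pub struct Index {
    pub index: HashMap<String, HashSet<i32>>,
}

impl Index {
    pub fn new() -> Index {
        Index { index: HashMap::new() }
    }

    pub fn insert(&mut self, id: i32, data: &str) {
        for word in data.split_whitespace() {
            self.index
                .entry(word.to_lowercase())
                .or_insert_with(HashSet::new)
                .insert(id);
        }
    }
}

// Hand ownership of the index to the caller as a raw pointer.
pub extern "C" fn index_create_raw() -> *mut Index {
    Box::into_raw(Box::new(Index::new()))
}

// Take ownership back and drop the index; a null pointer is ignored.
pub extern "C" fn index_free_raw(index: *mut Index) {
    if !index.is_null() {
        unsafe { drop(Box::from_raw(index)) };
    }
}

// Insert text into the index; null pointers and invalid UTF-8 are ignored.
pub extern "C" fn index_insert_raw(index: *mut Index, id: i32, raw_text: *const c_char) {
    if index.is_null() || raw_text.is_null() {
        return;
    }
    let index = unsafe { &mut *index };
    let c_str = unsafe { CStr::from_ptr(raw_text) };
    if let Ok(text) = c_str.to_str() {
        index.insert(id, text);
    }
}

// Number of distinct words; a null index counts as empty.
pub extern "C" fn index_count_raw(index: *const Index) -> i32 {
    if index.is_null() {
        return 0;
    }
    unsafe { (*index).index.len() as i32 }
}

fn main() {
    let idx = index_create_raw();
    let text = CString::new("Hello hello world").unwrap();
    index_insert_raw(idx, 0, text.as_ptr());
    println!("{} distinct words", index_count_raw(idx));
    index_free_raw(idx);
}
```

The caller still has to free the index exactly once and never use it afterwards; the null checks only cover the most common mistake.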

Java side

public interface Rust extends Library {
    String JNA_LIBRARY_NAME = "index";
    NativeLibrary JNA_NATIVE_LIB = NativeLibrary.getInstance(JNA_LIBRARY_NAME);
    Rust INSTANCE = (Rust) Native.loadLibrary(JNA_LIBRARY_NAME, Rust.class);

    int     add(int a, int b);
    Pointer index_create();
    void    index_free(Pointer index);
    void    index_insert(Pointer index, int id, String text);
    int     index_count(Pointer index);
}

Testing

index = new Index();
for(int i = 0; i < talks.size(); i++) {
  index.insert(i, talks.get(i).title);
  index.insert(i, talks.get(i).summary);
}

rustIndex = Rust.INSTANCE.index_create();
for(int i = 0; i < talks.size(); i++) {
  Rust.INSTANCE.index_insert(rustIndex, i, talks.get(i).title);
  Rust.INSTANCE.index_insert(rustIndex, i, talks.get(i).summary);
}

assert Rust.INSTANCE.index_count(rustIndex) == index.getIndex().size();

Needs more tests

Search in Rust

impl Index {
  pub fn search_word(&self, word: &str) -> Option<&HashSet<i32>> {
    self.index.get(word)
  }
}

Search string in Rust

impl Index {
  pub fn search(&self, text: &str) -> HashSet<i32> {
    let mut split = text.split_whitespace();
    let first = split.next().map(|word| {
      let w = RE.replace_all(&word, "").to_lowercase();
      self.search_word(&w)
    });
    let res: HashSet<i32> = if let Some(Some(h)) = first {
      h.clone()
    } else {
      HashSet::new()
    };

    split.fold(res, |set, ref word| {
      let w = RE.replace_all(&word, "").to_lowercase();
      self.index.get(&w).map(|h| h.intersection(&set).cloned().collect())
        .unwrap_or(HashSet::new())
    })
  }
}
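
To check the pieces against the three example sentences from the "what is an inverted index?" slides, here is a self-contained sketch that uses only the standard library. It is not the talk's code: `normalize`, `set` and `sample_index` are hypothetical helpers, and punctuation is stripped with `char::is_ascii_punctuation` instead of the regex, so behavior may differ on non-ASCII punctuation.

```rust
use std::collections::{HashMap, HashSet};

// Lowercase a word and drop ASCII punctuation (stand-in for the regex).
fn normalize(word: &str) -> String {
    word.chars()
        .filter(|c| !c.is_ascii_punctuation())
        .collect::<String>()
        .to_lowercase()
}

struct Index {
    index: HashMap<String, HashSet<i32>>,
}

impl Index {
    fn new() -> Index {
        Index { index: HashMap::new() }
    }

    fn insert(&mut self, id: i32, data: &str) {
        for word in data.split_whitespace() {
            self.index
                .entry(normalize(word))
                .or_insert_with(HashSet::new)
                .insert(id);
        }
    }

    // Intersect the id sets of every word in the query.
    fn search(&self, text: &str) -> HashSet<i32> {
        let mut words = text.split_whitespace().map(normalize);
        let first = match words.next() {
            Some(w) => self.index.get(&w).cloned().unwrap_or_default(),
            None => return HashSet::new(),
        };
        words.fold(first, |set, w| match self.index.get(&w) {
            Some(h) => set.intersection(h).cloned().collect(),
            None => HashSet::new(),
        })
    }
}

// Build a set of ids from a slice, for comparisons.
fn set(ids: &[i32]) -> HashSet<i32> {
    ids.iter().cloned().collect()
}

// Index the three sentences used in the slides.
fn sample_index() -> Index {
    let mut index = Index::new();
    let docs = ["Hello, how are you?", "Fine, and you?", "Great!"];
    for (id, doc) in docs.iter().enumerate() {
        index.insert(id as i32, doc);
    }
    index
}

fn main() {
    let index = sample_index();
    assert_eq!(index.search("hello"), set(&[0]));
    assert_eq!(index.search("you"), set(&[0, 1]));
    assert_eq!(index.search("hello you"), set(&[0]));
    println!("all searches match the slide examples");
}
```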

How to pass an integer set as result?

Let's make a new type

pub struct SearchResult {
  data: Vec<i32>
}
#[no_mangle]
pub extern fn search_result_count(search: &SearchResult) -> i32 {
  search.data.len() as i32
}

#[no_mangle]
pub extern fn search_result_get(search: &SearchResult, index: i32) -> i32 {
  *search.data.get(index as usize).unwrap()
}

#[no_mangle]
pub extern fn search_result_free(_: Box<SearchResult>) {
}
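
`search_result_get` calls `unwrap()`, so an out-of-range index from the Java side would panic inside a function called through FFI. One possible alternative is to return a sentinel. This is a sketch under assumptions: the name `search_result_get_checked` is hypothetical, and `-1` is only safe as a sentinel because the ids in this demo are non-negative loop indices.

```rust
pub struct SearchResult {
    data: Vec<i32>,
}

// Hypothetical variant of `search_result_get`: returns -1 instead of
// panicking when `index` is negative or past the end of the results.
pub extern "C" fn search_result_get_checked(search: &SearchResult, index: i32) -> i32 {
    if index < 0 {
        return -1;
    }
    search.data.get(index as usize).copied().unwrap_or(-1)
}

fn main() {
    let result = SearchResult { data: vec![7, 9] };
    println!("{}", search_result_get_checked(&result, 1));
    println!("{}", search_result_get_checked(&result, 5));
}
```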

Using SearchResult

#[no_mangle]
pub extern fn index_search(index: &Index, raw_text: *const c_char)
  -> Box<SearchResult> {
  let slice = unsafe { CStr::from_ptr(raw_text).to_bytes() };
  let text = str::from_utf8(slice).unwrap_or("");
  let h = index.search(text);
  let v: Vec<i32> = h.iter().cloned().collect();

  Box::new(SearchResult {
    data: v
  })
}

Java side

Pointer index_search(Pointer index, String text);
int     search_result_count(Pointer result);
void    search_result_free(Pointer result);
int     search_result_get(Pointer result, int index);

Searching

Log.d("search", "searching for \""+query+"\"");
final Set<Integer> res = index.searchString(query);

Pointer searchResult = Rust.INSTANCE.index_search(rustIndex, query);

HashSet<Integer> rustResult = new HashSet<>();

Integer rustResultNumber = Rust.INSTANCE.search_result_count(searchResult);
for(int i=0; i < rustResultNumber; i++) {
  Integer j = Integer.valueOf(Rust.INSTANCE.search_result_get(searchResult, i));
  rustResult.add(j);
}
Rust.INSTANCE.search_result_free(searchResult);

Log.d("search", "java found "+res.size()+" talks");
Log.d("search", "rust found "+rustResultNumber+" talks");

Testing

11-08 14:16:59.909 9782-10663/com.inrustwetrust.search D/search: searching for "java"
11-08 14:16:59.924 9782-10663/com.inrustwetrust.search D/search: java found 83 talks
11-08 14:16:59.924 9782-10663/com.inrustwetrust.search D/search: rust found 85 talks

Wait, what?

Let's see

11-09 16:14:01.926 26223-29280/com.inrustwetrust.search D/search: index["java"] ==
  [165, 223, 36, 125, 23, 172, 111, 82, 42, 52, 17, 127, 213, 7, 195, 144, 177, 2,
   30, 122, 120, 196, 3, 202, 57, 61, 222, 105, 169, 86, 163, 32, 219, 38, 9, 40, 129,
   96, 31, 184, 137, 210, 114, 186, 45, 33, 166, 28, 88, 218, 168, 50, 154, 117, 205,
   56, 98, 87, 26, 116, 112, 126, 143, 8, 121, 78, 194, 175, 181, 18, 133, 0, 99, 141,
   106, 89, 39, 158, 100, 211, 209, 198, 178]
11-09 16:14:01.956 26223-29280/com.inrustwetrust.search D/search: index["java
                                                                  
                                                                        lets"] == [75]
11-09 16:14:02.056 26223-29280/com.inrustwetrust.search D/search: index["java
                                                                  
                                                        in"] == [52]
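
The keys logged above contain line breaks, which points at tokenization. One plausible reading (the slides do not spell it out, so treat it as an assumption) is that the Java code splits on a single space with `split(" ")`, while the Rust code uses `split_whitespace()`, which also breaks on newlines. The sketch below shows the same difference within Rust; the sample string and helper names are made up for illustration.

```rust
// Split on the space character only, similar to Java's `split(" ")`
// for this input.
fn split_on_space(s: &str) -> Vec<&str> {
    s.split(' ').collect()
}

// Split on any run of whitespace, including newlines.
fn split_on_whitespace(s: &str) -> Vec<&str> {
    s.split_whitespace().collect()
}

fn main() {
    // Made-up summary fragment with a paragraph break after "java".
    let summary = "java\n\nlets go";
    println!("{:?}", split_on_space(summary));
    println!("{:?}", split_on_whitespace(summary));
}
```

With the space-only split, "java" stays glued to the next word and never lands in the "java" entry, which would explain why the Java index finds fewer talks.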

we have a working implementation, running in Java!

Benchmarks

224 talks, indexing title and summary

        Indexing   Searching ("java")   Searching ("java test")
java    ~ 10s      ~ 600 μs             ~ 1200 μs
rust    ~ 1s       ~ 300 μs             ~ 400 μs

You can do some Rust too!

More info

if you want some stickers...

Thanks!