DIMACS

Lance Tan
Yale University
Email: lance (dot) tan (at) yale (dot) edu

Project: Accelerate your network interface card (NIC) to process data blazingly fast
Mentor: Srinivas Narayana


Abstract

The network interface card (NIC) is the "glue" between a computer (say, your laptop or a datacenter server) and the rest of the Internet. The NIC sits in the path of the data moving in and out of your machine, which allows the NIC to compute over the data as it transits. For example, a NIC might automatically compress data going in and out of your machine to save bandwidth and reduce download time. NICs have gotten increasingly flexible in the computations they support over the years. In this project, I am exploring (i) ways to get programs of interest up and running on real high-speed NICs (running at 100 Gigabit/s), and (ii) methods to optimize NIC program speed. This work will help further understanding of the capabilities and limitations of hardware acceleration using NICs.


Log

Week 1: 5/25 - 5/31

This summer, I'll be working on formally specifying the instruction set and architecture for the Netronome NFP-6000 network interface controller, as part of the superoptimizing compiler that the research group is developing. This week, I've been reading the Netronome programmer's manual and other documents to learn the architecture and instruction set, as a first step towards writing and offloading programs of interest onto the NIC.

Week 2: 6/1 - 6/7

Netronome NICs can be programmed in a variant of C called "Micro-C", and I've learned the workflow for how to write, compile, and load Micro-C programs onto the NIC. There is an intermediate step in this process that generates ".list" files, which appear to contain the program compiled into Netronome microcode, and other directives; we've decided to focus on optimizing these files using the superoptimizing compiler. However, the tool that actually loads programs onto the NIC isn't working, and I'm attempting to debug why.

Week 3: 6/8 - 6/14

I'm still having trouble loading programs onto the NIC--it seems that the Netronome software toolchain that I need to install won't compile on the host computer because of an incompatibility between the toolchain and the host kernel. In the meantime, I'm studying the Netronome ISA itself by compiling some example Micro-C programs into .list files and comparing the C code to the microcode instructions it corresponds to. The microcode instruction set seems highly stateful--for example, certain ALU operations implicitly use the output of the previous ALU operation as an input argument.
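To make the idea of implicit state concrete, here is a minimal Python sketch of a toy machine in which one operation reads the result of the previous ALU operation instead of taking it as an explicit operand. The `ToyState` class, the `alu_add` and `shift_left_by_prev` names, and their semantics are hypothetical illustrations, not actual NFP-6000 instructions.

```python
from dataclasses import dataclass, field

MASK32 = 0xFFFFFFFF  # assumption: 32-bit registers in this toy model


@dataclass
class ToyState:
    regs: dict = field(default_factory=dict)  # named general-purpose registers
    prev_alu: int = 0                         # hidden state: last ALU result


def alu_add(state, dst, a, b):
    """Add two registers; also record the result as hidden state."""
    result = (state.regs[a] + state.regs[b]) & MASK32
    state.regs[dst] = result
    state.prev_alu = result


def shift_left_by_prev(state, dst, src):
    """Shift src left; the amount is NOT an operand, it is the previous ALU result."""
    amount = state.prev_alu & 0x1F  # use the low 5 bits as the shift amount
    result = (state.regs[src] << amount) & MASK32
    state.regs[dst] = result
    state.prev_alu = result


state = ToyState(regs={"a": 1, "b": 2, "x": 5})
alu_add(state, "t", "a", "b")        # t = 3, and prev_alu becomes 3
shift_left_by_prev(state, "y", "x")  # y = 5 << 3 = 40
```

The same `shift_left_by_prev` call gives a different result if a different ALU operation precedes it, which is why such instructions cannot be modeled in isolation.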

Week 4: 6/15 - 6/21

I'm continuing to study the Netronome ISA through examples. I'm discovering more and more cases where microcode instructions interact with program state in complicated or unintuitive ways--for example, the behavior of certain bit shift operations depends on the value of a special register that must be set in advance. It's looking like the complexity and high statefulness of the instruction set mean that it will be challenging to build a compiler that models it. Next week, I'm going to begin contributing to the group's superoptimizing compiler framework, starting with the ability to interpret/formalize a small subset of ALU operations.

Week 5: 6/22 - 6/28

I have started building the Netronome ISA interpreter/verifier--it can now evaluate the NOP instruction! (As in, the instruction that does nothing and has no effects. But I guess you have to start somewhere :) Next, I'm going to implement the instruction for writing constant values to registers, followed by ALU (arithmetic) operations from least to most complicated.
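As a rough illustration of this starting point, here is a minimal Python sketch of an interpreter that handles only a no-op and a constant load. The tuple-based instruction encoding, the opcode names `nop` and `immed`, and the `run` function are placeholders chosen for this sketch; they do not show the group's actual framework or the real microcode syntax.

```python
MASK32 = 0xFFFFFFFF  # assumption: 32-bit registers


def run(program, regs=None):
    """Interpret a list of (opcode, *operands) tuples and return the registers."""
    regs = dict(regs or {})  # copy so the caller's state is not mutated
    for instr in program:
        opcode, *operands = instr
        if opcode == "nop":
            pass  # no effect on program state
        elif opcode == "immed":
            dst, value = operands
            regs[dst] = value & MASK32  # write a constant to a register
        else:
            raise ValueError(f"unsupported opcode: {opcode}")
    return regs


final = run([("nop",), ("immed", "r0", 7), ("nop",)])
```

Each new instruction then becomes one more branch that maps an input state to an output state, which is the shape the later ALU operations would follow.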

Week 6: 6/29 - 7/5

I've completed about a dozen more ALU operations (such as addition, subtraction, boolean operations). I'm currently working on interpreting and formalizing the remaining two ALU operations, which involve either adding or subtracting the carry-out bit of the previous ALU operation. To do this, I need to add a carry-out bit to the interpreter's model of program state, and learn how to express the corresponding variable in Z3.
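The sketch below shows, in plain Python, one way a carry-out bit could be added to an interpreter's state and consumed by an add-with-carry operation. The `State` class and the `add` and `add_carry` functions are hypothetical names for illustration, and the semantics are the conventional ones for add-with-carry, which may differ in detail from the NFP-6000.

```python
from dataclasses import dataclass, field

MASK32 = 0xFFFFFFFF  # assumption: 32-bit registers holding unsigned values


@dataclass
class State:
    regs: dict = field(default_factory=dict)
    carry: int = 0  # carry-out of the most recent ALU operation (0 or 1)


def add(state, dst, a, b):
    """dst = a + b, recording the carry-out."""
    total = state.regs[a] + state.regs[b]
    state.regs[dst] = total & MASK32
    state.carry = total >> 32  # 1 if the 32-bit sum overflowed, else 0


def add_carry(state, dst, a, b):
    """dst = a + b + previous carry-out, recording the new carry-out."""
    total = state.regs[a] + state.regs[b] + state.carry
    state.regs[dst] = total & MASK32
    state.carry = total >> 32


# 64-bit addition from two 32-bit halves: 0x1_FFFFFFFF + 0x0_00000001
s = State(regs={"a_lo": 0xFFFFFFFF, "a_hi": 1, "b_lo": 1, "b_hi": 0})
add(s, "lo", "a_lo", "b_lo")        # low word wraps to 0, carry = 1
add_carry(s, "hi", "a_hi", "b_hi")  # high word = 1 + 0 + 1 = 2
```

Multi-word arithmetic like this is the usual reason an ISA exposes a carry bit, and it is also why the carry has to be part of the modeled program state.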

Week 7: 7/6 - 7/12

I implemented the "+carry" operation. Computing the carry bit was slightly more difficult because it required more complex Z3 expressions, which, unlike those for the other ALU operations, aren't really similar to the corresponding C operations used for interpretation. I also learned that Netronome ISA arithmetic is unsigned, and re-implemented the other ALU operations to simulate this more accurately.
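One reason the carry bit is awkward is that it is not part of the fixed-width result, so it has to be recovered either by widening the addition by one bit or by comparing the wrapped sum against an operand. The plain-Python sketch below shows both formulations for unsigned values and checks that they agree; it uses an 8-bit width so the check can be exhaustive. These are common ways to express a carry and are offered as an assumption, not necessarily the formulation used in the project, and no Z3 code is shown.

```python
WIDTH = 8                  # small width so the check below can be exhaustive
MASK = (1 << WIDTH) - 1


def carry_by_widening(a, b, cin):
    """Compute the sum with one extra bit and read off the top bit."""
    return ((a + b + cin) >> WIDTH) & 1


def carry_by_comparison(a, b, cin):
    """Stay within WIDTH bits: after unsigned wraparound the sum is too small."""
    s = (a + b + cin) & MASK
    return int(s < a or (cin == 1 and s == a))


# Compare the two formulations on every unsigned input combination.
mismatches = [
    (a, b, cin)
    for a in range(MASK + 1)
    for b in range(MASK + 1)
    for cin in (0, 1)
    if carry_by_widening(a, b, cin) != carry_by_comparison(a, b, cin)
]
```

Here `mismatches` comes out empty, so the two definitions coincide for unsigned operands. Both rely on treating the operands as unsigned, which is consistent with the observation that the ISA's arithmetic is unsigned.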

Week 8: 7/13 - 7/19

I've integrated my formalizations into the rest of the optimizer/compiler framework, so it can validate whole programs and formally compare whether two Netronome programs are equivalent to each other or not. This is pretty exciting! I've also put together the machinery for synthesizing new programs--an important part of the superoptimizing compiler--but this part doesn't quite work yet.
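To illustrate what comparing two programs for equivalence means, here is a small Python sketch that searches for an input on which two programs disagree. It substitutes brute-force enumeration over 8-bit inputs for the solver-based check described above, so it is only a stand-in for the idea; the example programs and the `find_counterexample` function are hypothetical.

```python
from itertools import product

WIDTH = 8                  # assumption: tiny width so enumeration is feasible
MASK = (1 << WIDTH) - 1


def find_counterexample(prog_a, prog_b, num_inputs):
    """Return an input tuple on which the programs differ, or None if none exists."""
    for inputs in product(range(MASK + 1), repeat=num_inputs):
        if prog_a(*inputs) != prog_b(*inputs):
            return inputs
    return None


def double_by_add(x):
    return (x + x) & MASK


def double_by_shift(x):
    return (x << 1) & MASK


def increment(x):
    return (x + 1) & MASK


same = find_counterexample(double_by_add, double_by_shift, 1)  # no differing input
diff = find_counterexample(double_by_add, increment, 1)        # differs at x = 0
```

Enumeration stops being practical at realistic register widths and input counts, which is where a symbolic check over the formalized instruction semantics becomes necessary.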

Week 9: 7/20 - 7/24

This is (officially) the last week of the program, so I'm preparing a presentation and a written report of what I've completed. Overall, I'm quite satisfied with what I was able to accomplish this summer, and how much I got to learn. I'm excited to see this project continue to be built out and applied to advance the state of NIC programming in the future.


Other Things