An IRC Server in Rust, part 1

June 8, 2015

rust
irc
code
networking

Rust is pretty cool. I don't hate writing it, like I would hate writing C++. I still get the performance benefits of a low-level language. Plus it gives me the chance to work in a modern strongly typed language, without having to wrap my head around Haskell.

Of course the best way to learn a new language is to go off and write something in it [citation needed]. A project I've been wanting to work on lately involves a custom-made IRC server (I doubt any of the existing IRCds could do what I want out of the box). There will be another blog post with all the details about this, but for now I'd like to talk about my experience with building an IRC server in Rust.

How does IRC even work?

Before I can implement IRC in anything though, I need to know how it works. In theory, RFC 2812 should specify the line-level protocol, and should be enough to write a full client or server. However, for a few reasons, it isn't enough to just read the RFC.

The RFC was written a long time ago, and isn't exactly a modern presentation of a protocol. It focuses on some things that I wouldn't expect.
I'm lazy, and I'm just skimming the RFC and reading the parts that seem relevant. This makes it hard to get an over-arching picture of the protocol.
No one actually implements the spec. For example, irssi doesn't send the right parameters for the USER command (according to this RFC). The RFC defines the parameters as the system username, the mode of the user (as a number), an unused parameter, and the user's real name. Irssi sends USER mythmon mythmon localhost :Unknown. You may notice that "mythmon" is not a valid mode, which should be a number like 0 or 8. Yay, irssi.
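For comparison, here is a minimal sketch of what a USER line looks like when it follows the parameter order the RFC describes. The function name user_command is made up for this illustration, and the "*" placeholder for the unused parameter is borrowed from the RFC's own example rather than from anything irssi sends.

```rust
/// Build a USER line in the shape RFC 2812 describes:
/// USER <user> <mode> <unused> :<realname>
/// (hypothetical helper, for illustration only)
pub fn user_command(username: &str, mode: u8, realname: &str) -> String {
    // "*" fills the unused third parameter, as in the RFC's own example.
    // The real name goes last, after a colon, so it may contain spaces.
    format!("USER {} {} * :{}", username, mode, realname)
}
```

Calling user_command("guest", 0, "Ronnie Reagan") reproduces the example line from the RFC, with the mode where irssi puts a second copy of the username.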

So just reading the RFC isn't going to work. I need better ways. I could do something like bust out Wireshark and analyze bytes on the wire and blah blah blah. There is an easier way though. IRC is a really simple protocol, and I can just type it out by hand, with a little discipline.

Enter nc. nc is netcat, and is approximately the simplest possible networking tool there is. We'll be using 2 of its modes, listening mode and client mode.

There are two major kinds of netcat, and they aren't really compatible. There is one that is a part of the GNU utils, and one that is part of the BSD utils. For most utilities, I prefer GNU utils, but in this case I think the BSD version is more useful. In the examples below, I'm using the version of nc from the Arch package openbsd-netcat.

First, let's see what a client does when it first connects to a server. To do that, I'll use nc -l 6667 to act as a server, and connect irssi to it. Here is what irssi said:

NICK mythmon
USER mythmon mythmon localhost :Unknown

I like Weechat more than irssi normally, but irssi is a bit simpler to just connect to random servers, so I'm using it here.

Hmm. I'm not sure what to do with that. Let's see what a real IRC server says when I say those things to it. I'll use nc chat.freenode.net 6667 as a client to do this. After firing it up I'll type what irssi sent to me. The lines that start with > are what I type, and the lines that start with < are what Freenode sends back.

I've cut off a lot of output here. Freenode is actually quite noisy, and sent back about 50 lines, which aren't interesting to the point I'm making here.

> NICK mythmon
> USER mythmon mythmon localhost :Unknown
< :wolfe.freenode.net 001 mythmon :Welcome to the freenode Internet Relay Chat Network mythmon
<snip>
< :wolfe.freenode.net 375 mythmon :- wolfe.freenode.net Message of the Day -
< :wolfe.freenode.net 372 mythmon :- Welcome to wolfe.freenode.net in Stockholm, SE.
< :wolfe.freenode.net 372 mythmon :- Thanks to http://www.portlane.com/ for sponsoring
<snip>
< :wolfe.freenode.net 376 mythmon :End of /MOTD command.
< :mythmon MODE mythmon :+i

A combination of reading this and reading the RFC teaches me that there are about 4 parts of an IRC message:

The prefix is the first space-separated word, if it starts with a colon. It represents the source of the message, and is optional.
The command is the first word after the prefix (or the first word if there is no prefix). Some commands are easy, like NICK (which changes a user's nick). Others are just numbers, so I'll have to consult the RFC for those. Though reading through the output, I bet 375 is "start of MOTD", 372 is "MOTD continues", and 376 is "end of MOTD".
The parameters are every word between the command and the trail (or the end of the message). The number of these varies based on the command.
The trail is a parameter that starts with a colon and continues to the end of the line. If it exists, it is the last parameter.
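The four parts above can be sketched as a small Rust function. This is not the parser from the project, just a minimal illustration that assumes well-formed input; the names Message and parse_message are invented for the example, and it needs a reasonably recent Rust (it uses str::split_once).

```rust
/// One parsed IRC line. Field names follow the four parts described above.
/// (hypothetical type, for illustration only)
#[derive(Debug, PartialEq)]
pub struct Message {
    pub prefix: Option<String>,
    pub command: String,
    pub params: Vec<String>,
    pub trail: Option<String>,
}

/// Split a single line into prefix, command, parameters, and trail.
/// Returns None if the line has no command.
pub fn parse_message(line: &str) -> Option<Message> {
    // Drop the line terminator, if the caller left it on.
    let mut rest = line.trim_end_matches(|c: char| c == '\r' || c == '\n');

    // Optional prefix: the first word, only if it starts with a colon.
    let mut prefix = None;
    if let Some(stripped) = rest.strip_prefix(':') {
        let (p, r) = stripped.split_once(' ')?;
        prefix = Some(p.to_string());
        rest = r;
    }

    // Optional trail: everything after the first " :" up to the end.
    let mut trail = None;
    if let Some((head, t)) = rest.split_once(" :") {
        trail = Some(t.to_string());
        rest = head;
    }

    // What is left is the command followed by space-separated parameters.
    let mut words = rest.split_whitespace();
    let command = words.next()?.to_string();
    let params = words.map(str::to_string).collect();

    Some(Message { prefix, command, params, trail })
}
```

Fed the 376 line from the Freenode output, this gives a prefix of wolfe.freenode.net, a command of 376, one parameter (mythmon), and the trail "End of /MOTD command.".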

I suspect the important part of this output was the 001 command, which I'm going to guess means something like "start of connection". So I'll start up a server again, connect irssi, and send a welcome message. This is nc -l 6667 again. Then I connected irssi, typed /j #test, sent a message in that channel, switched back to netcat and sent a response from an imaginary user, and finally typed /disconnect in irssi.

< NICK mythmon
< USER mythmon mythmon localhost :Unknown
> :localhost 001 mythmon :Welcome!
< MODE mythmon +i
< WHOIS mythmon
< PING localhost
> PONG :localhost
< JOIN #test
< PRIVMSG #test :hello?
> :localhost PRIVMSG #test :it works!
< QUIT :leaving

Cool. Now I can speak IRC. On a lark I loaded up nc in client mode again. I connected to Freenode, joined a channel, responded to ping, and sent a message. I can totally IRC from netcat. Wooh!
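The server half of that hand-typed session boils down to two replies: a 001 welcome once the client has sent USER, and a PONG for every PING. Here is a rough sketch of that logic, assuming the server name and the client's nick are already known. The function reply_to is hypothetical, it ignores every other command, and the caller would still need to append the CR LF line ending before writing to the socket.

```rust
/// Decide what, if anything, to send back for one line from the client.
/// Mirrors only the two replies typed by hand in the session above.
/// (hypothetical helper, for illustration only)
pub fn reply_to(line: &str, server: &str, nick: &str) -> Option<String> {
    let mut words = line.split_whitespace();
    match words.next()? {
        // After USER, welcome the client with numeric 001.
        "USER" => Some(format!(":{} 001 {} :Welcome!", server, nick)),
        // Echo the PING token back; fall back to our own name if it is missing.
        "PING" => {
            let token = words.next().unwrap_or(server).trim_start_matches(':');
            Some(format!("PONG :{}", token))
        }
        // Everything else (NICK, MODE, WHOIS, JOIN, ...) is ignored here.
        _ => None,
    }
}
```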

Now In Rust

I fumbled around for a while trying to use mio, because async IO is good, right? Well mio isn't really ready (the docs leave a lot to be desired) and is much lower level than I want to write for this project. So for now I'll stick with the easy blocking IO model.

At first I had a single threaded program that could listen on a port, accept a connection, and talk to one client at a time. That won't do: for an IRC server I'll need lots of incoming connections. I looked around for something like select to pull data from multiple TcpStreams at once. No luck. So that means one thread per client. Sad, but it will work.

Luckily, Rust makes threading really really easy, and surprisingly safe. Seriously, this is the best parallelism I've ever seen from a standard library. Nice job Rust.

The relevant bit of code looked something like this at one point:

let listener = TcpListener::bind("127.0.0.1:4567").unwrap_or_else(|e| { panic!(e) });

for stream in listener.incoming() {
    let stream = stream.unwrap();
    thread::spawn(move || {
        handle_client(stream).unwrap_or_else(|e| { panic!(e) });
    });
}

That handily spawns a new thread for every connection, passing the connection object between the threads in a thread safe way.

The .unwrap_or_else(|e| { panic!(e) }) bits are the best way I've found of dealing with errors that should terminate. Just calling .unwrap() will close the program, but leaves a lot to be desired in terms of debugging. It doesn't really tell you what happened, just that someone called panic inside unwrap of something. .unwrap_or_else(|e| { panic!(e) }) still panics, tearing down the thread, but it also puts the error in the right function so I can see what is failing, and sometimes e is even something useful.
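On current Rust editions, panic! expects a format string as its first argument, so the same pattern is usually spelled with one. A small sketch of that variant follows; the helper name bind_or_panic and the wording of the message are made up for the example.

```rust
use std::net::TcpListener;

/// Bind a listener, or panic with a message that says what failed and why.
/// (hypothetical helper, for illustration only)
pub fn bind_or_panic(addr: &str) -> TcpListener {
    TcpListener::bind(addr)
        // The closure receives the io::Error, so it can go into the message.
        .unwrap_or_else(|e| panic!("could not bind {}: {}", addr, e))
}
```

The panic still tears down the calling thread, but the message now names the address and carries the underlying error.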

Now handle_client can spin around with something like this, collecting lines from the client and doing things with them:

fn handle_client(mut stream: TcpStream) -> Result<()> {
    let reader = BufReader::new(try!(stream.try_clone()));
    for line in reader.lines() {
        let line = line.unwrap_or_else(|e| { panic!(e) });
        // do something with the line
    }
    Ok(())
}

Awesome. That wasn't too bad. BufReader is a nice wrapper around things that implement Read. It buffers things, providing the ability to do things like read until the next line, or get an iterator over the lines. Great.

I really like Rust's traits system. Being able to use the same tool (BufReader) on anything that implements Read (files, network streams, more things I haven't found yet) is really neat.
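To make that concrete, here is a minimal sketch of a function that is generic over Read. The name collect_lines is invented for the example. It would accept a TcpStream or a File, and the same code also works on a plain byte slice, which is handy for trying things out without a network.

```rust
use std::io::{self, BufRead, BufReader, Read};

/// Read every line from anything that implements Read.
/// (hypothetical helper, for illustration only)
pub fn collect_lines<R: Read>(source: R) -> io::Result<Vec<String>> {
    // lines() strips the trailing "\n" or "\r\n" from each line and
    // yields io::Result<String>; collect() stops at the first error.
    BufReader::new(source).lines().collect()
}
```

For instance, collect_lines("NICK mythmon\r\n".as_bytes()) runs the exact same code path a socket would, since &[u8] implements Read too.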

After this I wrote some IRC parsing stuff. I tried to use rusty-peg, a crate that lets me make a real parser from a PEG grammar, but it didn't work very well. Actually at all. I couldn't get it to work. So I wrote a not-very-good not-really-a-parser that just splits on spaces and does some other stuff. It works on valid data. I haven't tested invalid data yet.

I just noticed that rusty-peg just released a version 0.4. I'll have to go try it out again, if anything interesting changed.

Then I discovered that I needed even more threads. About twice as many. I'll get back to that in another blog post. The spoiler version is that there is a select-like thing available, but only over channels. And TcpStreams aren't channels. So I had to make a TcpStream-to-channel converter, and that had to run in a separate thread. Sigh.
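The general shape of such a converter might look like the sketch below. This is not the code from the project, just one way to do it with std::sync::mpsc; the name lines_to_channel is hypothetical, and it is written against Read rather than TcpStream so it can be tried without a socket.

```rust
use std::io::{BufRead, BufReader, Read};
use std::sync::mpsc::{channel, Receiver};
use std::thread;

/// Spawn a thread that reads lines from `source` and forwards each one
/// over a channel. The receiver disconnects when the source hits EOF or
/// an error. (hypothetical helper, for illustration only)
pub fn lines_to_channel<R>(source: R) -> Receiver<String>
where
    R: Read + Send + 'static,
{
    let (tx, rx) = channel();
    thread::spawn(move || {
        for line in BufReader::new(source).lines() {
            match line {
                // Stop if nobody is listening on the other end anymore.
                Ok(text) => {
                    if tx.send(text).is_err() {
                        break;
                    }
                }
                // Treat any read error as the end of the connection.
                Err(_) => break,
            }
        }
        // tx is dropped here, which closes the channel.
    });
    rx
}
```

This is where the extra thread per client comes from: one thread blocks on the socket, and whatever handles the client only ever has to wait on channels.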

Wrap Up

I learned that networking in Rust is an immature field, and that I'll have to do a lot more work than I would in something like Python or Node. Still, it isn't as bad as it would be in C or C++, and I'm getting better at Rust.

In the future, I'll write some blog posts about easily converting objects to and from strings, parsing IRC messages, and communicating between threads with channels.