Sunday, January 13, 2013

C++11 Lambdas: how many statements can I squeeze in one line of code?

I've been learning a little C++11 by porting some old Boost ASIO programs to it. Lambdas are one of the new features - they have an interesting syntax that lets you control in great detail how variables are captured from the enclosing scope. (How did C++ get lambdas and closures before Java, by the way?)
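
For example, here's a quick sketch of the main capture-list variants (x and y are just hypothetical locals, not from any real program):

    #include <iostream>

    int main()
    {
      int x = 1, y = 2;

      auto byValue = [=]     { return x + y; };  // capture everything used, by copy
      auto byRef   = [&]     { return x + y; };  // capture everything used, by reference
      auto mixed   = [x, &y] { return x + y; };  // x by copy, y by reference
      auto nothing = []      { return 42; };     // capture nothing at all

      y = 10;
      std::cout << byValue() << " " << byRef() << " "
                << mixed() << " " << nothing() << std::endl;  // prints "3 11 11 42"
      return 0;
    }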

You can combine lambdas with the STL and fancy new standard library types like std::thread to get code like this:


    std::transform(
      m_ioServiceVector.begin(),
      m_ioServiceVector.end(),
      std::back_inserter(threadVector),
      [] (IoServicePtr pIOService)
      {
        return ThreadPtr(
          new std::thread([=] ()
            {
              pIOService->run();
            }));
      });

What on earth does this do?  The short summary is that we've created a vector of pointers to boost::asio::io_service objects (m_ioServiceVector) and we want to create a thread to call run() on each io_service.  We also want to hang on to pointers to the std::thread objects we create, in a vector named threadVector.

In more detail, this code:
1. Loops through m_ioServiceVector.  Each item in m_ioServiceVector is a std::shared_ptr (IoServicePtr is a typedef - the assumed typedefs are sketched just after this list).
2. For each io_service, calls a lambda that creates a new std::thread and returns a std::shared_ptr to it (ThreadPtr is another typedef).  This lambda captures no variables, so we use the empty brackets ("[]").
3. Each thread runs its own lambda, which just calls run() on the boost::asio::io_service.  This lambda needs to capture a copy of pIOService since it runs in a new thread (potentially after the enclosing pIOService has gone out of scope), so we use "[=]".
4. The std::back_inserter appends each ThreadPtr to threadVector.
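
For completeness, here's what I'm assuming for the typedefs and the eventual cleanup. None of this appears in the snippet above, so treat it as a sketch:

    #include <memory>
    #include <thread>
    #include <vector>
    #include <boost/asio.hpp>

    // Assumed typedefs - the snippet above doesn't show them:
    typedef std::shared_ptr<boost::asio::io_service> IoServicePtr;
    typedef std::shared_ptr<std::thread> ThreadPtr;

    int main()
    {
      std::vector<IoServicePtr> m_ioServiceVector;  // really a class member; a local here for the sketch
      std::vector<ThreadPtr> threadVector;

      // ... the std::transform call from above goes here ...

      // Once the threads exist, the usual cleanup is to join each one:
      for (ThreadPtr& pThread : threadVector)
        pThread->join();

      return 0;
    }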

How many lines of code is the original std::transform call?  Counting non-comment lines, it's 12. Counting semicolons, it's 3. But really it's all just a single function call to std::transform, so you could certainly argue it's just 1 line of code.  I can't wait to see what SLOC counters do with this.

Sunday, October 23, 2011

Playing with Scala

I've been learning Scala on and off when I have free time. There's a pretty good free O'Reilly book online.

I finally got far enough to figure out Maven integration with Eclipse, so then I ported a little Netty-based TCP proxy server from Java to Scala. The result is here.

This program doesn't do anything too exciting but it forced me to learn a little Scala syntax, which I'm liking so far (type inference, pattern matching, constructors in the class definition line, lots more neat stuff I don't know yet). Now I just need to find a way to use Scala on projects at work.

Sunday, May 16, 2010

boost::asio

The more I play with boost, the more impressed I am by it. Lately I've been experimenting with boost::asio. I have a fair amount of experience with Apache MINA for Java, and after playing with boost::asio I think I've found its rough equivalent in C++.

Here is the code I've been playing with: an implementation of a TCP proxy (accept connections from 1 to N endpoints and forward them to some remote endpoint). The implementation uses asynchronous calls for all socket I/O and allows for multi-threading - by default it creates a thread pool sized by the number of hardware threads the machine supports.
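
The thread pool part is the usual io_service pattern - roughly something like this (a minimal sketch, not the actual proxy code):

    #include <boost/asio.hpp>
    #include <boost/bind.hpp>
    #include <boost/thread.hpp>

    // Helper so boost::bind has an unambiguous function to call.
    void runService(boost::asio::io_service* pIoService)
    {
      pIoService->run();
    }

    int main()
    {
      boost::asio::io_service ioService;

      // Keep run() from returning before any asynchronous work is posted.
      boost::asio::io_service::work work(ioService);

      // Size the pool by the number of hardware threads the machine supports.
      unsigned threadCount = boost::thread::hardware_concurrency();
      if (threadCount == 0)
        threadCount = 2;  // hardware_concurrency() can return 0 if it can't tell

      boost::thread_group threads;
      for (unsigned i = 0; i < threadCount; ++i)
        threads.create_thread(boost::bind(&runService, &ioService));

      // ... set up acceptors and asynchronous operations on ioService here ...

      ioService.stop();
      threads.join_all();
      return 0;
    }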

Saturday, March 6, 2010

Torn

At work I'm constantly researching how to do various technical things using Google. Usually I'm trying to figure out a feature of some software component that has little or no official documentation on the subject. More often than not I end up on someone's blog reading a post saying they spent hours and hours looking into whatever I'm trying to figure out, and they provide the answer.

That's one of the reasons I started this blog - I had some vague idea that when I figured out some interesting technical thing I would post about it, and maybe someday someone would find it useful. Plus there's some egotistical satisfaction in writing things when you pretend somebody might actually be interested in reading them.

But I started realizing that the things I'm researching are almost always for work purposes, and maybe it's a bad idea to share that knowledge with others. I'm not sure any one thing I find out is especially important or secret, but if over time I share lots of things I learn I'm sort of giving away what work is paying me to do.

So what to do? If everybody had this concern, most regular working people like me would never post things on their blogs and my source of technical knowledge would dry up. From that perspective I want to add to this community. But on the other hand, I don't want to give away too much.

So, I wonder: who is it who posts technical knowledge on their blogs? Is it really just people who are doing things in their free time, or am I using research others were paid to do?

Sunday, February 7, 2010

ctypes is awesome, but only for Python

I've been working on some low-level message processing in C. I struggled to figure out all the rules for overlaying structs of bitfields on messages and getting the right answer on both big-endian and little-endian machines.

The first problem you run into is that a little-endian machine swaps the bytes within each 16-bit word of a struct of bitfields, so you have to run the words through something like ntohs to get them back into the order they appear on the wire. This isn't so bad.
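
For example (a trivial sketch - it assumes a POSIX box for arpa/inet.h), pulling a 16-bit value back out of a buffer that arrived in network order:

    #include <arpa/inet.h>  // ntohs
    #include <cstdio>
    #include <cstring>

    int main()
    {
      // Two bytes as they arrive off the wire in network (big-endian) order: 0x1234.
      const unsigned char wire[2] = { 0x12, 0x34 };

      unsigned short raw;
      std::memcpy(&raw, wire, sizeof(raw));  // a little-endian box reads this back as 0x3412
      unsigned short value = ntohs(raw);     // 0x1234 regardless of host byte order

      std::printf("0x%04x\n", value);
      return 0;
    }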

The real struggle comes in dealing with bitfield ordering. On big-endian, the first bitfield in a struct starts at the MSB and works toward the LSB. On little-endian, the first bitfield starts at the LSB and works toward the MSB. So you end up with nasty #ifdefs in your structures based on the platform. I'm pretty sure I'm not crazy on this - on my box, /usr/include/netinet/ip.h contains this in its definition of an IPv4 header:



/*
 * Structure of an internet header, naked of options.
 */
struct ip
{
#if __BYTE_ORDER == __LITTLE_ENDIAN
    unsigned int ip_hl:4;   /* header length */
    unsigned int ip_v:4;    /* version */
#endif
#if __BYTE_ORDER == __BIG_ENDIAN
    unsigned int ip_v:4;    /* version */
    unsigned int ip_hl:4;   /* header length */
#endif
    /* ...the rest of the header fields... */
};


Yuck! In the structures I'm dealing with, that's a lot of duplicated, cryptic code.

While looking at this, I stumbled into the ctypes library in Python. From what I can tell, ctypes rocks. It lets you define structures and unions to correspond to C types. You can then overlay these structures on data you get from a file/socket/whatever and access the data field-by-field. You can also create a structure in python, assign the fields, and convert it to a raw buffer.

One of the coolest features is you can define whether you want a struct to have big-endian or little-endian behavior, and it will do the appropriate thing on whatever box you're on. If your structure inherits from ctypes.BigEndianStructure, bitfields and multiple-byte fields start at the MSB. If you need the smoking-crack-on-another-planet behavior of little endian, just make your structure inherit from ctypes.LittleEndianStructure. Here's the equivalent of the first two fields of the IPv4 header; it will do the right thing on any machine that runs Python:



#!/usr/bin/env python
import ctypes

class TestStructure(ctypes.BigEndianStructure):
    _pack_ = 1
    _fields_ = [
        ('ip_v',  ctypes.c_ubyte, 4),
        ('ip_hl', ctypes.c_ubyte, 4),
    ]

t = TestStructure()
t.ip_v = 4
t.ip_hl = 10
s = ctypes.string_at(ctypes.addressof(t), ctypes.sizeof(t))
print ["0x%02x" % ord(x) for x in s]


$ ./test-ip.py
['0x4a']


Wow! If only all languages had features like this, the world of message processing would be a better place.

Sunday, January 10, 2010

Playing with boost::thread

Was bored today so I started playing with boost::thread. Made a little producer/consumer example with 2 threads and a thread-safe BlockingQueue, sort of a simple version of what Java provides in the standard API. I had fun using as many boost features as I could squeeze in - boost::shared_ptr, boost::posix_time, boost::bind (really cool, by the way, and much more powerful than std::mem_fun/std::bind1st, etc.).

I'm surprised boost hasn't yet added thread-safe containers to its libraries. According to Google, lots of people seem to think they're a bad idea, even going so far as to say they were a mistake in Java. Really? Is everybody expected to reinvent the wheel here and start from scratch? Doesn't make sense to me. Here's my version of the wheel if anybody doesn't already have one.
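
For anyone curious what the basic shape looks like, here's a minimal sketch of a condition-variable-based BlockingQueue - not the exact code linked above, just the idea:

    #include <queue>
    #include <boost/thread.hpp>

    // Minimal sketch of a thread-safe blocking queue.
    template <typename T>
    class BlockingQueue
    {
    public:
      void push(const T& item)
      {
        {
          boost::unique_lock<boost::mutex> lock(m_mutex);
          m_queue.push(item);
        }
        m_notEmpty.notify_one();  // wake one waiting consumer
      }

      // Blocks until an item is available.
      T pop()
      {
        boost::unique_lock<boost::mutex> lock(m_mutex);
        while (m_queue.empty())
          m_notEmpty.wait(lock);  // releases the lock while waiting
        T item = m_queue.front();
        m_queue.pop();
        return item;
      }

    private:
      std::queue<T> m_queue;
      boost::mutex m_mutex;
      boost::condition_variable m_notEmpty;
    };

A real version would also want a bounded capacity and timed waits, but that's the general idea.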