Drivers and 'the wire'

From the perspective of the driver developer, a MongoDB driver:

  • Marshals data from the language's native types into the format the MongoDB server requires (that is, a BSON payload preceded by a few simple network fields in each packet's header).
  • Sends and receives those messages, keeping track of which reply from a server matches which request.
    • When using a connection pool, it also keeps track of which thread each request was sent from.
  • Handles errors when the network is interrupted, for example by abrupt socket closure and other TCP failures.
  • Implements a lot of detail in the 'Meta' API driver specification for server discovery and monitoring ("SDAM") and server selection.
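
The reply-matching duty in the list above can be pictured with a short sketch. This is only an illustration in Python: the class and method names are hypothetical, and a real driver's bookkeeping (threads, pools, timeouts) is considerably more involved. It only shows the idea of pairing a reply's responseTo value with the requestID of an earlier request.

```python
import itertools


class PendingRequests:
    """Hypothetical helper: pairs each reply with the request it answers."""

    def __init__(self):
        self._next_id = itertools.count(1)  # source of unique requestID values
        self._pending = {}                  # requestID -> caller-supplied context

    def register(self, context):
        """Allocate a requestID for an outgoing message and remember its context."""
        request_id = next(self._next_id)
        self._pending[request_id] = context
        return request_id

    def resolve(self, response_to):
        """Return the context of the request a reply answers, or None if unknown."""
        return self._pending.pop(response_to, None)


pending = PendingRequests()
first = pending.register("find on test.foo")
second = pending.register("insert into test.bar")

# Replies can be handled in any order; responseTo identifies the request.
print(pending.resolve(second))
print(pending.resolve(first))
```

Keying the table on the request ID means the code that reads replies never needs to know in which order the requests were sent.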

From the application developer's perspective, the MongoDB driver:

  • Presents the database as an object you can push data into and pull data out of.
  • Provides an API that is idiomatic for your language. For example, where Java programmers call a find() method on a collection object, C driver users call a mongoc_collection_find() function that takes a mongoc_collection_t* pointer argument.

Looking at it from the other side, this is what the driver API doesn't do:

  • Involve the application programmer in maintaining the TCP socket connections.
  • Involve the application programmer in determining which remote server is the current primary (i.e. the one that writes happen on first).
  • Expose network packet data in the wire protocol format.

Apart from the fact that you open a connection, and there can be exceptions thrown when a server crashes or the network is disconnected, there is limited expression in the API that the database is on a remote server. There are no network-conscious concepts the user must engage with such as 'queue this request', 'pop reply off incoming message stack', etc.

Many drivers; one Wire Protocol

Regardless of which driver you are using, at the Wire Protocol layer they are all fundamentally the same. If they are contemporary versions, there's a good chance the BSON payload in each Wire Protocol packet is identical, excluding ephemeral fields like timestamps.

The format of data in MongoDB Wire Protocol requests and responses is relatively simple, but it is binary and far from human-readable. The example below comes from TCP payloads captured using tcpdump and manually unwrapped using the command-line tools od and bsondump, following the MongoDB wire protocol documentation.

Example find in various APIs:

    mongo shell   db.foo.find({"x": 99});
    PyMongo       db.foo.find({"x": 99})
    Java          db.getCollection("foo").find(eq("x", 99))
    PHP           $db->foo->find(['x' => 99]);
    Ruby          client[:foo].find(x: 99)

MongoDB wire packet: OP_MSG

    length=180;requestID=0x1b73a9;responseTo=0;opCode=2013(=OP_MSG type)
    flags=0x00.0x00
    section 1/1 = {
      "find":"foo",
      "filter":{"x":99.0},
      "$clusterTime":{ ... , "signature":{ ... }},
      "$db":"test"
    }

mongod code:

    mongo::FindCmd::run

What the APIs return: a cursor object with the first batch of results.

MongoDB wire packet: OP_MSG (as a reply)

    length=180;requestID=0xb5a;responseTo=0x1b73a9;opCode=2013(=OP_MSG type)
    flags=0x00.0x00
    section 1/1 = {
      "cursor":{
        "id":{"$numberLong":"0"},
        "ns":"test.foo",
        "firstBatch":[
          {"_id":ObjectId("5b3433ad88d64ee7afb5dc80"), "x":99.0,"order_cust_id":"AF4R2109"}
        ]
      },
      "ok":1.0,
      "operationTime":{ ... },
      "$clusterTime":{ ... ,
        "signature":{ ... ,
          "keyId":{"$numberLong":"0"}}}
    }
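
To make the header fields in the packets above more concrete, here is a rough Python sketch that builds and then parses a minimal OP_MSG using only the standard struct module. It assumes the simplest layout: a 16-byte little-endian header (messageLength, requestID, responseTo, opCode), 4 bytes of flag bits, and a single kind-0 section holding one BSON document, with no checksum. The function names are made up for illustration, and the BSON body is hand-encoded for {"find": "foo"} rather than produced by a real BSON library.

```python
import struct

OP_MSG = 2013
HEADER = struct.Struct("<iiii")  # messageLength, requestID, responseTo, opCode


def build_op_msg(request_id, response_to, bson_body, flag_bits=0):
    """Wrap an already-encoded BSON document in an OP_MSG with one kind-0 section."""
    payload = struct.pack("<I", flag_bits) + b"\x00" + bson_body  # \x00 = section kind 0
    header = HEADER.pack(HEADER.size + len(payload), request_id, response_to, OP_MSG)
    return header + payload


def parse_op_msg(message):
    """Split an OP_MSG into its header fields, flag bits and kind-0 BSON body."""
    length, request_id, response_to, op_code = HEADER.unpack_from(message, 0)
    if op_code != OP_MSG:
        raise ValueError(f"not an OP_MSG: opCode={op_code}")
    (flag_bits,) = struct.unpack_from("<I", message, HEADER.size)
    body_start = HEADER.size + 4 + 1  # header, flag bits, section kind byte
    if message[body_start - 1] != 0:
        raise ValueError("first section is not a kind-0 body")
    (body_length,) = struct.unpack_from("<i", message, body_start)
    return {
        "length": length,
        "requestID": request_id,
        "responseTo": response_to,
        "opCode": op_code,
        "flagBits": flag_bits,
        "body": message[body_start:body_start + body_length],
    }


# Hand-encoded BSON for {"find": "foo"}: int32 length, one string element, terminator.
FIND_FOO = b"\x13\x00\x00\x00" + b"\x02find\x00" + b"\x04\x00\x00\x00foo\x00" + b"\x00"

request = build_op_msg(request_id=0x1B73A9, response_to=0, bson_body=FIND_FOO)
fields = parse_op_msg(request)
print(fields["length"], hex(fields["requestID"]), fields["opCode"])
```

With this tiny body the message is 40 bytes long, so the script prints `40 0x1b73a9 2013`. The real request above is longer because it also carries the filter, "$clusterTime" and "$db" fields.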

OP_QUERY and early generations

An optional detour for those who knew the original Wire protocol messages (OP_QUERY, OP_INSERT, etc.) and are interested in what traffic looked like with these.

Database command type

You might have noticed that there's no primary / headlined / specially labeled value in the BSON command object that indicates what sort of command the client is sending.

You might be wondering 'Does the server run through a list of key-value pairs in fixed order until it gets a match?' (E.g. if commandMessage.hasKey("find") then FindCmd::run(), else if commandMessage.hasKey("update") then UpdateCmd::run(), etc.)

Nope, a simpler mechanism is used. From util/net/op_msg.h:

    StringData getCommandName() const {
        return body.firstElementFieldName();
    }

Take the key name from the first key-value pair. End of function.

A lesson from this is that field order in BSON can matter (at least to MongoDB). This is important for driver developers but not for application programmers, as the driver API takes care of it for you.
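
The effect of field order can be illustrated with a small Python sketch that mimics the idea behind firstElementFieldName() on raw BSON bytes. It is not the server's code: both helper functions are hypothetical, and the encoder handles only string values, which is enough to show that swapping the first two fields changes the name that is read.

```python
import struct


def encode_string_doc(pairs):
    """Encode (name, string value) pairs as a BSON document, preserving their order."""
    elements = b""
    for name, value in pairs:
        data = value.encode("utf-8") + b"\x00"
        # 0x02 = BSON string type, then NUL-terminated name, int32 length, data
        elements += b"\x02" + name.encode("utf-8") + b"\x00"
        elements += struct.pack("<i", len(data)) + data
    return struct.pack("<i", 4 + len(elements) + 1) + elements + b"\x00"


def command_name(bson_body):
    """Return the field name of the first element, or "" for an empty document."""
    # Bytes 0-3 hold the document length, byte 4 the first element's type,
    # and the NUL-terminated field name starts at byte 5.
    if len(bson_body) <= 5 or bson_body[4] == 0:
        return ""
    end = bson_body.index(b"\x00", 5)
    return bson_body[5:end].decode("utf-8")


in_order = encode_string_doc([("find", "foo"), ("$db", "test")])
swapped = encode_string_doc([("$db", "test"), ("find", "foo")])

print(command_name(in_order))  # find
print(command_name(swapped))   # $db
```

Both documents contain the same two key-value pairs, yet only the first one begins with the command name.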

What it looks like to the programmer

I don't want to re-invent the documentation wheel for this part. MongoDB's official documentation tutorials are good and cover many language samples in one page. Some links for a couple of types of operations: