The Peer Wire Protocol (PWP) is the TCP-level protocol two BitTorrent peers use to exchange pieces once a connection is established. It sits after the tracker (which hands you a list of peers) and before whatever piece-assembly / disk-write logic you have. Everything here is binary over a raw TCP stream, and all multi-byte integers are big-endian.
Before any messages flow, both sides exchange a handshake. It is not framed like regular messages - it has its own fixed layout.
    <pstrlen><pstr><reserved><info_hash><peer_id>

- pstrlen = 19 (always)
- pstr = "BitTorrent protocol" (literal ASCII)
- reserved = 8 zero bytes. Extension bits; leave zeroed unless you implement extensions
- info_hash = SHA-1 of the bencoded info dict, 20 bytes
- peer_id = your client's self-assigned 20-byte ID

Total: 68 bytes, fixed.
Send yours, read theirs. Validate: pstr must match, info_hash must match what you expect. If either fails, drop the connection.
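The handshake layout and the two validation checks can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the names `build_handshake` and `parse_handshake` are hypothetical, and the sketch ignores the reserved bytes on receive on the assumption that no extensions are in use.

```python
PSTR = b"BitTorrent protocol"
HANDSHAKE_LEN = 1 + len(PSTR) + 8 + 20 + 20  # 68 bytes


def build_handshake(info_hash: bytes, peer_id: bytes) -> bytes:
    """Serialize the fixed 68-byte handshake with all reserved bits zeroed."""
    if len(info_hash) != 20 or len(peer_id) != 20:
        raise ValueError("info_hash and peer_id must be 20 bytes each")
    return bytes([len(PSTR)]) + PSTR + bytes(8) + info_hash + peer_id


def parse_handshake(data: bytes, expected_info_hash: bytes) -> bytes:
    """Validate a received handshake and return the remote peer_id.

    Raises ValueError on any mismatch; the caller should then drop
    the connection.
    """
    if len(data) != HANDSHAKE_LEN:
        raise ValueError("handshake must be exactly 68 bytes")
    if data[0] != len(PSTR) or data[1:20] != PSTR:
        raise ValueError("unexpected protocol string")
    # data[20:28] is the reserved field; ignored in this sketch.
    if data[28:48] != expected_info_hash:
        raise ValueError("info_hash mismatch")
    return data[48:68]
```

Raising an exception on mismatch keeps the "drop the connection" decision in one place in the caller.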
All messages after the handshake share one wire format:
    <length prefix><message ID><payload>

length = number of bytes that follow the prefix = 1 (id) + len(payload). The prefix itself is a 4-byte big-endian uint32 and is not counted.
Parse loop:
    read 4 bytes -> decode uint32 -> length
    if length == 0 -> keep-alive, nothing more to read for this message
    read length bytes -> id = first byte, payload = rest
    dispatch on id
Two reads per message, no delimiters.
| ID | Name | Payload |
|---|---|---|
| - | keep-alive | none |
| 0 | choke | none |
| 1 | unchoke | none |
| 2 | interested | none |
| 3 | not interested | none |
| 4 | have | piece index |
| 5 | bitfield | bitfield bytes |
| 6 | request | index, begin, length |
| 7 | piece | index, begin, block |
| 8 | cancel | index, begin, length |
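One way to carry the table into code is an enumeration keyed by the wire id. The sketch below uses Python's `IntEnum`; the names `MessageId` and `NO_PAYLOAD` are illustrative choices, not part of the protocol.

```python
from enum import IntEnum


class MessageId(IntEnum):
    """Wire ids from the table above. Keep-alive has no id byte."""
    CHOKE = 0
    UNCHOKE = 1
    INTERESTED = 2
    NOT_INTERESTED = 3
    HAVE = 4
    BITFIELD = 5
    REQUEST = 6
    PIECE = 7
    CANCEL = 8


# Messages whose frame is just the length prefix plus the id byte.
NO_PAYLOAD = frozenset({
    MessageId.CHOKE,
    MessageId.UNCHOKE,
    MessageId.INTERESTED,
    MessageId.NOT_INTERESTED,
})
```

`MessageId(raw_id)` raises `ValueError` for an id outside the table, which gives the parser a natural place to handle unknown messages.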
    <len=0x00000000>
Zero-length frame. No id byte, no payload. Peers send these periodically to hold the TCP connection open. On receive: do nothing, reset your timeout counter. If nothing arrives (including keep-alives) for ~2 min, close the connection.
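The timeout rule can be tracked with a small idle timer that is reset on every received frame, keep-alives included. This is a sketch under assumptions: `IdleTimer` is a hypothetical helper, and the 120-second constant simply encodes the "~2 min" figure above.

```python
import time

KEEPALIVE_TIMEOUT = 120.0  # seconds; the "~2 min" from the text


class IdleTimer:
    """Tracks how long a connection has been silent."""

    def __init__(self, timeout=KEEPALIVE_TIMEOUT, clock=time.monotonic):
        self._timeout = timeout
        self._clock = clock  # injectable so the timer can be tested
        self._last_seen = clock()

    def touch(self):
        """Call on every received frame, including keep-alives."""
        self._last_seen = self._clock()

    def expired(self):
        """True once nothing has arrived for longer than the timeout."""
        return self._clock() - self._last_seen > self._timeout
```

A monotonic clock is used so that wall-clock adjustments cannot trigger or suppress a timeout.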
    <len=0x00000001><id>

No payload. Four messages share this format: choke (0), unchoke (1), interested (2), and not interested (3).
    <len=0x00000005><id=0x04><piece index>

Sent after you successfully download and verify a piece. Tells the peer you now have it.
    <len=0x00000001+X><id=0x05><bitfield>

Sent right after the handshake, before any other message. Declares which pieces the sender already has. One bit per piece, MSB first: bit i is in byte floor(i/8), at position 7 - (i % 8), counting positions from the least significant bit. Trailing bits (if num_pieces is not a multiple of 8) are zero. X = ceil(num_pieces / 8).
Only sent if you have at least one piece. Optional - a peer with nothing won't send it.
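The bit layout translates directly into a pair of helpers. The following is a minimal sketch; `empty_bitfield`, `set_piece`, and `has_piece` are hypothetical names, and treating an out-of-range index as "not present" is a choice made for this example.

```python
def empty_bitfield(num_pieces: int) -> bytearray:
    """Allocate ceil(num_pieces / 8) zero bytes."""
    return bytearray((num_pieces + 7) // 8)


def set_piece(bitfield: bytearray, index: int) -> None:
    """Mark piece `index` as present (MSB-first within each byte)."""
    byte_index, offset = divmod(index, 8)
    bitfield[byte_index] |= 1 << (7 - offset)


def has_piece(bitfield, index: int) -> bool:
    """Return True if the bit for piece `index` is set."""
    byte_index, offset = divmod(index, 8)
    if byte_index >= len(bitfield):
        return False  # assumption: out-of-range means "not present"
    return bool((bitfield[byte_index] >> (7 - offset)) & 1)
```

For example, with 10 pieces, setting pieces 0 and 9 yields the two bytes `0x80 0x40`.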
    <len=0x0000000D><id=0x06><index><begin><length>

- index: zero-based piece index
- begin: byte offset within the piece
- length: block size, conventionally 16 KiB (16 * 1024); the last block of a piece may be smaller

All three fields are 4-byte big-endian integers. You can only send this when the peer has unchoked you (peer_choking == false).
    <len=0x00000009+X><id=0x07><index><begin><block>

Response to a request. block is the raw bytes. X = block length.
    <len=0x0000000D><id=0x08><index><begin><length>

Same payload as request. Cancels an in-flight block request. Used in end-game mode when the same block has been requested from multiple peers and one already responded.
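The request, cancel, and piece layouts above map cleanly onto Python's `struct` module. The helpers below are an illustrative sketch with hypothetical names; they build complete frames for request and cancel and split a received piece payload into its three parts.

```python
import struct


def build_request(index: int, begin: int, length: int) -> bytes:
    """Frame: 4-byte length (13), id 6, then three big-endian uint32s."""
    return struct.pack(">IBIII", 13, 6, index, begin, length)


def build_cancel(index: int, begin: int, length: int) -> bytes:
    """Same layout as request, but with id 8."""
    return struct.pack(">IBIII", 13, 8, index, begin, length)


def parse_piece_payload(payload: bytes):
    """Split a piece payload into (index, begin, block)."""
    if len(payload) < 8:
        raise ValueError("piece payload shorter than its 8-byte header")
    index, begin = struct.unpack(">II", payload[:8])
    return index, begin, payload[8:]
```

The `>` prefix in the format string selects big-endian byte order with no padding, so each frame is exactly 17 bytes.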
    read_message(stream):
        len_buf = read_exact(stream, 4)
        length = decode_uint32_be(len_buf)
        if length == 0:
            return KeepAlive
        msg_buf = read_exact(stream, length)
        id = msg_buf[0]
        payload = msg_buf[1:]
        switch id:
            0 -> Choke
            1 -> Unchoke
            2 -> Interested
            3 -> NotInterested
            4 -> Have { index = uint32_be(payload[0:4]) }
            5 -> Bitfield { bits = payload }
            6 -> Request { index, begin, length = three uint32_be }
            7 -> Piece { index, begin = uint32_be; block = payload[8:] }
            8 -> Cancel { index, begin, length = three uint32_be }
            _ -> error: unknown id
read_exact must loop until the full byte count is received - TCP does not guarantee a single read returns everything requested.
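A Python version of `read_exact` and the framing part of `read_message` might look like the sketch below. It assumes `sock` is any object with a socket-style `recv(n)` method. The `max_length` cap is an addition not found in the text above: an arbitrary 1 MiB guard against absurd length prefixes, which you should tune to your torrent sizes.

```python
import struct


def read_exact(sock, n: int) -> bytes:
    """Loop on recv() until exactly n bytes have arrived."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:  # recv() returns b"" when the peer has closed
            raise ConnectionError("peer closed the connection mid-message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def read_message(sock, max_length: int = 1 << 20):
    """Return (id, payload), or None for a keep-alive."""
    (length,) = struct.unpack(">I", read_exact(sock, 4))
    if length == 0:
        return None  # keep-alive: no id byte, no payload
    if length > max_length:
        # Assumed safety limit, not part of the wire format.
        raise ValueError("length prefix exceeds max_length")
    body = read_exact(sock, length)
    return body[0], body[1:]
```

Dispatching on the returned id is left to the caller so the framing code stays independent of the message types.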
    serialize(msg):
        payload = msg.payload()        // empty slice for no-payload messages
        length = 1 + len(payload)      // id byte + payload
        buf = uint32_be(length) + [msg.id()] + payload
        return buf

    serialize_keepalive():
        return uint32_be(0)            // 4 zero bytes, no id
Every message type exposes id() and payload(). Serialization is uniform - no special cases except keep-alive.
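The uniform framing can be expressed in a few lines of Python. This sketch takes the id and payload as plain arguments rather than through `id()` / `payload()` methods, which is a simplification made for the example.

```python
import struct


def serialize(msg_id: int, payload: bytes = b"") -> bytes:
    """Frame one message: 4-byte length, 1-byte id, then the payload."""
    return struct.pack(">IB", 1 + len(payload), msg_id) + payload


def serialize_keepalive() -> bytes:
    """Four zero bytes: a length prefix of 0 and nothing else."""
    return struct.pack(">I", 0)
```

For instance, `serialize(1)` produces the 5-byte unchoke frame, and `serialize(4, index_bytes)` produces a have frame for a 4-byte piece index.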
Handshake is serialized separately (fixed layout, not length-prefixed):
    serialize_handshake(info_hash, peer_id):
        buf = [19] + "BitTorrent protocol" + [0]*8 + info_hash + peer_id
        return buf   // 68 bytes

    client                          peer
    |                                  |
    |-------- handshake -------------->|
    |<------- handshake ---------------|
    |                                  |
    |<------- bitfield (5) ------------|  peer declares what it has
    |<------- keep-alive --------------|  may arrive between any messages
    |                                  |
    |-------- interested (2) --------->|  we want pieces from them
    |                                  |
    |<------- unchoke (1) -------------|  peer allows us to request
    |                                  |
    |-------- request (6) ------------>|  ask for a block
    |<------- piece (7) ---------------|  receive block data
    |-------- request (6) ------------>|
    |<------- piece (7) ---------------|
    |               ...                |
    |-------- have (4) --------------->|  tell peer we completed a piece
State per connection (both directions):
    am_choking      = true    // we are choking the peer
    am_interested   = false   // we are not interested in the peer
    peer_choking    = true    // peer is choking us
    peer_interested = false   // peer is not interested in us
Default on connect: both sides choking, neither interested. Update these as the corresponding messages arrive. Gate request sends on peer_choking == false.
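The four flags and their update rules fit in a small dataclass. The sketch below is illustrative: `PeerState`, `on_message`, and `can_request` are hypothetical names, and only the two flags driven by incoming messages are updated here, since `am_choking` and `am_interested` change when we send messages, not when we receive them.

```python
from dataclasses import dataclass


@dataclass
class PeerState:
    am_choking: bool = True        # we are choking the peer
    am_interested: bool = False    # we are not interested in the peer
    peer_choking: bool = True      # peer is choking us
    peer_interested: bool = False  # peer is not interested in us

    def on_message(self, msg_id: int) -> None:
        """Update the peer-side flags from a received message id."""
        if msg_id == 0:      # choke
            self.peer_choking = True
        elif msg_id == 1:    # unchoke
            self.peer_choking = False
        elif msg_id == 2:    # interested
            self.peer_interested = True
        elif msg_id == 3:    # not interested
            self.peer_interested = False

    def can_request(self) -> bool:
        """Requests are only allowed while the peer is not choking us."""
        return not self.peer_choking
```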
Pieces (from torrent metadata) are large: 256 KiB, 512 KiB, sometimes more. request and piece messages work at block granularity, so each piece is fetched as a series of blocks. The length of each requested block is:

    min(16384, remaining_bytes_in_piece)

Reassemble received blocks into the full piece, then SHA-1 verify against the hash in the torrent metadata before writing to disk.
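As a rough sketch of that loop, the generator below walks one piece and yields the (index, begin, length) triple for each block, and a second helper checks the reassembled piece. The names `block_requests` and `verify_piece` are hypothetical, and `piece_length` is assumed to be the actual length of this particular piece.

```python
import hashlib

BLOCK_SIZE = 16 * 1024  # 16384 bytes


def block_requests(piece_index: int, piece_length: int):
    """Yield (index, begin, length) for every block of one piece."""
    begin = 0
    while begin < piece_length:
        # Clamp the tail block to whatever is left in the piece.
        length = min(BLOCK_SIZE, piece_length - begin)
        yield piece_index, begin, length
        begin += length


def verify_piece(data: bytes, expected_sha1: bytes) -> bool:
    """Compare the SHA-1 of the reassembled piece with the metadata hash."""
    return hashlib.sha1(data).digest() == expected_sha1
```

For a 40000-byte piece this yields two full 16384-byte blocks followed by a 7232-byte tail block.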
Two reads per message - read the 4-byte length prefix first, then read exactly length bytes for the rest. Use a buffered reader on the raw connection so that these small reads do not each cost a syscall.
Block length clamping - block_length(piece_remaining) returns 16384 unless piece_remaining < 16384, then returns piece_remaining. Handles tail blocks without special-casing the request loop.
Peer state as four booleans - am_choking, am_interested, peer_choking, peer_interested. Keep these in sync with arriving messages. Everything else (piece selection, queueing) reads from them.