# Wireshark dissection and reassembly
Wireshark's dissection engine and stream reassembly functionality have been the
same for a long time, and they are showing their age. This document describes
the current implementation (Wireshark 3.2.x) and related research, and attempts
to provide a solution for the identified problems.

Status: **DRAFT**.

## Overview
The primary unit of work is a frame, sometimes referred to as a packet. Frames
are passed to the frame dissector, which will:

- Add metadata such as timing.
- Pass the buffer to the next dissector. The dissector is usually Ethernet or
  IP, depending on how the capture file was created.
- Once done, any post-dissectors will be invoked with the same buffer.

The "next dissector" above will typically parse some data and pass the
remaining data to the next dissector. This is the case for Ethernet ->
IPv4/IPv6 -> TCP, for example. All of this is currently done serially: the next
packet cannot be processed until the current one is finished. One reason is that
the dissection of subsequent packets may depend on previous ones. This limits
parallel processing, which is also made difficult by implementation details such
as the use of global data.
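
The chaining described above can be illustrated with a minimal Python sketch. It is not Wireshark's actual API: the function names, the list-based `tree`, and the lookup tables are hypothetical, and it assumes an untagged Ethernet II frame carrying IPv4 and TCP.

```python
import struct

def dissect_ethernet(buf, tree):
    # Ethernet II header: destination (6), source (6), EtherType (2).
    (ethertype,) = struct.unpack_from("!H", buf, 12)
    tree.append(("eth", {"type": ethertype}))
    next_dissector = ETHERTYPES.get(ethertype)
    if next_dissector:
        next_dissector(buf[14:], tree)  # hand the remaining bytes to the next layer

def dissect_ipv4(buf, tree):
    header_len = (buf[0] & 0x0F) * 4  # IHL is counted in 32-bit words
    protocol = buf[9]
    tree.append(("ip", {"proto": protocol}))
    next_dissector = IP_PROTOCOLS.get(protocol)
    if next_dissector:
        next_dissector(buf[header_len:], tree)

def dissect_tcp(buf, tree):
    sport, dport, seq = struct.unpack_from("!HHI", buf, 0)
    data_offset = (buf[12] >> 4) * 4  # TCP header length in bytes
    tree.append(("tcp", {"sport": sport, "dport": dport, "seq": seq,
                         "payload_len": len(buf) - data_offset}))

# Hypothetical registration tables mapping a field value to the next dissector.
ETHERTYPES = {0x0800: dissect_ipv4}
IP_PROTOCOLS = {6: dissect_tcp}

def dissect_frame(number, timestamp, buf):
    # The frame dissector adds metadata, then starts the chain.
    tree = [("frame", {"number": number, "time": timestamp})]
    dissect_ethernet(buf, tree)
    return tree
```

Each dissector only knows its own header and a table of possible successors, which is why adding a protocol does not require touching the layers below it.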

Aside from per-packet processing, dissectors may maintain state:

- The TCP dissector reconstructs flows, performing reassembly of TCP segments.
- The TLS dissector reconstructs a TLS handshake and uses the information to
  build a cipher for decrypting application data. This decrypted application
  data is remembered for later use.
- The DNS dissector remembers message identifiers to find retransmissions and to
  calculate response times.
- The WireGuard dissector processes handshake messages and creates a cipher for
  a session. Decrypted data is not saved due to memory usage concerns; instead,
  decryption is performed every time the packet is accessed. This is possible
  because a single packet contains the counter value required for decryption.
  The TLS dissector, on the other hand, cannot read the counter from a TLS
  record.
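
As an illustration of such cross-packet state, the DNS case could be sketched as follows. This is a simplified model rather than the actual DNS dissector: the `DnsTracker` class, its method names, and the `flow` key are assumptions made for the example.

```python
class DnsTracker:
    """Remembers DNS queries so later packets can refer back to them."""

    def __init__(self):
        # (flow, transaction id) -> (frame number, timestamp) of the first query
        self.pending = {}

    def on_query(self, flow, txid, frame_number, timestamp):
        key = (flow, txid)
        if key in self.pending:
            # Same identifier seen before on this flow: treat as retransmission.
            return {"retransmission_of": self.pending[key][0]}
        self.pending[key] = (frame_number, timestamp)
        return {}

    def on_response(self, flow, txid, timestamp):
        first = self.pending.get((flow, txid))
        if first is None:
            return {}  # the query is not in the capture
        return {"request_frame": first[0], "response_time": timestamp - first[1]}
```

The result for a response depends on a query seen earlier, which is one concrete reason why packets cannot simply be dissected in arbitrary order.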

Reliable TCP stream reassembly is required for proper functionality of
higher-level protocols. Typically, the initial part of a higher-level PDU (such
as the start of HTTP/1.1 headers) is aligned with a TCP segment payload. If all
headers fit in a single TCP segment, then the HTTP dissector is able to dissect
the full headers without further state. However, if the HTTP request is split
over multiple segments, then these segments have to be collected and merged
based on their sequence numbers. This introduces its own share of problems:

- TCP segments may overlap.
- TCP segments may appear out of order. An out-of-order SYN or (more likely)
  FIN may result in wrongly reconstructed streams
  ([Bug 16289](https://bugs.wireshark.org/bugzilla/show_bug.cgi?id=16289)).
- TCP segments may be missing from the capture file.
- TCP segments may be duplicated due to retransmission.
- Overlapping TCP segments may contain conflicting data, either due to bit
  flips or due to malicious actors in the network.
- The packet capture could start in the middle of a session. If multiple HTTP
  messages are sent over one stream, the start of a TCP segment may not
  coincide with the start of an HTTP message. That means that the stream
  cannot be recovered by naively assuming that a segment starts a new message.
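
To make these cases concrete, here is a deliberately naive Python sketch of reassembly by sequence number. It is not how Wireshark implements reassembly. The `reassemble` function is hypothetical, it assumes the initial sequence number is known, and it arbitrarily keeps the first copy of a byte when overlapping segments disagree.

```python
def reassemble(segments, initial_seq):
    """Merge (seq, payload) pairs, given in capture order, into one stream.

    Returns (stream, gaps, conflicts): gaps are (start, end) offsets with no
    data, conflicts are offsets where overlapping segments disagreed.
    """
    stream = bytearray()
    have = bytearray()  # have[i] == 1 once stream[i] has been filled
    conflicts = []
    for seq, payload in segments:
        offset = (seq - initial_seq) % 2**32  # tolerate sequence wraparound
        if offset >= 2**31:
            continue  # segment lies before the assumed start of the stream
        end = offset + len(payload)
        if end > len(stream):
            stream.extend(bytes(end - len(stream)))
            have.extend(bytes(end - len(have)))
        for i, byte in enumerate(payload):
            pos = offset + i
            if not have[pos]:
                stream[pos] = byte
                have[pos] = 1
            elif stream[pos] != byte:
                conflicts.append(pos)  # overlapping data differs; keep the first copy
    gaps = []
    pos = 0
    while pos < len(have):
        if have[pos]:
            pos += 1
            continue
        start = pos
        while pos < len(have) and not have[pos]:
            pos += 1
        gaps.append((start, pos))
    return bytes(stream), gaps, conflicts
```

Out-of-order and duplicated segments fall out naturally because placement depends only on the sequence number. Missing segments and conflicting overlaps are reported rather than resolved, since the right policy for them is exactly what remains open.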

Assuming a mechanism that properly reassembles the TCP segments above into a
complete, sequential stream, the higher-level protocols may bring additional
problems. Consider TLS:

- TLS records can be split over multiple TCP segments.
- Multiple TLS records may be present in one TCP segment.
- The start of a TLS record may not coincide with the start of a TCP segment.
- Decrypted application data may not be uniquely identifiable by the frame
  number (the position of a packet in the capture file).
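
The framing issues in this list can be sketched with a small record splitter that consumes an already reassembled byte stream. The `TlsRecordSplitter` class is hypothetical. It relies only on the 5-byte TLS record header (content type, version, length) and uses the stream offset, rather than the frame number, to identify each record.

```python
import struct

class TlsRecordSplitter:
    """Cuts a reassembled byte stream into TLS records."""

    HEADER_LEN = 5  # content type (1), protocol version (2), length (2)

    def __init__(self):
        self.buffer = bytearray()
        self.consumed = 0  # stream offset of buffer[0]

    def feed(self, data):
        """Add stream bytes; return the complete records as
        (stream_offset, content_type, fragment) tuples."""
        self.buffer += data
        records = []
        while len(self.buffer) >= self.HEADER_LEN:
            content_type, _version, length = struct.unpack_from("!BHH", self.buffer, 0)
            total = self.HEADER_LEN + length
            if len(self.buffer) < total:
                break  # record continues in a later TCP segment
            fragment = bytes(self.buffer[self.HEADER_LEN:total])
            records.append((self.consumed, content_type, fragment))
            del self.buffer[:total]
            self.consumed += total
        return records
```

A single call to `feed` may return zero, one, or several records, so two records completed by the same segment share a frame number but still have distinct stream offsets.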

And after TLS, the next application data protocol may also bring additional
problems. Consider HTTP/2:

- HTTP/2 multiplexes a TCP/TLS stream into multiple logical streams which are
  contained in HTTP/2 frames.
- A single TLS record might contain multiple HTTP/2 stream frames which are
  identified by a 31-bit Stream Identifier.
- HTTP/2 stream frames may be split over multiple TLS records.
- The frame number may not uniquely identify an HTTP/2 frame.
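
A minimal Python sketch of this demultiplexing step is shown below. It assumes the input is decrypted, in-order connection data, parses the 9-byte HTTP/2 frame header (24-bit length, type, flags, reserved bit plus 31-bit Stream Identifier), and ignores padding and all frame types other than DATA. The function names are hypothetical.

```python
from collections import defaultdict

FRAME_HEADER_LEN = 9
DATA = 0x0  # HTTP/2 DATA frame type

def split_http2_frames(data):
    """Return (frames, rest): complete frames as (type, flags, stream_id,
    payload) tuples and the trailing bytes of an incomplete frame."""
    frames = []
    pos = 0
    while pos + FRAME_HEADER_LEN <= len(data):
        length = int.from_bytes(data[pos:pos + 3], "big")
        frame_type = data[pos + 3]
        flags = data[pos + 4]
        # The top bit is reserved, the Stream Identifier is the lower 31 bits.
        stream_id = int.from_bytes(data[pos + 5:pos + 9], "big") & 0x7FFFFFFF
        end = pos + FRAME_HEADER_LEN + length
        if end > len(data):
            break  # frame continues in a later TLS record
        frames.append((frame_type, flags, stream_id, data[pos + FRAME_HEADER_LEN:end]))
        pos = end
    return frames, data[pos:]

def demultiplex_data(frames):
    """Concatenate DATA payloads per stream (padding is not handled here)."""
    streams = defaultdict(bytearray)
    for frame_type, _flags, stream_id, payload in frames:
        if frame_type == DATA:
            streams[stream_id] += payload
    return {stream_id: bytes(body) for stream_id, body in streams.items()}
```

Keeping the per-stream buffers separate is the property that a "follow stream" feature needs, since frames of different streams are interleaved on the same connection.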

Finally, all of the previous network protocols may not be useful to the
end-user. They may be more interested in data such as reconstructed HTML, CSS,
JavaScript, JSON, JPEG, etc. files. In those cases, they may not be interested
in the exact TCP segment. On the other hand, the start of a TCP segment, a TLS
record, or an HTTP/2 frame may be interesting for performance measurements. For
that to happen, precise tracking of the individual protocol data parts may be
necessary. This may be complicated by out-of-order receipt of TCP segments,
especially when multiple PDUs are in flight.

Wireshark has features to handle aggregates of individual packets:

- "Follow TCP Stream" reads through a whole capture and extracts a single TCP
  stream.
- "Export Objects" may be used to extract HTTP objects (HTML, CSS, etc.), IMF
  (email data from SMTP), etc.
- "Follow HTTP/2 Stream" is available since Wireshark 3.2, but it merges data
  from other streams in the reassembled packet
  ([Bug 16093](https://bugs.wireshark.org/bugzilla/show_bug.cgi?id=16093)).

The state tracking required for the above functionality requires resources,
trading off memory cost against CPU time. With new protocols such as QUIC and
HTTP/3, the complexity of decryption, stream reassembly, and accurate metadata
such as timing seems to warrant significant dissection engine changes in order
to simplify the implementation of new features.

Large objects such as Docker image layers and videos also require more efficient
implementations:

- Memoization to speed up reassembly.
- Reduce memory usage by sharing buffers where possible.
- Consider folding or eliding fields. For example, a large object of hundreds of
  megabytes likely consists of several hundred thousand TCP segments; displaying
  all of these in a single view is impossible.
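
Two of these ideas can be sketched briefly in Python. Both helpers are hypothetical and only illustrate the principle: `payload_view` shares the frame buffer instead of copying it, and `fold_segments` elides the middle of a long list of segment references.

```python
def payload_view(frame, offset, length):
    # A memoryview slice refers to the original buffer, so no bytes are copied.
    return memoryview(frame)[offset:offset + length]

def fold_segments(frame_numbers, keep=3):
    """Return display lines showing the first and last `keep` segments and a
    one-line summary for everything in between."""
    if len(frame_numbers) <= 2 * keep:
        return [str(n) for n in frame_numbers]
    hidden = len(frame_numbers) - 2 * keep
    head = [str(n) for n in frame_numbers[:keep]]
    tail = [str(n) for n in frame_numbers[-keep:]]
    return head + [f"[{hidden} segments elided]"] + tail
```

With several hundred thousand segments, the folded list stays at `2 * keep + 1` lines regardless of the object size.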

## Ideas
To speed up processing, parallelism is needed. In the common case with no
malicious packets, packet processing should be postponed until flow
reconstruction has happened.
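
One possible shape of this idea, sketched in Python under several assumptions: packets are dictionaries with hypothetical `src`, `sport`, `dst`, `dport`, and `payload` keys, flows are independent of each other, and `dissect_flow` is a placeholder for the real per-flow work.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def flow_key(pkt):
    # Direction-independent key so both halves of a conversation group together.
    a = (pkt["src"], pkt["sport"])
    b = (pkt["dst"], pkt["dport"])
    return (min(a, b), max(a, b))

def group_by_flow(packets):
    """First pass (serial): assign every packet to its flow, keeping frame numbers."""
    flows = defaultdict(list)
    for number, pkt in enumerate(packets, start=1):
        flows[flow_key(pkt)].append((number, pkt))
    return flows

def dissect_flow(item):
    # Placeholder for per-flow reassembly and dissection: counts payload bytes.
    key, pkts = item
    return key, sum(len(pkt["payload"]) for _, pkt in pkts)

def dissect_parallel(packets, workers=4):
    """Second pass: flows are processed concurrently, one task per flow."""
    flows = group_by_flow(packets)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(dissect_flow, flows.items()))
```

Threads are used here only to show the structure. In CPython they would not speed up CPU-bound dissection, and a real implementation would also have to deal with state shared between flows.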


## Related work
This section covers other works from which lessons can potentially be learned.

### tcpflow
Passive TCP Reconstruction and Forensic Analysis with tcpflow, 2013-09
https://calhoun.nps.edu/bitstream/handle/10945/36026/NPS-CS-13-003.pdf

https://github.com/simsong/tcpflow

### binpac
binpac: A yacc for Writing Application Protocol Parsers, 2006-10
https://www.icsi.berkeley.edu/pubs/networking/binpacIMC06.pdf

https://github.com/zeek/binpac