Download raw body.
pfsync does not update dst peer's state
Hello,
there was a report on bugs [1] back in Jan. There were few
more reports off-list. Matthieu@ provided detailed information
from his pfsync boxes. He is using two peering firewalls.
He noticed there was excessive amount of pfsync traffic
exchanged between peers. Poking further we discovered
there were lot's of TCP states with peer end points in:
ESTABLISHED : SYN_SENT
FIN_WAIT_2 : SYN_SENT
* : SYN_SENT
on the secondary firewall peer. All those 'malformed' states were received
from primary firewall peer.
Looking at how pfsync is processing state update message from its peer
we soon find a function pfsync_upd_tcp() which reads as follows:
2872 static int
2873 pfsync_upd_tcp(struct pf_state *st, const struct pfsync_state_peer *src,
2874 const struct pfsync_state_peer *dst)
2875 {
2876 int sync = 0;
2877
2878 /*
2879 * The state should never go backwards except
2880 * for syn-proxy states. Neither should the
2881 * sequence window slide backwards.
2882 */
2883 if ((st->src.state > src->state &&
2884 (st->src.state < PF_TCPS_PROXY_SRC ||
2885 src->state >= PF_TCPS_PROXY_SRC)) ||
2886
2887 (st->src.state == src->state &&
2888 SEQ_GT(st->src.seqlo, ntohl(src->seqlo))))
2889 sync++;
2890 else
2891 pf_state_peer_ntoh(src, &st->src);
2892
2893 if ((st->dst.state > dst->state) ||
2894
2895 (st->dst.state >= TCPS_SYN_SENT &&
2896 SEQ_GT(st->dst.seqlo, ntohl(dst->seqlo))))
2897 sync++;
2898 else
2899 pf_state_peer_ntoh(dst, &st->dst);
2900
2901 return (sync);
2902 }
lines 2883 - 2891 handle source TCP peer in state. lines 2893 - 2899
update TCP state for destination peer which is the one stuck in SYN_SENT.
The condition at line 2895 looks odd. It should be the same check
we do for source connection at line 2887.
To see what's going on there we need to better understand a context here.
the st->src is TCP stage for client we keep on secondary firewall, 'src'
comes from update we got from primary firewall peer.
st->dst is TCP state for server side connection we keep on secondary firewall,
'dst' comes from firewall peer.
The condition here should just prevent peering firewall (secondary)
firewall to move state backwards (dst.state > dst->state),
We also don't want to update state with old sequence number
(SEQ_GT(dst.seqlo, dst->seqlo). In other cases we want to call
pf_state_peer_ntoh() to update state.
The check 'dst.state >= TCPS_SYN_SENT' is wrong here as it prevents
pfsync to call pf_state_peer_ntoh() which updates the state.
The check should be in fact same as we have for src peer:
st->dst.state == dst->state
we do check sequence numbers if there is a tie on state.
the onliner tweak is bellow. I still don't know the exact
mechanism how pfsync-update storm gets triggered. fixing
pfsync_upd_tcp() seems to fix the cause of update storm.
OK to commit ?
thanks and
regards
sashan
[1] https://marc.info/?l=openbsd-bugs&m=170661373220327&w=2
--------8<---------------8<---------------8<------------------8<--------
diff --git a/sys/net/if_pfsync.c b/sys/net/if_pfsync.c
index 4ad51489509..0a3fcbfb9f6 100644
--- a/sys/net/if_pfsync.c
+++ b/sys/net/if_pfsync.c
@@ -2892,7 +2892,7 @@ pfsync_upd_tcp(struct pf_state *st, const struct pfsync_state_peer *src,
if ((st->dst.state > dst->state) ||
- (st->dst.state >= TCPS_SYN_SENT &&
+ (st->dst.state == dst->state &&
SEQ_GT(st->dst.seqlo, ntohl(dst->seqlo))))
sync++;
else
pfsync does not update dst peer's state