Background

In data center networks, Priority Flow Control (PFC) is a mechanism defined in IEEE 802.1Qbb that allows a receiver to pause traffic on a per-priority basis, enabling lossless transport for high-priority traffic such as RDMA over Converged Ethernet (RoCE).

When a switch’s ingress buffer fills past its threshold, the switch sends a PFC PAUSE frame to the upstream sender. The sender does not stop immediately: data already on the wire, plus anything transmitted before the sender processes the PAUSE, still arrives at the switch. This in-flight data must be absorbed by the headroom buffer.

If the headroom buffer is too small, the switch will drop packets even though PFC is enabled, defeating the purpose of lossless transport. Headroom testing verifies that the DUT’s buffer configuration is sufficient to handle this scenario without any packet loss.
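
As a rough illustration of why the required headroom grows with link speed and cable length, the sketch below estimates how many bytes can still arrive after a PAUSE is sent. The formula and every default in it (about 5 ns of propagation per meter, a fixed sender response delay, one maximum-size frame in each direction) are simplifying assumptions for illustration, not the sizing method used by SONiC or by any switch vendor.

```python
def estimate_headroom_bytes(link_gbps, cable_m, mtu_bytes=9216,
                            response_ns=1000.0, ns_per_m=5.0):
    """Rough upper bound on bytes in flight after a PAUSE is sent.

    All defaults are illustrative assumptions, not vendor figures.
    """
    # Round trip: the PAUSE travels upstream while data already sent
    # keeps travelling downstream.
    round_trip_ns = 2 * cable_m * ns_per_m
    # 1 Gbps equals 1 bit per nanosecond, so bits = Gbps * ns.
    in_flight_bits = link_gbps * (round_trip_ns + response_ns)
    # The sender may finish one frame before pausing, and the PAUSE itself
    # may wait behind one frame the receiver is already transmitting.
    return in_flight_bits / 8 + 2 * mtu_bytes


# Example: 100G link over a 300 m cable with the assumed defaults.
print(estimate_headroom_bytes(100, 300))
```

Under these assumptions, doubling the link speed or the cable length doubles the in-flight term, which is why a headroom setting that works on a short link may not be enough on a long one.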


Test Cases

The SONiC project includes two headroom test cases in tests/snappi_tests/pfc/test_pfc_pause_response_with_snappi.py:

Test Case                                 Description
test_pfc_single_lossless_headroom         Verify headroom buffer capacity for a single lossless priority under PFC pause
test_pfc_pause_multi_lossless_headroom    Verify headroom buffer capacity for multiple lossless priorities under PFC pause

Testbed

The test uses a standard tgen topology (Traffic Generator connected to DUT):

TGEN Tx Port (Snappi) ----> [Ingress Port] SONiC DUT Switch <---- PFC PAUSE frames ---- TGEN Rx Port

  • DUT: SONiC switch (e.g. Dell S6100, Cisco 8000 series)
  • Traffic Generator: Ixia / Keysight using Snappi (OTG API)
  • Topology marker: pytest.mark.topology('tgen')

Key fixtures used:

  • snappi_api — OTG session handle
  • snappi_testbed_config — testbed port configuration
  • enum_dut_lossless_prio — lossless priority under test (e.g. priority 3)
  • enum_pfc_pause_delay_test_params — PFC delay value and expected result, e.g. "tag|200|True"
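
The parameter string from the last fixture can be unpacked as sketched below. This is a minimal sketch that assumes the three pipe-separated fields are a label, the pause delay in quanta, and the expected result; `HeadroomParam` and `parse_headroom_param` are hypothetical names, not helpers from sonic-mgmt.

```python
from typing import NamedTuple


class HeadroomParam(NamedTuple):
    tag: str              # free-form label (assumed meaning)
    pause_delay: int      # PFC pause delay, in quanta
    expect_no_loss: bool  # True = zero loss expected


def parse_headroom_param(raw: str) -> HeadroomParam:
    """Split a 'tag|delay|result' string into typed fields."""
    tag, delay, expected = raw.split("|")
    return HeadroomParam(tag, int(delay), expected.strip().lower() == "true")


print(parse_headroom_param("tag|200|True"))
```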

Traffic Model

The test constructs three types of flows:

Test Flow (lossless priority)

  • Direction: TGEN Tx → DUT ingress port
  • Priority: the lossless priority under test (e.g. CoS 3, DSCP 26)
  • Rate: line rate
  • Purpose: fill the DUT’s ingress buffer and trigger PFC

Background Flows (all other priorities)

  • Direction: TGEN Tx → DUT ingress port
  • Priority: all priorities except the test priority
  • Purpose: simulate realistic mixed traffic

PFC PAUSE Flow

  • Direction: TGEN Rx port → DUT egress port (reverse direction)
  • Frame type: IEEE 802.1Qbb PFC PAUSE frame
  • Pause quanta: configurable via pfc_pause_delay parameter
  • Class Enable Vector: set for the lossless priority under test
  • Purpose: pause the DUT’s egress, causing ingress buffer to build up

pfc_pause_delay = 200   # pause quanta value per frame
# 1 quanta = 512 bit times
# At 100G: 1 quanta = 5.12 ns
# 200 quanta = 1.024 µs pause duration
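
The relationship between pause quanta and wall-clock time can be checked with a short helper. One quantum is 512 bit times, so the duration depends only on the link speed; the function name `pause_quanta_to_ns` is made up for this illustration.

```python
def pause_quanta_to_ns(quanta: int, link_gbps: float) -> float:
    """Convert PFC pause quanta to nanoseconds for a given link speed."""
    # One quantum is 512 bit times; at 1 Gbps one bit time is 1 ns.
    return quanta * 512 / link_gbps


print(pause_quanta_to_ns(1, 100))    # one quantum at 100G
print(pause_quanta_to_ns(200, 100))  # 200 quanta at 100G
print(pause_quanta_to_ns(200, 25))   # same quanta on a slower 25G link
```

The same quanta value pauses four times longer at 25G than at 100G, so a delay value should always be read together with the port speed.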

Test Method

The core logic is driven by run_pfc_test() in tests/snappi_tests/pfc/files/helper.py.

Step-by-step flow:

1. Configure OTG (Open Traffic Generator)

run_pfc_test(
    api=snappi_api,
    ...
    global_pause=False,           # per-priority PFC, not global pause
    pause_prio_list=[lossless_prio],
    test_prio_list=[lossless_prio],
    bg_prio_list=bg_prio_list,
    test_traffic_pause=True,      # expect traffic to be paused
    snappi_extra_params=snappi_extra_params
)

2. Set headroom test params

headroom_test_params = [pfc_pause_delay, headroom_test_result]
# pfc_pause_delay: how long to pause (in quanta)
# headroom_test_result: True = expect zero loss, False = expect loss
snappi_extra_params.headroom_test_params = headroom_test_params

3. Traffic sequence

  1. Start test traffic (lossless priority) at line rate
  2. After buffer fills, inject PFC PAUSE frames from Rx side
  3. DUT receives PAUSE, stops forwarding on that priority
  4. In-flight packets must be absorbed by headroom buffer
  5. After pause expires, traffic resumes

4. Result check

  • If headroom_test_result = True: assert zero packet loss on the lossless priority
  • If headroom_test_result = False: assert that packet loss does occur (buffer overflow expected)
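
The two-way check above can be sketched as a small predicate. This is an illustration of the logic only, not the code in helper.py; `headroom_check_passed` and its arguments are hypothetical.

```python
def headroom_check_passed(tx_frames: int, rx_frames: int,
                          expect_no_loss: bool) -> bool:
    """Return True when observed loss matches the expected outcome."""
    lost = tx_frames - rx_frames
    no_loss = lost == 0
    # Pass when "no loss observed" agrees with "no loss expected".
    return no_loss == expect_no_loss


# Zero loss where zero loss is expected: pass.
print(headroom_check_passed(1_000_000, 1_000_000, expect_no_loss=True))
# Loss where loss is expected (headroom deliberately exceeded): pass.
print(headroom_check_passed(1_000_000, 990_000, expect_no_loss=False))
```

Note that the negative case matters as much as the positive one: a run that expects loss but sees none suggests the pause delay did not actually stress the headroom.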

Test Result Interpretation

Scenario      pfc_pause_delay   Expected Result   Meaning
Small delay   e.g. 100          True (no loss)    In-flight packets fit within headroom
Large delay   e.g. 2000         False (loss)      In-flight exceeds headroom, drops expected

A typical passing result looks like:

Test flow Tx: 1,000,000 packets
Test flow Rx: 1,000,000 packets
Loss: 0 packets (0.0%)
PFC PAUSE frames sent: 5000
RESULT: PASS

Packet Capture Analysis

During the test, you can capture at the DUT ingress/egress ports to observe the behavior:

PFC PAUSE frame structure (IEEE 802.1Qbb)

Ethernet Header
  Dst MAC: 01:80:c2:00:00:01 (PFC multicast address)
  Src MAC: <DUT port MAC>
  EtherType: 0x8808 (MAC Control)
MAC Control
  Opcode: 0x0101 (PFC)
  Class Enable Vector: 0x0008 (bit 3 set = priority 3)
  Pause Quanta [0-7]: [0, 0, 0, 200, 0, 0, 0, 0]
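
To make the field layout concrete, the sketch below packs a PFC frame with Python's standard `struct` module and reads the fields back. It is an illustration of the wire format only, not how the Snappi tests generate PAUSE frames; the source MAC is a placeholder and `build_pfc_frame` is a hypothetical helper.

```python
import struct

PFC_DST_MAC = bytes.fromhex("0180c2000001")  # PFC multicast address
ETHERTYPE_MAC_CONTROL = 0x8808
OPCODE_PFC = 0x0101


def build_pfc_frame(src_mac: bytes, quanta_by_prio: dict) -> bytes:
    """Build a PFC frame; quanta_by_prio maps priority (0-7) to pause quanta."""
    vector = 0
    quanta = [0] * 8
    for prio, value in quanta_by_prio.items():
        vector |= 1 << prio      # set the enable bit for this priority
        quanta[prio] = value
    header = PFC_DST_MAC + src_mac + struct.pack("!H", ETHERTYPE_MAC_CONTROL)
    payload = struct.pack("!HH8H", OPCODE_PFC, vector, *quanta)
    # Pad to the 60-byte minimum Ethernet frame size (FCS not included).
    return (header + payload).ljust(60, b"\x00")


# Placeholder source MAC; pause priority 3 for 200 quanta.
frame = build_pfc_frame(bytes.fromhex("001122334455"), {3: 200})
opcode, vector = struct.unpack("!HH", frame[14:18])
print(hex(opcode), hex(vector), struct.unpack("!8H", frame[18:34]))
```

When reading a capture, the Class Enable Vector and the non-zero entry in the quanta array should always point at the same priority.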

Key observations from capture

  1. Before PFC: test flow Tx and Rx counters increment at line rate
  2. PFC PAUSE received by sender: Tx stops, Rx drops to 0 — in-flight packets still in pipe
  3. Headroom absorption: these in-flight packets arrive at DUT and are buffered in headroom
  4. After pause expires: traffic resumes, all buffered packets eventually forwarded
  5. Zero loss: all packets accounted for — headroom was sufficient

If headroom is insufficient, you will see:

  • Rx counter falls behind Tx counter during pause period
  • BUFFER_POOL_WATERMARK on DUT shows headroom exhaustion
  • Packets dropped at ingress, not recoverable

DUT-side verification

# Check PFC counters on DUT
show pfc counters

# Check buffer configuration and buffer pool information
show buffer configuration interface Ethernet0
show buffer information

# Check queue drop counters
show queue counters Ethernet0

Multi-Priority Headroom Test

test_pfc_pause_multi_lossless_headroom extends the single-priority case:

  • All lossless priorities (typically priority 3 and 4) are paused simultaneously
  • Background traffic uses all lossy priorities
  • Headroom must be sufficient for all lossless priorities combined

This is a stricter test — total in-flight traffic is larger, putting more pressure on shared headroom buffers.

pause_prio_list = lossless_prio_list   # e.g. [3, 4]
test_prio_list = lossless_prio_list
bg_prio_list = lossy_prio_list         # e.g. [0, 1, 2, 5, 6, 7]
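
The split between lossless and lossy priorities, and the Class Enable Vector that pauses several priorities at once, can be derived as sketched here. In the actual tests these lists come from fixtures; `split_priorities` and `class_enable_vector` are hypothetical helpers used only to show the relationship.

```python
def split_priorities(lossless_prio_list, num_prios=8):
    """Return (lossless, lossy) priority lists for an 8-priority port."""
    lossless = sorted(set(lossless_prio_list))
    lossy = [p for p in range(num_prios) if p not in lossless]
    return lossless, lossy


def class_enable_vector(prio_list):
    """Bitmask with one bit set per paused priority."""
    vector = 0
    for prio in prio_list:
        vector |= 1 << prio
    return vector


lossless, lossy = split_priorities([3, 4])
print(lossless, lossy, hex(class_enable_vector(lossless)))
```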

Summary

PFC headroom testing is one of the most critical validations for any lossless network deployment. The SONiC snappi test suite provides a clean, automated way to verify headroom sufficiency using OTG APIs, with parameterized delay values that map directly to real-world cable latency and propagation delay scenarios.

Key takeaways:

  • Headroom = buffer to absorb in-flight packets after PAUSE is sent
  • Insufficient headroom → packet loss even with PFC enabled
  • Test parameterization (pfc_pause_delay) simulates different link distances and RTT scenarios
  • Both single-priority and multi-priority scenarios must be validated

Source: SONiC sonic-mgmt, tests/snappi_tests/pfc/test_pfc_pause_response_with_snappi.py