
How to Implement Server-Side Sharded List and Watch in Kubernetes 1.36

Published: 2026-05-08 14:47:20 | Category: Cloud Computing

Introduction

As Kubernetes clusters scale to tens of thousands of nodes, controllers that watch high-cardinality resources like Pods face a fundamental scaling challenge. Each replica of a horizontally scaled controller receives the entire event stream from the API server, deserializing every object only to discard those it's not responsible for. This wastes CPU, memory, and network bandwidth. Kubernetes 1.36 introduces an alpha feature—server-side sharded list and watch (KEP-5866)—that moves filtering upstream into the API server. With this feature enabled, each controller replica tells the API server which hash range it owns, and the API server sends only the matching events. This guide walks you through enabling and using this feature in your controllers.


What You Need

  • A Kubernetes cluster running v1.36 or later (alpha features must be enabled)
  • Cluster-admin permissions to enable feature gates
  • A controller that uses client-go informers to list and watch resources (e.g., custom controllers or kube-state-metrics)
  • Familiarity with Go programming and Kubernetes controller patterns
  • A deployment strategy for multiple controller replicas (e.g., StatefulSet or Deployment with consistent replica count)

Step-by-Step Guide

Step 1: Verify Kubernetes Version and Enable the Alpha Feature Gate

First, ensure your cluster's API server is running v1.36 or newer. Then enable the ShardedListWatch feature gate. Add the flag --feature-gates=ShardedListWatch=true to the kube-apiserver configuration. If using kubeadm, edit the static pod manifest or update the kubeadm configuration file. For managed services like EKS or AKS, check provider documentation for enabling alpha features. After restarting the API server, verify the feature is active by checking the API server logs for messages about shard support.

Step 2: Determine the Number of Replicas and Their Hash Ranges

Decide how many controller replicas you want. Each replica will handle a contiguous portion of the 64-bit hash space (0 to 2^64-1). Compute the start and end values for each replica. For example, with 2 replicas: Replica 0 handles [0x0000000000000000, 0x8000000000000000) and Replica 1 handles [0x8000000000000000, 0xFFFFFFFFFFFFFFFF]. For 4 replicas, split equally: each gets a quarter of the space. Store these ranges in a configuration map or compute them programmatically based on the replica index (e.g., via a StatefulSet's pod ordinal). The hash is computed using FNV-1a, so the same field (e.g., metadata.uid) always maps to the same shard.
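The range computation above can be sketched in Go. Note that shardBounds is a helper name of my own, not part of any Kubernetes library; the replica index and count would come from your deployment configuration, as described in Step 4:

```go
package main

import (
	"fmt"
	"math"
)

// shardBounds splits the 64-bit hash space into n near-equal contiguous
// ranges and returns the bounds for replica i (0-based): start is
// inclusive, end is exclusive, except for the last replica, whose end is
// capped at math.MaxUint64 (inclusive).
func shardBounds(i, n uint64) (start, end uint64) {
	width := math.MaxUint64/n + 1
	start = i * width
	if i == n-1 {
		return start, math.MaxUint64
	}
	return start, start + width
}

func main() {
	const replicas = 2
	for i := uint64(0); i < replicas; i++ {
		start, end := shardBounds(i, replicas)
		fmt.Printf("replica %d: 0x%016X..0x%016X\n", i, start, end)
	}
	// replica 0: 0x0000000000000000..0x8000000000000000
	// replica 1: 0x8000000000000000..0xFFFFFFFFFFFFFFFF
}
```

For n = 2 this reproduces the split shown above; the remainder when 2^64 does not divide evenly by n is absorbed by the last replica.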

Step 3: Modify Your Controller's Informer Setup to Include a Shard Selector

Update your controller code to inject the shard selector into the informer's ListOptions. Use the WithTweakListOptions option when creating the shared informer factory. The shard selector is a string like:

shardRange(object.metadata.uid, '0x0000000000000000', '0x8000000000000000')

Replace the range bounds with the values computed in Step 2. Here's an example in Go:

import (
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/informers"
)

// Range for this replica, computed as in Step 2 (replica 0 of 2 shown here).
shardSelector := "shardRange(object.metadata.uid, '0x0000000000000000', '0x8000000000000000')"

// client is a kubernetes.Interface; resyncPeriod is a time.Duration.
factory := informers.NewSharedInformerFactoryWithOptions(client, resyncPeriod,
    informers.WithTweakListOptions(func(opts *metav1.ListOptions) {
        // Injected into every list and watch request this factory issues.
        opts.ShardSelector = shardSelector
    }),
)

If your controller uses individual informers (not a factory), set the ShardSelector field on the ListOptions passed to the NewInformer or NewFilteredSharedIndexInformer call. Currently supported field paths are object.metadata.uid and object.metadata.namespace. Choose the one that best distributes your workload.

Step 4: Deploy the Controller Replicas with the Correct Shard Selectors

Deploy your controller as a StatefulSet or Deployment with the desired number of replicas. For each replica, pass its assigned hash range via an environment variable or command-line argument. In your controller startup code, read this value and construct the shard selector string accordingly. For example, with a StatefulSet, you can use the pod's hostname to derive the replica index. Configure the deployment so that each pod knows its unique range and never overlaps with another pod's range. Ensure the number of replicas is stable to avoid coverage gaps or duplication.
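As a sketch, the replica's startup code might derive its index from the StatefulSet hostname (StatefulSets name pods <name>-<ordinal>) and format the selector string from Step 3. Both replicaIndexFromHostname and buildShardSelector are hypothetical helper names, and the hostname is hardcoded here for illustration; a real pod would call os.Hostname():

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// replicaIndexFromHostname extracts the ordinal from a StatefulSet pod
// hostname such as "my-controller-2".
func replicaIndexFromHostname(hostname string) (uint64, error) {
	dash := strings.LastIndex(hostname, "-")
	if dash < 0 {
		return 0, fmt.Errorf("hostname %q has no ordinal suffix", hostname)
	}
	return strconv.ParseUint(hostname[dash+1:], 10, 64)
}

// buildShardSelector formats the selector string from Step 3 for the given
// hash-range bounds.
func buildShardSelector(start, end uint64) string {
	return fmt.Sprintf("shardRange(object.metadata.uid, '0x%016X', '0x%016X')", start, end)
}

func main() {
	// A real pod would use os.Hostname(); hardcoded here for illustration.
	i, err := replicaIndexFromHostname("my-controller-1")
	if err != nil {
		panic(err)
	}
	fmt.Println("replica index:", i)
	// Bounds for replica 1 of 2, as computed in Step 2.
	fmt.Println(buildShardSelector(0x8000000000000000, 0xFFFFFFFFFFFFFFFF))
}
```

The total replica count still has to come from somewhere stable (e.g., an environment variable set alongside the StatefulSet's replicas field) so that index and count stay in sync.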

Step 5: Test and Monitor the Sharded Watch Behavior

After deployment, verify that each replica receives only a subset of events. Enable verbose logging in the API server (e.g., -v=6) to confirm that shard filtering is applied. Check your controller's metrics: each replica should process far fewer objects than in a non-sharded setup. Observe CPU and memory usage on the API server and the controller pods (e.g., with kubectl top) to confirm the expected reduction. To ensure correctness, confirm that the sum of objects cached across all replicas matches the total number of objects in the cluster. Finally, test that watch events are filtered correctly: create and delete Pods and confirm that only the responsible replica reacts.
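To predict which replica should react to a given object, you can reproduce the hash locally. This sketch assumes the API server applies 64-bit FNV-1a to the raw bytes of the selected field value, consistent with the FNV-1a note in Step 2, but verify that detail against the KEP; shardOf is a helper name of my own:

```go
package main

import (
	"fmt"
	"hash/fnv"
	"math"
)

// shardOf returns the replica index (0-based, out of n) that owns key,
// assuming the server hashes the field's raw bytes with 64-bit FNV-1a and
// the hash space is split into n equal contiguous ranges as in Step 2.
func shardOf(key string, n uint64) uint64 {
	h := fnv.New64a()
	h.Write([]byte(key)) // hashing the raw bytes is an assumption of this sketch
	width := math.MaxUint64/n + 1
	return h.Sum64() / width
}

func main() {
	// Example UID, not taken from a real cluster.
	uid := "3f8c9a1e-0d2b-4c7e-9a6f-1b2c3d4e5f60"
	fmt.Printf("uid %s is owned by replica %d of 2\n", uid, shardOf(uid, 2))
}
```

Create a Pod, compute shardOf on its metadata.uid, and confirm that exactly that replica logs the event.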

Tips

  • Start with a small number of replicas—2 or 4—to validate the sharding logic before scaling out further.
  • Consider using metadata.namespace if your workload is naturally partitioned by namespace. This can simplify range assignment if you have a known number of namespaces.
  • Handle rebalancing carefully: if you change the number of replicas, every pod must update its shard range simultaneously to avoid missing events. One approach is to use a rolling update that first adds new replicas with empty ranges, then redistributes.
  • Monitor for hash skew: FNV-1a is deterministic and generally distributes well, but an uneven spread can occur if the hashed field values are not uniformly random. Using a larger number of shards with narrower ranges can mitigate this.
  • Performance impact: The API server performs an additional hash computation per object. This overhead is negligible compared to the savings from reduced data transfer, but benchmark in your environment.
  • Security: The shard selector is applied at the API server level; ensure RBAC permissions still enforce namespace isolation if needed.
  • Fallback plan: If the alpha feature is disabled or removed, your controller should still work without sharding (i.e., ignore the ShardSelector field).