Skip to content

fetch: add --must-have and remote.*.mustHave#2085

Open
derrickstolee wants to merge 4 commits intogitgitgadget:masterfrom
derrickstolee:must-have
Open

fetch: add --must-have and remote.*.mustHave#2085
derrickstolee wants to merge 4 commits intogitgitgadget:masterfrom
derrickstolee:must-have

Conversation

@derrickstolee
Copy link
Copy Markdown

@derrickstolee derrickstolee commented Apr 7, 2026

Fetch negotiation aims to find enough information from haves and wants such that the server can be reasonably confident that it will send all necessary objects and not too many "extra" objects that the client already has. However, this can break down if there are too many references, since Git truncates the list of haves based on a few factors (a 256 count limit or the server sending an ACK at the right time).

We already have the --negotiation-tip feature to focus the set of references that are used in negotiation, but I feel like this is designed backwards. I'd rather that we have a way to say "this is an important set of refs, but feel free to add more refs if needed" than "only use these refs for negotiation".

Here's an example that demonstrates the problem. In an internal monorepo, developers work off of the 'main' branch so there are thousands of user branches that each add a few commits different from the 'main' branch. However, there is also a long-lived 'release' branch. This branch has a first-parent history that is parallel to 'main' and each of those commits is a merge whose second parent is a commit from 'main' that had a successful CI run. There are additional changes in the 'release' branch merge commits that add some changelog data, so there is a nontrivial set of novel blob content in that branch and not just a different set of commits.

The problem we had was that our georeplication system was regularly fetching from the origin and trying to get all data from all reachable branches. When the 'release' branch updated, the client would run out of haves before advertising its copy of the 'release' branch, but it would still list the new 'release' tip as a want. The server would then think that the client had never fetched that branch before and would send all of the changelog data from the whole history of the repo. (This led to a lot of downstream problems; we mitigated by setting a refspec that stopped fetching the 'release' branch, but this is not ideal.)

What I'd like is a mechanism to say "always advertise the client's version of 'main' and 'release' but also opportunistically include some user branches".

Based on my understanding, the '--negotiation-tip' option is close but not quite what I want. I could have the client only advertise 'release' and 'main' and never advertise any user branches. But then we'd download all content from each user branch every time it updates. Perhaps this would happen even with opportunistic inclusion of more haves, but I'd like to explore this area more.

There's also an issue that the '--negotiation-tip' feature doesn't seem to have a config key that enables it without CLI arguments. This is something that we could consider independently.

This patch series adds a new '--must-have' argument in the same places that '--negotiation-tip' exists. This adds a set of references that will always be included in the negotiation as haves, but then the rest of the negotiation can proceed as normal. It also adds a new 'remote.<name>.mustHave' config option to enable this behavior without CLI arguments.

The series is organized into four changes:

  1. The first change updates a test that started failing after I added some tests before it. This fix should be valuable on its own, but I need it for my tests to pass.
  2. The second change adds the --must-have option for fetch and tests it relative to --negotiation-tip.
  3. The third change adds the config option to enable --must-have by default. Any use in the CLI completely overrides the config option.
  4. The fourth change adds capabilities for the config to update negotiation during git push.

I've CC'd some folks who have been working on or near the --negotiation-tip feature for feedback as to how these things fit together in the bigger picture.

Thanks,
-Stolee

cc: gitster@pobox.com
cc: jonathantanmy@google.com
cc: chooglen@google.com
cc: ps@pks.im

The 'fetch follows tags by default' test sorts using 'sort -k 4', but
for-each-ref output only has 3 columns. This relies on sort treating
records with fewer fields as having an empty fourth field, which may
produce unstable results depending on locale. Use 'sort -k 3' to match
the actual number of columns in the output.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Add a --must-have option to git fetch that specifies ref patterns whose
tips should always be sent as "have" commits during negotiation,
regardless of what the negotiation algorithm selects.

Each value is either an exact ref name (e.g. refs/heads/release) or a
glob pattern (e.g. refs/heads/release/*). The pattern syntax is the same
as for --negotiation-tip.

This is useful when certain references are important for negotiation
efficiency but might be skipped by the negotiation algorithm or excluded
by --negotiation-tip. Unlike --negotiation-tip which restricts the have
set, --must-have is additive: the negotiation algorithm still runs and
advertises its own selected commits, but the refs matching --must-have
are sent unconditionally on top of those.

If --negotiation-tip is used, the have set is first restricted by that
option and then increased to include the tips specified by --must-have.

Due to the comparision with --negotiation-tip, a previously untranslated
warning around --negotiation-tip is converted into a translatable string
with a swap for which option that is relevant.

Getting this functionality to work requires moving these options through
the transport API layer.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Add a new multi-valued config option remote.<name>.mustHave that
specifies ref patterns whose tips should always be sent as "have"
commits during fetch negotiation with that remote.

Parse the option in handle_config() following the same pattern as
remote.<name>.serverOption. Store the values in a string_list on struct
remote so they are available per-remote.

In builtin/fetch.c, when no --must-have options are given on the command
line, use the remote.<name>.mustHave config values as the default. If
the user explicitly provides --must-have on the CLI, the config is not
used, giving CLI precedence.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
When push.negotiate is enabled, send-pack spawns a 'git fetch
--negotiate-only' subprocess to discover common commits. Previously
this subprocess had no way to include must-have refs in the
negotiation.

Add a must_have field to send_pack_args, set it from the transport
layer where the remote struct is available, and pass explicit
--must-have arguments to the negotiation subprocess. This approach
directly passes the resolved config values rather than relying on the
subprocess to read remote config, which is more robust when the URL
alone is used as the remote identifier.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
@derrickstolee derrickstolee changed the title fetch: add --must-have and remote.name.mustHave fetch: add --must-have and remote.*.mustHave Apr 8, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant