Providing the benefits of both without the downsides :)
(ABI breaks or external dependencies)
For this Anthonys rred is equipped with:
- magic-filename-pickup of patches rather than explicit messages
- use of FileFd instead of FILE* to get on-the-fly uncompress
of the gzip compressed pdiff patches
The acquire code in turn stops checking for apt-file's helper
as our own rred is now clever enough for our needs.
Based on the idea presented in:
It reads all patches one by one and merges them in-memory before
applying the merged changes to the index.
Beware: This commit by David Kalnischkies rips out the rred binary
rewrite unchanged (expect minor format issue corrections) from the
proposed changes, so this commit alone BREAKS pdiff completely.
The integration into the acquire system as it was prepared in the
previous POC will be done in the next commit to have proper 'blame'.
In 51fc6def77 we limited the amount of
pdiff to be downloaded per index to 20. This was a compromise between
not letting it go overboard (becoming even slower) and not using
bandwidth needlessly. Now that with the POC the speed reason is gone it
makes sense again to download as much files as we possible can via pdiff
to save bandwidth (and possibly even time).
It also avoids problems with the limit in cases we were we deal with a
server merged archieve as this limit assumes a strict patch progression.
The idea of pdiffs is to avoid downloading the hole file by patching the
existing index. This works very well, but becomes slow if a lot of
patches needs to be applied to reconstruct an up-to-date index and in
recent years more and more dinstall (or similar) runs are executed
creating more and more pdiffs in the same amount of time, so pdiffs
became less useful.
The solution is simple: Reduce the amount of patches (which are very
small) which need to be applied on top of the index we have available
(which is usually pretty big).
This can be done in two ways: Either merge the patches on the
server-side so that the client has to download only one patch or the
patches are all downloaded and merged on the client-side.
The first needs a client who is doing one step at a time who can also
skip patches if it needs (APT supports this for a long time now).
The later is implemented by this commit, but depends on the server NOT
merging the patches and the patches being in a strict order in which no
patch is skipped.
This is traditionally the case for dak, but other repository creators
support merging – e.g. reprepro (which helpfully adds a flag indicating
that the patches are merged). To support both or even mixes a client
needs more information which isn't available for now.
This POC uses the external diffindex-rred included in apt-file to
do the heavy lifting of merging & applying all patches in one pass,
hence to test this feature apt-file needs to be installed.