Understanding and modifying the default diff for commits

Pierre-Yves David pierre-yves.david at ens-lyon.org
Wed Jun 10 10:04:56 UTC 2020


Your use case is a bit unclear to me. I'll ask some silly questions to 
try to clarify that.

As far as I understand, you are using KNIME to edit a "project", and KNIME 
itself generates a bunch of XML files. Whenever anything changes, KNIME 
rewrites all the files to update the author and last-update fields, 
regardless of whether they are affected by the change or not. Right?

I assume these XML files contain actual data, right? They are not 
generated by-products you could exclude, right?

The simplest option would probably be to run a small script that reverts 
these files before commit, or that simply deletes these fields, as another 
user suggested. You could configure it as a pre-status and pre-commit hook.
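
To make the "small script" idea concrete, here is a rough sketch in Python. 
It is only an illustration under assumptions: I do not know the exact layout 
of the KNIME files, so the `<entry key="..." .../>` pattern and the key names 
`lastAuthor` and `lastModifiedDate` are hypothetical and need to be adapted 
to the real files. It only relies on `hg status`, `hg cat` and `hg revert`, 
and has to be run from the repository root.

```python
import os
import re
import subprocess

# Hypothetical key names: check the real KNIME XML files and adjust.
VOLATILE_KEYS = (b"lastAuthor", b"lastModifiedDate")

# Assumes the volatile fields are self-closing <entry key="..." .../> elements.
_VOLATILE_RE = re.compile(
    rb'<entry\s+key="(?:'
    + b"|".join(re.escape(key) for key in VOLATILE_KEYS)
    + rb')"[^>]*/>'
)


def normalize(xml_bytes):
    """Return the file content with the volatile entries removed."""
    return _VOLATILE_RE.sub(b"", xml_bytes)


def only_metadata_changed(old, new):
    """True if old and new differ, but only inside the volatile entries."""
    return old != new and normalize(old) == normalize(new)


def revert_metadata_only_files():
    """Revert modified files whose only change is in the volatile entries."""
    status = subprocess.run(
        ["hg", "status", "--modified", "--no-status", "--print0"],
        check=True, capture_output=True,
    )
    for raw_path in filter(None, status.stdout.split(b"\0")):
        path = os.fsdecode(raw_path)
        # Content of the file in the working directory parent.
        old = subprocess.run(
            ["hg", "cat", "-r", ".", path],
            check=True, capture_output=True,
        ).stdout
        with open(path, "rb") as fh:
            new = fh.read()
        if only_metadata_changed(old, new):
            subprocess.run(["hg", "revert", "--no-backup", path], check=True)
```

A script that calls `revert_metadata_only_files()` could then be wired into 
the `[hooks]` section of the hgrc as a `pre-commit` hook. Note that the 
revert discards the new author and date in the untouched files, so KNIME 
will simply rewrite them on the next save.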

An interesting feature Mercurial has is filesets (see `hg help fileset`). It 
would probably be simple to create a fileset that detects your case and 
use it in situations where it helps.

The best solution for you would probably be to teach KNIME not to do this, 
but I guess you have tried that already.

Cheers,

On 6/8/20 7:32 PM, Andres Sommerhoff wrote:
> Hi all, I want to intervene in the diff operation used by the Mercurial 
> commit. I want to collect only the meaningful changes in a heavy directory 
> tree full of XML files (this makes it easier to audit what has really 
> changed, and saving some disk space by doing so doesn’t hurt either). I 
> searched the internet for a Mercurial add-on that could help, but I 
> was unsuccessful, so I am thinking of writing my own extension (or maybe 
> some scripting in a pre-commit hook).
> 
> I would appreciate any advice on where to start intervening in the diff 
> process during a Mercurial commit if I write my own extension. 
> Any help locating the diff code that Mercurial uses (to look 
> and learn how the interaction with it works)?
> 
> If you are curious about the problem I’m trying to deal with, it is 
> the software KNIME: projects (scientific models) developed in 
> that software are saved in several XML files, where each XML file 
> represents a small portion of the model (“nodes”, as they are called in 
> KNIME). One project can easily have more than 500 nodes (-> XML files). 
> If I change a single node and save the project, then not only is the 
> single related file changed, but all 500 XML files are updated as well. 
> Inside each XML file the “last modification date” and “last author” are 
> changed.
> 
> I’m looking to skip all the files in which the change updated only the 
> “last modification date” and “last author” and nothing else. By doing 
> so, I can focus on the important changes, making it easy to audit the 
> meaningful modifications; merges can be far less cumbersome, and the 
> history much cleaner when running log on a specific file.
> 
> Maybe a simple command-line option for commit is the solution, but I 
> see no official option in the commit command for using an 
> alternative diff tool to calculate the patches. On the other hand, as 
> far as I have read, the “extdiff” extension only affects the comparison 
> of revisions, not the commit process (maybe I’m wrong on this last 
> point). Or maybe a commit command line using the function 
> “diff([includepattern [, excludepattern]]):” in conjunction with 
> “--exclude” will do the magic I’m looking for, but I couldn’t figure 
> it out yet.
> 
> I’m on Windows 10, using Mercurial and TortoiseHg 5.0.2.
> 
> Regards, Andres
> 

-- 
Pierre-Yves David


