Preamble
I’m using git as a version control system for a paper that my lab is writing, in LaTeX. There are several people collaborating.
I’m running into git being stubborn about how it merges. Let’s say two people have each made a single-word change to the same line and then attempt to merge. Though git diff --word-diff can SHOW the difference between the branches word by word, git merge cannot perform the merge word by word and instead reports a conflict that has to be resolved manually.
With a LaTeX document this is particularly annoying, as the common habit when writing LaTeX is to write a full paragraph per line and let your text editor handle word wrapping for display. We are working around this for now by adding a newline after each sentence, so that git can at least merge changes to different sentences within a paragraph. But it still gets confused by multiple changes within a sentence, and of course the text no longer wraps nicely.
The Question
Is there a way to git merge two files “word by word” rather than “line by line”?
Here’s a solution in the same vein as sehe’s, with a few changes which hopefully will address your comments:
As in sehe’s solution, make (or append to) a .gitattributes file.
Now to implement the clean and smudge filters.
I’ve created a test file with the following contents; notice the one-line paragraph.
After we commit it to the local repo, we can see the raw contents.
So the rules of the clean filter are: whenever it finds a string of text that ends with . or ? or ! or '' (that’s the LaTeX way to close double quotes) followed by a space, it adds %NL% and a newline character. But it ignores lines that start with \ (LaTeX commands) or contain a comment anywhere (so that comments cannot become part of the main text).
The smudge filter removes %NL% and the newline.
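The filter code itself is not shown here, so the rules above can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the original implementation: the names clean, smudge, MARKER and SENTENCE_END are made up for illustration, a line is treated as "containing a comment" if it contains a % anywhere (which also skips lines with an escaped \%), and the space after the sentence end is kept in front of %NL% so that smudging restores the paragraph exactly.

```python
import re
import sys

MARKER = "%NL%"  # marker from the post; LaTeX treats it as a comment
# A sentence end is ., ?, ! or '' followed by a single space.
SENTENCE_END = re.compile(r"([.?!]|'') ")


def clean(text: str) -> str:
    """Split paragraphs into one sentence per line, tagging each break."""
    out = []
    for line in text.splitlines(keepends=True):
        # Assumption: skip LaTeX command lines and any line containing '%'.
        # Lines already tagged with %NL% contain '%', so clean is idempotent.
        if line.startswith("\\") or "%" in line:
            out.append(line)
        else:
            out.append(SENTENCE_END.sub(r"\1 " + MARKER + "\n", line))
    return "".join(out)


def smudge(text: str) -> str:
    """Undo clean by removing every marker together with its newline."""
    return text.replace(MARKER + "\n", "")


# Git filters read the file on stdin and write the result to stdout.
if __name__ == "__main__" and len(sys.argv) == 2:
    if sys.argv[1] == "clean":
        sys.stdout.write(clean(sys.stdin.read()))
    elif sys.argv[1] == "smudge":
        sys.stdout.write(smudge(sys.stdin.read()))
```

Because the marker starts with %, LaTeX reads it as a comment that swallows the line break, which is why the file compiles the same way in either state. Note that this simple rule also splits after abbreviations such as "e.g. "; that is harmless for merging but worth knowing.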
Diffing and merging are done on the ‘clean’ files, so changes to paragraphs are merged sentence by sentence. This is the desired behavior.
The nice thing is that the LaTeX file should compile in either the clean or smudged state, so there is some hope for collaborators to not need to do anything. Finally, you could put the git config commands in a shell script that is part of the repo so a collaborator would just have to run it in the root of the repo to get configured.
That last little bit is a hack because when this script is first run, the branch is already checked out (in the clean form) and it doesn’t get smudged automatically.
You can add this script and the .gitattributes file to the repo; new users then just need to clone and run the script in the root of the repo.
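The setup script is not reproduced here, so the following is only a sketch of what such a script might do, written in Python rather than shell. Everything named in it is an assumption: the filter name latexsentences, the filter script latex_filters.py, and the matching .gitattributes line are hypothetical. The re-checkout step is also a substitute for whatever the original hack was: it empties the index and resets, which is one commonly used way to make git rewrite the working files through the smudge filter, and it discards uncommitted changes, so it is only reasonable right after a fresh clone.

```python
import subprocess

# Hypothetical names; neither appears in the original post.
FILTER_NAME = "latexsentences"
FILTER_SCRIPT = "latex_filters.py"
# A matching .gitattributes line would be:  *.tex filter=latexsentences


def build_setup_commands(name=FILTER_NAME, script=FILTER_SCRIPT):
    """Return the git config commands (as argv lists) that register the filters."""
    return [
        ["git", "config", f"filter.{name}.clean", f"python3 {script} clean"],
        ["git", "config", f"filter.{name}.smudge", f"python3 {script} smudge"],
    ]


def build_resmudge_commands():
    """Return commands that force the working tree to be smudged again.

    WARNING: 'git reset --hard' throws away uncommitted changes.
    """
    return [
        ["git", "rm", "--cached", "-r", "-q", "."],  # empty the index
        ["git", "reset", "--hard"],                  # rewrite files via smudge
    ]


def run_setup(repo_root="."):
    """Run all commands in the repository root, stopping on the first failure."""
    for cmd in build_setup_commands() + build_resmudge_commands():
        subprocess.run(cmd, cwd=repo_root, check=True)
```

A collaborator would call run_setup() once from the root of a fresh clone. The filter settings live in the local git config, which is not copied by cloning, and that is why each collaborator has to run the script.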
I think this script even runs under Git for Windows if it is run from Git Bash.
Drawbacks: