It is said, that once upon a time, programmers used CSV and collaborated on SourceForge, and that everyone was happy.
These days, are however, long gone in the mists of time as of 2020, and beyond Ciro Santilli's programming birth.
Except for hardware developers of course. The are still happily using Perforce and Tcl, and shall never lose their innocence. Blessed be their souls. Amen.
The fundamental insight of Git design is: a SHA represents not only current state, but also the full history due to the Merkle tree implementation, see notably:
This makes it so that you will always notice if you are overwriting history on the remote, even if you are developing from two separate local computers (or more commonly, two people in two different local computers) and therefore will never lose any work accidentally.
It is very hard to achieve that without the Merkle tree.
Consider for example the most naive approach possible of marking versions with consecutive numbers:
  • Local 1:
    • 0: root commit
    • 1: commit 1
    • 2: commit 2 by local 1
  • Local 2:
    • 0: root commit
    • 1: commit 1
    • 2: commit 2 by local 2
    • 3: commit 3 by local 2
  • Remote
    • 0: root commit
    • 1: commit 1
If Local 1 were to push to Remote first, how could Local 2 notice that when it tries to push itself? The navie method of just checking: "does Remote have commit "2"" does not work, because Local 2 has a different version of commit 2 than local 1.
Perfect Git integration belongs in integrated development environments :-)
Figure 1.
gitk 2.34.1 running on Ubuntu 22.04 with a simple repository.
This is good. But it misses some key operations, so much so that makes Ciro not want to learn/use it daily.
This is where Ciro Santilli stored his code since he started coding nonstop in 2013.
He does not like the closed source aspect of it, but hey, there are more important things to worry about, the network effect is just too strong.
Some amazing people have put book source codes on GitHub. This is a list of such repos.
The cheapest and most resilient way to publish text content humanity has achieved so far.
The heart/main innovation of GitHub!
GitLab was very important to Ciro because he wanted to base Booktree on it.
This is a quick presentation that goes over some of the most common difficulties people find with Git.
This is the most important thing to understand Git!
You must:
  • be able to visualize the commit tree
  • understand how each git command modifies the commit DAG
But not every directed acyclic graph is a tree.
Example of a tree (and therefore also a DAG):
5
|
4 7
| |
3 6
|/
2
|
1
Convention in this presentation: arrows implicitly point up, just like in a git log, i.e.:
  • 1 is parent of 2
  • 2 is parent of 3 and 6
  • 3 is parent of 4
and so on.
Example of a DAG that is not a tree:
7
|\
4 6
| |
3 5
|/
2
|
1
This is not a tree because there are two ways to reach 7:
  • 2, 3, 4, 7
  • 2, 5, 6, 7
But we often say "tree" intead of "DAG" in the context of Git because DAG sounds ugly.
Example of a graph that is not a DAG:
6
^
|
3->4
^  |
|  v
2<-5
^
|
1
This one is not acyclic because there is a cycle 2, 3, 4, 5, 2.
Because a Git commit can have more than 1 parent due to merge commits when you do:
git merge
It can even have more than 2, there's no limit. Although that is not so common (with good reason, 2 is already one too many): softwareengineering.stackexchange.com/questions/314215/can-a-git-commit-have-more-than-2-parents/377903#377903
There are two ways to organize a project:
  • linear history
  • branched history: history with merge commits
Some people like merges, but they are ugly and stupid. Rebase instead and keep linear history.
Linear history:
5 master
|
4
|
3
|
2
|
1 first commit
Branched history:
7   master
|\
| \
6  \
|\  \
| |  |
3 4  5
| |  |
| /  /
|/  /
2  /
| /
1/  first commit
Here commits 6 and 7 are the so called "merge commits":
  • they have multiple parents:
    • 6 has parents 3 and 4
    • 7 has parents 5 and 6
  • they are useless and don't contain any real information
Which type of tree do you think will be easier to understand and maintain?
????
????????????
You may disconnect now if you still like branched history.
Generate a minimal test repo. You should get in the habit of doing this to test stuff out.
#!/usr/bin/env bash

mkdir git-tips
cd git-tips
git init

for i in 1 2 3 4 5; do
  echo $i > f
  git add f
  git commit -m $i
done

git checkout HEAD~2
git checkout -b my-feature

for i in 6 7; do
  echo $i > f
  git add f
  git commit -m $i
done
For the newbs.
Slick? No. But gitk does the job, like any one of the other 100 billion free Git UI viewers out there
gitk master HEAD
https://raw.githubusercontent.com/cirosantilli/media/master/gitk.png
Many IDEs are also implementing this now (e.g. VS Code, Eclipse. Most free IDE GIt implementations are still crap, but that is the future, because you want to edit, view history, edit, view history, commit, edit.
For the strong.
git log --abbrev-commit --decorate --graph --pretty=oneline master HEAD
Output:
* b4ec057 (master) 5
* 0b37c1b 4
| * fbfbfe8 (HEAD -> my-feature) 7
| * 7b0f59d 6
|/
* 661cfab 3
* 6d748a9 2
* c5f8a2c 1
If we also add the --simplify-by-decoration, which you very often want want on a real repository with many commits:
* b4ec057 (master) 5
| * fbfbfe8 (HEAD -> my-feature) 7
|/
* c5f8a2c 1
As we can see, this removes any commit that is neither:
  • under a branch or tag
  • at the intersection of too branches or tags
Option 1) git commit. Doh!!!
Option 2) git rebase. Basically allows you to do arbitrary modifications to the tree. The most important ones are:
Before:
5 master
|
4 7 my-feature HEAD
| |
3 6
|/
2
|
1
Action:
git rebase
After:
7 my-feature HEAD
|
6
|
5 master
|
4
|
3
|
2
|
1
Ready to push with linear history!
Before:
7 my-feature HEAD
|
6
|
5 master
|
4
|
3
|
2
|
1
Oh, commit 6 was crap:
git rebase -i HEAD~2
Mark 6 to be modified.
After:
7 my-feature HEAD
|
6v2
|
5 master
|
4
|
3
|
2
|
1
Better now, ready to push.
Note: history changes change all commits SHAs. All parents are considereEven time is considered. So is commit message/author. And obviously file contents. So now commit "7" will actually have a different SHA.
Before
7 my-feature HEAD
|
6
|
5 master
|
4
|
3
|
2
|
1
Oh, commit 6 was just a temporary step, should be put together with commit 7:
git rebase -i HEAD~2
Mark 6 to be squashed.
After:
67 my-feature HEAD
|
5 master
|
4
|
3
|
2
|
1
Better now, ready to push.
Oh but there are usually 2 trees: local and remote.
So you also have to learn how to observe and modify and sync with the remote tree!
But basically:
git fetch
to update the remote tree. And then you can use it exactly like any other branch, except you prefix them with the remote (usually origin/*), e.g.:
  • origin/master is the latest fetch of the remote version of master
  • origin/my-feature is the latest fetch of the remote version of my-feature
In order to solve conflicts, you just have to understand what commit you are trying to move where.
E.g. if from:
5 master
|
4 7 my-feature HEAD
| |
3 6
|/
2
|
1
we do:
git rebase master
what happens step by step is first 6 is moved on top of 5:
6on5 HEAD
|
5 master
|
4                 7 my-feature
|                 |
3                 6
|                 |
2-----------------+
|
1
and then 7 is moved on top of the new 6:
7on5 HEAD
|
6on5
|
5 master
|
4                 7 my-feature
|                 |
3                 6
|                 |
2-----------------+
|
1
All good? so OK, let's move the my-feature to the new 7:
7on5 my-feature HEAD
|
6on5
|
5 master
|
4
|
3
|
2
|
1
The key to solve conflicts is:
You have to understand what are the two commits that touched a given line (one from master, one from features), and then combine them somehow.
Or in other words, at every rebase conflict we have something like:
master-commit    feature-commit
|                |
|                |
base-commit------+
|
|
Therefore there are 2 diffs that you have to understand and reconcile:
  • base-commit to master-commit
  • base-commit to feature-commit
diff3 conflict is basically what you always want to see, either by setting it as the default as per stackoverflow.com/questions/27417656/should-diff3-be-default-conflictstyle-on-git:
git config --global merge.conflictstyle diff3
or as a one off:
git checkout --conflict=diff3
With this, conflicts now show up as:
++<<<<<<< HEAD
 +5
++||||||| parent of 7b0f59d (6)
++3
++=======
+ 6
++>>>>>>> 7b0f59d (6)
7b0f59d is the SHA-2 of commit 6.
instead of the inferior default:
++<<<<<<< ours
 +5
++=======
+ 6
++>>>>>>> theirs
We can also observe the current tree state during resolution:
* b4ec057 (HEAD, master) 5
* 0b37c1b 4
| * fbfbfe8 (my-feature) 7
| * 7b0f59d 6
|/
* 661cfab 3
* 6d748a9 2
* c5f8a2c 1
so we understand that we are now at 5 and that we are trying to apply our commit 6
So it is much clearer what is happening:
  • master changed the code from 3 to 5
  • our feature changed the code from 3 to 6
and so now we have to decide what the new code is that will put both of these together.
Let's say we decide it is 5 + 6 = 11 and continue rebasing:
git add .
git rebase --continue
We now reach:
++<<<<<<< HEAD
 +11
++||||||| parent of fbfbfe8 (7)
++6
++=======
+ 7
++>>>>>>> fbfbfe8 (7)
and the tree looks like:
* ca7f7ff (HEAD) 6
* b4ec057 (master) 5
* 0b37c1b 4
| * fbfbfe8 (my-feature) 7
| * 7b0f59d 6
|/
* 661cfab 3
* 6d748a9 2
* c5f8a2c 1
So we understand that:
  • after the previous step we added commit 6 on top of 5
  • now we are adding 7 on top of the new 6 (which we decided would contain 11)
and after resolving that one we now reach:
* e1aaf20 (HEAD -> my-feature) 7
* ca7f7ff 6
* b4ec057 (master) 5
* 0b37c1b 4
* 661cfab 3
* 6d748a9 2
* c5f8a2c 1
These are good free newbie GUI options:
sudo apt install meld
git mergetool --tool meld

sudo apt install kdiff3
git mergetool --tool kdiff3
https://raw.githubusercontent.com/cirosantilli/media/master/meld.png
https://raw.githubusercontent.com/cirosantilli/media/master/kdiff3.png
Let's make a more interesting conflict:
git-tips-2.sh
#!/usr/bin/env bash

set -eux

add() (
  rm -f f
  for i in `seq 10`; do
    printf "before $i\n\n" >> f
  done
  printf "conflict 1 $1\n\n" >> f
  for i in `seq 10`; do
    printf "middle $i\n\n" >> f
  done
  printf "conflict 2 $2\n\n" >> f
  for i in `seq 10`; do
    printf "after $i\n\n" >> f
  done
  git add f
)

rm -rf git-tips-2
mkdir git-tips-2
cd git-tips-2
git init

for i in 1 2 3; do
  add $i $i
  git commit -m $i
done

add 3 4
git commit -m 4

add 5 4
git commit -m 5

git checkout HEAD~2
git checkout -b my-feature

add 3 6
git commit -m 6

add 7 6
git commit -m 7
git rebase does not tell you that, and that sucks.
We only know which commit from the feature branch caused the problem.
Generally we can guess or it is not needed, but imerge does look promising: stackoverflow.com/questions/18162930/how-can-i-find-out-which-git-commits-cause-conflicts

Articles by others on the same topic (0)

There are currently no matching articles.