Revision Control Using git

Introduction

The CASA development group uses Bitbucket for revision control. Under the hood, Bitbucket uses git, and it is also integrated with JIRA, Bamboo (for continuous integration), and Confluence (wiki-like documentation). This article explains the basics of revision control using Bitbucket and git. For information on advanced Bitbucket options, look for the question mark icon on any screen in the Bitbucket web interface. Advanced information on git is widely available on the Web. The description below follows the default workflow and uses the simplest form of each command.

Clone the repository: git clone

To do CASA development, you will need to get a copy of the CASA repository on your local machine. In the git revision control model, you will make a 'clone' of the repository on the BitBucket server. The clone is effectively a local 'working copy' of the code with its own git repository and a remote connection called 'origin' pointing back to the original Bitbucket repository. You can edit files and commit changes in your cloned copy, then periodically push committed changes back to origin as well as pull changes from origin to the local repository.

To clone the repository:

  1. Navigate to the BitBucket home page.
  2. Select the CASA project.
  3. Select the CASA (?) repository. 
  4. In the left sidebar, click "..." and then "Clone." 
  5. Select HTTP as the clone mechanism. This will show you a URL that you can use to clone origin from the command line.
  6. On the local machine, change the directory to where you want to place the local git repository.
  7. Use git clone to clone from the URL.

git clone --recursive <URL>

You now have a local copy of the repository and a 'working tree' of code on which you can do your work. You can run simple commands to get information about your local repository:

cd casa
git branch
git status
git log

git branch lists the branches in the local repository, with the active branch indicated by *. After cloning, you will have only the master branch. git status indicates the active branch then lists the files which are staged, unstaged, and untracked (see section on committing changes below). git log lists the commits in the active branch with the commit ID, author, date, and message.

NOTE: Log output is paged automatically and does not need to be piped to a command such as 'more'.
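
The clone step and the inspection commands can be tried safely before touching the real repository. The following is a minimal sketch, assuming a throwaway repository in a temporary directory stands in for the Bitbucket server; the sandbox paths, the README file, and the example identity are illustrative and not part of the CASA setup.

```shell
# Build a throwaway repository to stand in for the Bitbucket server (not the real CASA URL).
sandbox=$(mktemp -d)
git init -q "$sandbox/seed"
cd "$sandbox/seed"
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example Developer"
git config user.email "dev@example.com"
echo "hello" > README
git add README
git commit -q -m "Initial commit"
git clone -q --bare "$sandbox/seed" "$sandbox/origin.git"

# Clone it the same way you would clone from the URL shown by Bitbucket.
cd "$sandbox"
git clone -q --recursive "$sandbox/origin.git" casa
cd casa
git branch            # "* master" is the only local branch
git status            # on master, nothing to commit
git log --oneline     # one commit
git remote -v         # origin points back to the repository you cloned from
```

With the real repository, only the argument to git clone changes: use the URL displayed in the Bitbucket clone dialog.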

 

Create a new branch: git checkout

ALERT: Never develop on the master branch!

Although it is possible to create a branch in your local git repository and push it to origin, doing so will break the JIRA/Bamboo interactions. Create the branch from JIRA!

Work on an issue begins with the JIRA ticket. Under normal conditions, you should only be doing work on one ticket at a time, and the work on each ticket should be done on a separate branch. To create the branch for your work:

  1. Start from the appropriate JIRA ticket.
  2. Click "Create branch." This will take you to the Bitbucket Create Branch form.
  3. Use the default repository casa/casa, select branch type Bugfix (default) or change to Feature (one of these two types is required for the CASA process to work properly) and branch from the default master. Set branch name to the JIRA ticket number (e.g. "CAS-1234").

    IMPORTANT: The default name includes the ticket's title but use the ticket number only; delete everything after the ticket number so that the CASA build and test process works properly.

  4. Click "Create Branch."

Now you need to add the new branch to your local repository so that you can begin using it. In your local repository, use the git checkout command to switch to a different branch. If the branch is not in the local repository, git will automatically look for it in origin, add it to your local repository, and track it with the same branch in origin (for push and pull).

git fetch origin
git remote show origin
git checkout bugfix/CAS-1234
git branch

First use git fetch origin and git remote show origin to list the remote branches. You should see your new branch, with a prefix determined by the branch type selected when creating the branch in JIRA (bugfix/ or feature/). Then use git checkout to switch to the new branch. You can use git branch to verify that you now have the requested branch and that it is the active one.  Now you are ready to develop on your branch.
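
To see how git checkout picks up a branch that exists only in origin, here is a minimal sketch in a sandbox. It assumes a local bare repository plays the role of Bitbucket, and the branch is created directly in that bare repository to imitate the "Create branch" button; none of this touches JIRA or Bamboo.

```shell
# Throwaway origin and clone (stand-ins for Bitbucket and your working copy).
sandbox=$(mktemp -d)
git init -q "$sandbox/seed"
cd "$sandbox/seed"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Developer"
git config user.email "dev@example.com"
echo "hello" > README
git add README
git commit -q -m "Initial commit"
git clone -q --bare "$sandbox/seed" "$sandbox/origin.git"
git clone -q "$sandbox/origin.git" "$sandbox/casa"

# Imitate "Create branch" on the server: the branch appears in origin only.
git --git-dir="$sandbox/origin.git" branch bugfix/CAS-1234 master

cd "$sandbox/casa"
git fetch -q origin                  # learn about the new remote branch
git branch -r                        # now lists origin/bugfix/CAS-1234
git checkout -q bugfix/CAS-1234      # creates a local branch that tracks origin's branch
upstream=$(git rev-parse --abbrev-ref '@{upstream}')
echo "active: $(git rev-parse --abbrev-ref HEAD), tracking: $upstream"
```

The tracking relationship printed at the end is what lets later push and pull commands find the matching branch in origin.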

 

Commit code changes: git status, git add, git commit

ALERT: Remember that git status and git commit act on your local repository only!

After you have edited existing files or added new ones, you may want to commit the changes to your repository. In some other revision control systems, the commit command adds all of the modified files in your working directory to the repository. In git, you choose which files are committed together. For example, if you have three modified files that implement a bug fix and a new feature, you can commit the two files for the bug fix separately from the third file for the feature. A commit is therefore a changeset: all of the files that implement a specific change. If the commit is unsuccessful because there is a conflict between one of your modified files and the repo version, the whole changeset fails rather than committing some files but not others.

First, you need to know which files have been changed or added. Use git status to list the files that differ from your local repository. These files can have three states: staged, unstaged, and untracked. In addition, git status lets you know if the files are modified or new.

staged: "Changes to be committed"

These files are tracked by git and have been added to the staging area with git add. Using git commit will add these changes to the repository. If you change your mind and do not want to commit a file yet, the git status output includes the git command to unstage a file: git reset HEAD <file>.

unstaged: "Changes not staged for commit"

These files are tracked by git and are modified but will not be committed with the next git commit. Perhaps this is what you want, as the files are not ready for commit or you do not want them in the next changeset. Again, the git status output includes the git commands to change the file's status: use git add <file> to add the files to the staging area, or use git checkout -- <file> to discard the changes and go back to the repository version (revert).

NOTE: The double-minus after git checkout indicates that the argument is a file, not a branch.

untracked: "Untracked files"

These are files that git does not know about. They could be new code files, artifacts of building or testing your code, swap files if you have files open for editing, etc. If you want to add the file to the git repository, simply use git add <file>.

Nothing to commit

As you would expect, the files which match the repository version are not listed in git status. If all files in the working directory match the repository and there are no untracked files, git status will return the message "nothing to commit, working directory clean".

Sample commit session:

  1. Edit file1, file2, and file3, and add file4. You want to commit file1 and file2 together as one changeset, then commit file3 and file4 as a separate changeset.
  2. git status  # indicates that file1, file2, and file3 are unstaged ("Changes not staged for commit"), and file4 is untracked ("Untracked files").
  3. git add file1 file2   # stage files for first changeset
  4. git commit  # puts you into an editor for the commit message, with a list of files to be committed
  5. git status  # indicates that file3 is unstaged, file4 is untracked
  6. git add file4  # add file4 to the files tracked by git; file4 is now staged, file3 is unstaged
  7. git commit -a  # commits all modified files (staged and unstaged), in this case file3 and file4. Alternatively, you could use git add file3; git commit.

Remember that unlike a centralized revision control system such as svn, git commit saves the changes to your local repository only. The origin is unchanged until you use git push.  In addition, the changes are committed on this branch only; if you switch branches, git log will not show this commit.
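
The sample session can be replayed end to end in a sandbox. This is a minimal sketch, assuming a fresh throwaway repository and placeholder file contents; the commit messages are invented, and -m is used in place of the editor so that the commands run unattended.

```shell
sandbox=$(mktemp -d)
git init -q "$sandbox/work"
cd "$sandbox/work"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Developer"
git config user.email "dev@example.com"
for f in file1 file2 file3; do echo "original" > "$f"; done
git add file1 file2 file3
git commit -q -m "Initial commit"

# Step 1: edit three tracked files and create a new one.
for f in file1 file2 file3; do echo "edited" >> "$f"; done
echo "new" > file4
git status --short      # " M" for file1, file2, file3 and "??" for file4

# Steps 3-4: first changeset.
git add file1 file2
git commit -q -m "Fix bug in file1 and file2"

# Steps 6-7: second changeset (-a picks up the modified, tracked file3).
git add file4
git commit -q -a -m "Add feature in file3 and file4"

# List the files recorded in each of the two commits.
first=$(git diff-tree --no-commit-id --name-only -r HEAD~1)
second=$(git diff-tree --no-commit-id --name-only -r HEAD)
echo "first changeset:" $first
echo "second changeset:" $second
```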

Remove a file from git

ALERT: git rm removes the file from your repository AND your working tree!

Perhaps you are pruning deprecated code from your code tree, or you accidentally added a new file to git. git lets you remove a file with git rm <file>.

  1. You want to remove a tracked file. Simply use git rm <file> to let git know that you want to delete this file from the repository; the file will be staged in git status as "Changes to be committed" with the label "deleted". Use git commit to complete the removal.
  2. You added a new (untracked) file with git add (now it is staged with the label "new file"), but you do not want it. Because the file has never been committed, git refuses a plain git rm <file>; use git rm -f <file> to unstage it and delete it from the working tree, or git rm --cached <file> to unstage it but keep the file on disk. You do not need to commit this time.

Remember, git rm doesn't just untrack the file, it removes the file from your directory! However, like all commits, the removal of a tracked file is on the active branch only; if you switch branches, the file may be restored in the new active branch.
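
Both removal cases can be checked in a sandbox. The following is a minimal sketch, assuming a throwaway repository and invented file names. For a file that has been staged but never committed, git asks for either --cached (unstage, keep the file) or -f (unstage and delete), so the sketch shows both.

```shell
sandbox=$(mktemp -d)
git init -q "$sandbox/work"
cd "$sandbox/work"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Developer"
git config user.email "dev@example.com"
echo "old code" > deprecated.cc
git add deprecated.cc
git commit -q -m "Initial commit"

# Case 1: remove a tracked file; the deletion is staged and needs a commit.
git rm -q deprecated.cc
git status --short               # "D  deprecated.cc"
git commit -q -m "Remove deprecated code"

# Case 2: a file that was added by mistake and never committed.
echo "scratch" > oops.txt
git add oops.txt
git rm -q --cached oops.txt      # unstage only; oops.txt stays on disk, untracked
git status --short               # "?? oops.txt"
git add oops.txt
git rm -q -f oops.txt            # unstage AND delete the file from the working tree
git status --short               # prints nothing
```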

Compare your repository to origin

Remember, git status reflects the state of your working tree with respect to your local repository.  Let's say your working directory has "nothing to commit", so all of your code changes have been committed.  But what is the status of your repository compared with origin?  Remember that for tracked branches git status will tell you if you are ahead or behind the remote branch, for example:

$ git status
On branch master
Your branch is behind 'origin/master' by 3 commits, and can be fast-forwarded.
  (use "git pull" to update your local branch)
nothing to commit, working directory clean

In this example, there are 3 commits in origin/master that are not in your local master branch.  You are helpfully told to use git pull to update your branch, but you may want to see what you would get before you pull, and possibly postpone the pull until later.

1. Fetch the remote (origin) to update your references.  You may want to run git status again to see if the number changes.

$ git fetch origin

$ git status

2. Use double-dot notation to see which commits are in your local master but not in origin/master (what you would push):

$ git log origin/master..master

3. Use double-dot notation to see which commits are in origin/master but not in your local master (what you would pull):

$ git log master..origin/master

4. If you are on the branch you want to compare, you can leave that part out:

$ git log origin/master..

$ git log ..origin/master

5. To do it all at once, use --left-right with triple-dot notation.  The commits with '<' refer to the branch listed first, to the left of the triple-dots, and '>' refers to the branch listed second, on the right.  The following example shows that F and E are only on origin/master, and D and C are only on the local master.  These letters represent git log entries with commit ID, author, date, and message.

$ git log --left-right origin/master...master

< F

< E

> D

> C

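The double-dot and triple-dot comparisons can be reproduced in a sandbox where the local master and origin/master have diverged. This is a minimal sketch, assuming a local bare repository stands in for origin and empty commits named B to F stand in for real changes; git rev-list --count is used here only to turn the same ranges into numbers.

```shell
sandbox=$(mktemp -d)
git init -q "$sandbox/seed"
cd "$sandbox/seed"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Developer"
git config user.email "dev@example.com"
git commit -q --allow-empty -m "B"
git clone -q --bare "$sandbox/seed" "$sandbox/origin.git"
git clone -q "$sandbox/origin.git" "$sandbox/casa"

# Another developer pushes E and F to master in origin.
git commit -q --allow-empty -m "E"
git commit -q --allow-empty -m "F"
git push -q "$sandbox/origin.git" master

# Meanwhile you commit C and D in your local master.
cd "$sandbox/casa"
git config user.name "Example Developer"
git config user.email "dev@example.com"
git commit -q --allow-empty -m "C"
git commit -q --allow-empty -m "D"

git fetch -q origin                                     # update origin/master
to_push=$(git rev-list --count origin/master..master)   # commits only in local master
to_pull=$(git rev-list --count master..origin/master)   # commits only in origin/master
echo "would push: $to_push, would pull: $to_pull"
git log --left-right --format='%m %s' origin/master...master   # "<" marks E and F, ">" marks C and D
```
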
Push changes to origin

You have changed your code and committed changes, but these changes are in your local repository only. When you are ready to share your code changes with other developers or you want to try integrating your changes with the rest of the system, use git push to update the branch in origin.

Pushing changes to origin will merge master into the branch on the origin (this will not change the branch in your local repository), then trigger a build and level 1 test of your branch by Bamboo. When Bamboo finishes, it will send an email to the submitter with the results of the build and test. (Only after confirming that level 1 tests have passed should your JIRA ticket state be changed to "Ready to Verify" or "Ready to Validate.")

  1. Make sure you are on the branch you want to push, git branch.
  2. If not, check out the desired branch, git checkout <branch>.
  3. Push committed changes to origin, git push origin <branch>.
  4. TBD: log into Bitbucket server?
  5. Review email from Bamboo on branch build and test
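
The git side of a push can be rehearsed in a sandbox. The following is a minimal sketch, assuming a local bare repository stands in for origin; it does not reproduce the master merge, the Bamboo build, or the email that the real Bitbucket server triggers.

```shell
sandbox=$(mktemp -d)
git init -q "$sandbox/seed"
cd "$sandbox/seed"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Developer"
git config user.email "dev@example.com"
git commit -q --allow-empty -m "Initial commit"
git branch bugfix/CAS-1234                  # branch already exists on the "server"
git clone -q --bare "$sandbox/seed" "$sandbox/origin.git"
git clone -q "$sandbox/origin.git" "$sandbox/casa"

cd "$sandbox/casa"
git config user.name "Example Developer"
git config user.email "dev@example.com"
git checkout -q bugfix/CAS-1234
echo "fix" > file1
git add file1
git commit -q -m "CAS-1234: fix file1"

branch=$(git rev-parse --abbrev-ref HEAD)   # step 1: confirm the active branch
git push -q origin "$branch"                # step 3: push only that branch
local_tip=$(git rev-parse HEAD)
origin_tip=$(git --git-dir="$sandbox/origin.git" rev-parse "$branch")
echo "local: $local_tip"
echo "origin: $origin_tip"                  # same commit ID after the push
```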

 

Pull changes from origin (update your local repository)

If you think a branch has been updated in origin, either by another developer or by a merge of the master branch (which happens automatically when the branch is pushed), you can add these changes to your local repository with git pull origin <branch>.

  1. Make sure you are on the branch you want to pull, git branch.
  2. If not, check out the desired branch, git checkout <branch>.
  3. Pull changes from origin, git pull origin <branch>.
  4. This updates the commits in the log, git log.
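
A pull can be rehearsed the same way. This is a minimal sketch, assuming a local bare repository stands in for origin and a second working copy plays the other developer. The --ff-only option is an addition for this sketch: it makes git refuse anything other than a simple fast-forward.

```shell
sandbox=$(mktemp -d)
git init -q "$sandbox/seed"
cd "$sandbox/seed"
git symbolic-ref HEAD refs/heads/master
git config user.name "Other Developer"
git config user.email "other@example.com"
echo "v1" > file1
git add file1
git commit -q -m "Initial commit"
git clone -q --bare "$sandbox/seed" "$sandbox/origin.git"
git clone -q "$sandbox/origin.git" "$sandbox/casa"

# The other developer pushes a new commit to master in origin.
echo "update" > file2
git add file2
git commit -q -m "Add file2"
git push -q "$sandbox/origin.git" master

cd "$sandbox/casa"
before=$(git rev-list --count HEAD)
git pull -q --ff-only origin master     # step 3: fetch and merge origin's master
after=$(git rev-list --count HEAD)
echo "commits before: $before, after: $after"
git log --oneline                       # step 4: the new commit now appears in the log
```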

 

Make a pull request (merge changes to the trunk)

When your JIRA ticket is Resolved, you can merge your branch into the master branch on the Bitbucket server by creating a pull request. If some time has passed since you created the branch or merged master into it, you should update the branch before the pull request:  git checkout master; git pull origin master; git checkout dev_branch; git merge master; git push origin dev_branch. After the branch package and tests are successful, you must initiate a pull request to inform the reviewers that your branch is ready to be merged into the master branch.

In the JIRA ticket, there is a 'Development' section on the right, which lists branches, commits, and builds. Click on the "branch" link, which will open a list of the branches created from the ticket (most likely only one). For the branch you wish to merge, click "Create pull request" in the 'Action' column.

Complete the Bitbucket "Create pull request" form, which already has the branch name as the Title and commit messages as the Description.  If the ticket requires release notes for this change, add a "Release Notes:" section at the end of the Description.

Review the Diff and Commits tabs at the bottom to ensure that only your changes are listed.  If other files are included, you may be reverting others' code changes from your outdated branch.  It is easier to fix this now than after the branch is merged into master!

Click "Create".  You may also "Cancel" if you need to fix something after your review.  Once the pull request is created, Bamboo will launch Branch Test Suite 3 which will run the Critical tests on the EL6 tarball, making it more likely that the pull request will not break the master branch.  The pull request reviewers will generally wait until this test completes before approving and merging your pull request.  Please be patient, as these tests can take ~10 hours to run.

If a pull request is approved and merged but the master Test Suite 4 fails, your pull request may be reverted in a new pull request, in order to restore master to a good state.  Your pull request is reverted as a whole, not just the part that caused a test to fail.  To reapply these changes:

  • Make a new ticket and branch for the fix
  • Find the commit ID of the reversion pull request using "git log".
  • Run "git revert <commit ID>" to reapply the changes in your first pull request.
  • Add your fix, commit, etc., and create a new pull request when the builds and tests succeed.
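
Reverting a reversion can look odd at first, so here is a minimal sketch in a sandbox. It assumes a single file, ordinary commits standing in for the merged and reverted pull requests, and a hypothetical new ticket CAS-1240; the reversion's commit ID is captured with git rev-parse instead of being read from git log.

```shell
sandbox=$(mktemp -d)
git init -q "$sandbox/work"
cd "$sandbox/work"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Developer"
git config user.email "dev@example.com"
echo "v1" > file1
git add file1
git commit -q -m "Initial commit"
echo "v2" > file1
git commit -q -a -m "CAS-1234: change merged from the first pull request"

git revert --no-edit HEAD               # stand-in for the reversion pull request
reversion=$(git rev-parse HEAD)         # commit ID of the reversion
cat file1                               # back to v1

git checkout -q -b bugfix/CAS-1240      # hypothetical new ticket and branch for the fix
git revert --no-edit "$reversion"       # reverting the reversion reapplies the change
cat file1                               # v2 again; now add your fix and commit
```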

Switch branches

The normal workflow is to work on one ticket at a time until completion, but it could happen that you need to switch to another task before it is done. Examples include: (1) a more urgent bugfix comes up that needs your immediate attention; (2) input is required before further progress can be made, so you want to begin work on another issue; or (3) you need to update from master to get the merged changes from another branch before continuing (see also section on master merge conflicts). In these cases, you will want to switch to another branch.

Checkout with clean working directory

To make a different branch active, simply use git checkout, as you did with the new branch above:

git status  # working directory clean
git checkout feature/CAS-1245  # or whatever branch you need, as shown above

If the branch is already in your local repository, git will make it the active branch. If not, git will find the branch in the origin, add it to the local repo, and switch to it. The switch happens instantly if your working directory is clean ("nothing to commit", as explained in the section on commit). Some source files will probably change with this branch change, so you may want to recompile your code to make a new build.

Checkout with dirty working directory

If, however, you do have modified files in your branch (the working directory is 'dirty'), git will return an error such as:

error: Your local changes to the following files would be overwritten by checkout:
      code/file1
Please, commit your changes or stash them before you can switch branches.
Aborting

Along with the error, git gives you the helpful advice to commit your changes or stash them.

Sample checkout session with git stash

This shows the simplest case, where you store one stash on the stack and the branch did not change between stashing the changes and applying them to the same branch.

git branch  # indicates you are on bugfix/CAS-1234
git status  # indicates you have modified files
git stash  # stash this branch's changes and revert to clean working directory
git checkout feature/CAS-1245  # switch branches and develop there
git commit -a  # save CAS-1245 changes
git status  # nothing to commit
git checkout bugfix/CAS-1234  # make CAS-1234 the active branch
git stash pop  # retrieve modified files and pick up where you left off!

 

Stash changes

What if you need to save changes but they are not ready to commit? This could happen if you want to switch branches but you have modified files, or if you want to try an alternate approach but be able to retrieve the current implementation later. You can use git stash.

git stash stores a record of the current state of your working directory on a stack, then reverts the working directory to a clean state (the last commit). To see the stashes you currently have, use git stash list, which shows the stash name (stash@{0}, stash@{1}, etc., with 0 being the top), the branch you were on when you stashed, and the last commit the stash is based on (i.e. what your working directory was reverted to).

To retrieve your changes, use git stash pop  (apply the changes and remove the stash from the stack) or git stash apply (apply the changes and leave the stash on the stack). Notice that there is one stash stack for all of the branches in the repository and you apply the changes to the current active branch. Therefore you can pop the stash to the same or a different branch than it came from; this may or may not be what you intended so be careful. Popping the stash could result in conflicts when the changes are applied. 

Sample git stash session

git checkout feature/CAS-1234
vi file1.cc
git stash  # file1.cc changes go on stack, file1.cc is reverted
git checkout bugfix/CAS-1235 # until done with CAS-1235, then:
git checkout feature/CAS-1234 # with repo version of file1.cc
git stash pop  # get modified file1.cc back, continue work
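
The difference between git stash apply and git stash pop is easiest to see by counting the entries on the stack. This is a minimal sketch, assuming a throwaway repository with a single invented file.

```shell
sandbox=$(mktemp -d)
git init -q "$sandbox/work"
cd "$sandbox/work"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Developer"
git config user.email "dev@example.com"
echo "v1" > file1.cc
git add file1.cc
git commit -q -m "Initial commit"

echo "work in progress" >> file1.cc
git stash                                     # changes go on the stack; file1.cc is reverted
clean=$(cat file1.cc)                         # committed content only
git stash list                                # one entry, stash@{0}

git stash apply                               # re-apply the changes, keep the stash
kept=$(git stash list | wc -l | tr -d ' ')    # still 1 entry on the stack
git checkout -- file1.cc                      # discard the changes again
git stash pop                                 # re-apply the changes and drop the stash
left=$(git stash list | wc -l | tr -d ' ')    # stack is now empty
echo "after apply: $kept, after pop: $left"
```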

 

Update branch with master and submodule changes

In the CASA workflow, a push to origin triggers a merge of the master branch into the branch on origin, which may result in build errors or conflicts (see Example 3 in the Branch workflow examples), which you should resolve in your local branch. Or you may be aware of code changes in master from pull requests or submodule updates that you would like to add to your branch. This involves merging an updated local master into the local branch.

ALERT: Remember that the active branch is the one being changed!

Sample session to merge master and resolve conflicts

Start with a clean working directory in the branch you are working in; if it is not, commit or stash your changes.

git checkout master
git pull origin master

At this point, running git status may indicate that casacore is "Changed but not updated" (locally modified and not staged for commit). To resolve this, run

git submodule update

to get your master branch on track. Then continue in your development branch:

git checkout bugfix/CAS-1234
git merge master  # this merges master into CAS-1234
git add casacore
git add asap
git commit --amend
    
# To resolve conflicts
vi file1  # edit files with conflicts
git add file1
git commit
git push origin bugfix/CAS-1234  # push fix back to origin; triggers merge and tests

NOTE: You may follow this procedure at any time in your local repository (with the optional final push), in order to work with updated code while developing your branch or to handle potential merge conflicts before the merge in origin after a push.
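
The conflict-resolution part of this procedure can be practiced in a sandbox. The following is a minimal sketch, assuming a throwaway repository with no submodules, one file edited on both branches, and a printf command standing in for hand-editing the conflict markers.

```shell
sandbox=$(mktemp -d)
git init -q "$sandbox/work"
cd "$sandbox/work"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Developer"
git config user.email "dev@example.com"
echo "original" > file1
git add file1
git commit -q -m "Initial commit"
git branch bugfix/CAS-1234
echo "master change" > file1
git commit -q -a -m "Change on master"
git checkout -q bugfix/CAS-1234
echo "branch change" > file1
git commit -q -a -m "CAS-1234 change"

# The active branch (bugfix/CAS-1234) is the one that gets changed by the merge.
if ! git merge master; then
    git status --short                                   # "UU file1" marks the conflict
    printf 'master change\nbranch change\n' > file1      # stand-in for editing the file
    git add file1                                        # mark the conflict as resolved
    git commit -q -m "Merge master into bugfix/CAS-1234"
fi
git log --oneline --graph                                # shows the merge commit on top
```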

For updates to the casacore and asap submodules, see the section below on features that require both casacore and casa changes.

IMPORTANT: If you want to modify a stale branch (e.g. you must re-Schedule the branch for more work after Validation testing and some time has passed), you must update the submodules in your local repository.  A push to origin with the master merge there will cause build issues with the submodule pointers.

 

Delete the branch from your repository (optional)

This step is not required by the workflow, but is something you will probably want to do once your work on the branch is complete (the pull request has been done and the JIRA ticket is Closed). Otherwise, the list returned by git branch will get mighty long. Deleting a branch is easy, and should you find you need the branch again, you can always get it from origin with git checkout.

  1. Make sure you are not on the branch you want to delete, e.g. git checkout master.
  2. Delete the branch, git branch -d <branch>. (If git complains that the branch was not fully merged, you can use -D to force the delete.)
  3. Use git branch to verify that the branch is no longer listed.
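
The difference between -d and -D can be seen in a sandbox. This is a minimal sketch, assuming a throwaway repository with no origin, so the branch deleted here cannot be recovered with git checkout as it could in the real workflow.

```shell
sandbox=$(mktemp -d)
git init -q "$sandbox/work"
cd "$sandbox/work"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Developer"
git config user.email "dev@example.com"
git commit -q --allow-empty -m "Initial commit"
git checkout -q -b bugfix/CAS-1234
git commit -q --allow-empty -m "Work that was never merged into master"

git checkout -q master                        # step 1: leave the branch first
if git branch -d bugfix/CAS-1234 2>/dev/null; then
    safe_delete=ok
else
    safe_delete=refused                       # -d refuses: branch is not fully merged
    git branch -D bugfix/CAS-1234             # -D forces the delete
fi
echo "git branch -d: $safe_delete"
git branch                                    # step 3: only master is listed
```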

 

When a feature requires both casacore and casa changes

 

1. Create a Casacore fork in GitHub

2. Create a Casa branch in BitBucket

3. Clone the repository and checkout your branch

git clone --recursive https://open-bitbucket.nrao.edu/scm/casa/casa.git

cd casa

git checkout feature/CAS-1234

4. Create a casacore branch

cd casacore

git checkout master

git pull

git branch mycasacorefeature

git checkout mycasacorefeature

5. Make your changes in casacore

6. Make your changes in the rest of the branch

7. Test locally

8. Push the Casacore changes to your fork in GitHub

cd casa/casacore

git remote add mycasacore https://github.com/vsuorant/casacore

git push mycasacore mycasacorefeature

9. Create a pull request in GitHub

10. Wait for the pull request to be applied (you must wait since the master submodule doesn't know about your fork, so you can't point the submodule there)

11. Update the submodule reference in your branch

cd casa

git checkout feature/CAS-1234

cd casacore

git checkout master

git pull

cd ..

git add casacore

git commit --amend  # amends your latest commit; leave out --amend if you would rather have a separate commit

12. Push your changes to BitBucket

git push origin feature/CAS-1234

Creating a patch for both release and master branches

Option 1: Branch both master and release

1. Create a branch from release/ with your Jira ticket number.

2. Make your changes and push the branch to Bitbucket for testing.

3. Create another Jira ticket to backport the changes to master, branch from master using the new Jira ticket number, copy your changes there, and push to Bitbucket for testing.

4. Create pull requests from both branches.

Option 2: Create single branch that is mergeable to both master and release

When creating a patch that can be applied in both the master and prerelease (or any other branch), it is useful to find the last common ancestor of the branches. Using the common ancestor will prevent unwanted changes from getting applied from one branch to another. Use the following steps to create a branch that can be applied in both branches.

1. Find the last common ancestor and create a branch based on it.

git checkout -b bugfix/myjiraticket `git merge-base origin/release/5.0.0 origin/master` 

2. Commit your changes to your bugfix branch and push your branch to Bitbucket.

3. Wait for all of the build/test tasks to complete.

4. Create a pull request to both release/5.0.0 and master.
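
To see why the common ancestor is a safe starting point, here is a minimal sketch in a sandbox. It assumes local branches named master and release/5.0.0 stand in for origin/master and origin/release/5.0.0, empty commits stand in for real changes, and bugfix/CAS-1234 is an example branch name.

```shell
sandbox=$(mktemp -d)
git init -q "$sandbox/work"
cd "$sandbox/work"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Developer"
git config user.email "dev@example.com"
git commit -q --allow-empty -m "Common ancestor"
ancestor=$(git rev-parse HEAD)
git branch release/5.0.0                         # the release branch forks here
git commit -q --allow-empty -m "Master-only change"
git checkout -q release/5.0.0
git commit -q --allow-empty -m "Release-only change"

base=$(git merge-base release/5.0.0 master)      # last commit shared by both branches
git checkout -q -b bugfix/CAS-1234 "$base"       # new branch starts at that commit
git log --oneline                                # only "Common ancestor" is listed

# The new branch carries neither the master-only nor the release-only commit.
not_from_master=$(git rev-list --count bugfix/CAS-1234..master)
not_from_release=$(git rev-list --count bugfix/CAS-1234..release/5.0.0)
echo "left behind: $not_from_master on master, $not_from_release on release/5.0.0"
```

Because the branch contains nothing that is specific to either side, merging it into master or into release/5.0.0 brings in only the commits you add to it.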