Chapter 6 GitHub/git Resources

6.1 Overview

This section describes workflows for working with GitHub/git and advice on how to collaborate in teams on large coding projects. GitHub is super useful and powerful, but people also find it quite annoying and difficult to understand. We suggest taking it one step at a time, beginning with the basic workflows outlined below. You can derive great benefits from it without being an expert.

6.2 First things first

  1. Install Git. To do so, follow the instructions in the Install Git chapter of Happy Git with R.

  2. Tell git your name and email address. Introduce yourself to Git in Happy Git explains it all.

  3. Set up a personal access token. See Personal access token for HTTPS. (When you create the token, you will be given a choice of “Beta” or “classic”. Choose classic.)

  4. (Optional) Check that your setup works. See Connect RStudio to Git and GitHub. To be clear, you don’t have to do any connecting – the steps in this chapter check that it’s all working properly.

6.3 Usage

Choose the right section based on what you’re trying to accomplish:

6.4 The no branch (besides main) workflow

To get comfortable with Git, start with this basic workflow in which you will be pulling from and pushing to your repo on GitHub. Just you, no collaboration:

The Connect RStudio to Git and GitHub chapter of Happy Git will get you set up: you will create a repo on GitHub, clone the repo into an RStudio project, and practice making changes. Once you’re set up, your local workflow will be pull, work, commit/push.

PULL Each time you open RStudio and switch to the project, you will pull down any changes made to the repo on GitHub by clicking the Down Arrow in the Git pane of RStudio. You may think that no changes have been made to GitHub and there’s nothing to pull, but you may forget the typos that you fixed online, so it’s good practice to always start by pulling changes just in case.

WORK Do your stuff. Make changes to files. Add new files. Keep an eye on the Git panel in RStudio; it will show you which files were changes.

COMMIT/PUSH When you’re done working, you’ll want to think about what to do with the files that have changed. I like to keep the Git panel clear, so when I’m done I do one of three things with each file: 1) click “Staged” to get it ready for a sendoff to GitHub, 2) delete it if it’s not needed, 3) add it to .gitignore if it’s a file I want to keep locally but not send to GitHub. (Keep in mind that files in .gitignore are not backed up unless you have another backup system.) If I have a file that belongs somewhere else, I will move it there, so the only files left are the ones to send to GitHub.

The next step is to click the Commit button and enter a commit message that meaningful describes what was done. Finally clicking the Up Arrow will send the commit to GitHub.

It’s not considered good practice to commit too often, but as a beginner, it’s useful to do so to learn how it all works.

6.5 Your repo with branching

Once you’ve comfortable with the workflow described above, you’re ready to start branching.

If it’s your repo–or you have write access to someone else’s repo–begin by cloning the repo as in the workflow above. For emphasis: if you have write access to someone else’s repo, you should still clone the repo, not fork it. The repo will be referred to as “origin” even though you’re not the owner; origin simply means the repo from which the project was cloned. Next, continue with step 4 below, or follow the steps in these slides, which provide step-by-step detail on creating a branch, doing work on the branch and then submitting a pull request to merge the changes into origin/main.

At any point, you can check the remotes (in this case, GitHub repositories) that are linked to your project with git remote -v. If your GitHub username is person1 and you have write access to a repo called finalproject created by person2, your remotes will look like this:

$ git remote -v

6.6 1st PR on another repo with branching

Step 1: Fork the upstream repo (once)

Skip this step if you are syncing with a repo that you have write access to, whether it’s your own or someone else’s. Let’s say you want to contribute to! Fork our GitHub repo and then on your own GitHub page, you will see a forked edav2 repo under the repositories section. Note, from now on, the term upstream repository refers to the original repo of the project that you forked and the term origin repository refers to the repo that you created or forked on GitHub. From your respective, both upstream and origin are remote repositories.

A fork of tidyverse/forcats

Step 2: Clone origin and create a local repository (once)

A local repository is the repo that resides on your computer. In order to be able to work locally, we need to create a local copy of the remote reposiotry. (For this to work you must first follow the instructions in First things first section.)

On your GitHub repo page, copy the url of the origin repo by clicking on the green Code button and then the clipboard icon. It should look like this: Then switch to RStudio, and click File -> New Project -> Version Control -> Git. Now you can paste in the url of the origin repo and click Create Project to create a local repository. It is best to choose a location that is outside of other version control systems such as Dropbox to avoid conflicts.

Step 3: Configure remote that points to the upstream repository (once)

Skip this step if you’re syncing with a repo that you have write access to. The purpose of this step is to specify the location of the upstream repository, that is, the original project, not your copy of it.

To complete this step, type in the following at the command line:

$ git remote add upstream <upstream repo url>

Source: Configuring a remote for a fork

Once the upstream remote is added, you will have two remotes: origin and upstream. For example, the remotes for my (jtr13) local forcats repository are:

$ git remote -v
origin (fetch)
origin (push)
upstream (fetch)
upstream (push)

(Although four options are listed, that is, fetching or pushing from either remote, as the diagram above indicates, we will only fetch from upsteam and push to origin.)

Step 4: Branch

With this workflow, all new work is done on a branch, so it’s important to remember to create a new branch before you begin working. Once the work is complete, a pull request is submitted and if all goes well the new code will be merged into the main branch of the project on GitHub.

When you’re ready to start working on something new, create a new branch. Do not reuse a merged branch. Each “fix” should get its own branch and be deleted after it’s been merged.

To create a branch, click on the button shown below:

Give your new branch a meaningful name. For example, if you intend to add a faceting example to the histogram chapter, you might call your branch add_hist_facet. Leave the “Sync branch with remote” box checked. Thereby you will not only create a local branch but also a remote branch on origin, and the local branch will be set up to track the remote branch. In short, they will be linked and git will take note of any changes on the other.

Step 5: Work, commit and push

When you create a branch following the method in Step 4, you will be automatically switched to the new branch. You can switch branches by clicking on the branch dropdown box to the right of the new branch button. However, be careful doing so. Work that isn’t committed, even if it is saved, doesn’t belong to a branch so it will move with you as you change branches. This makes it easy to accidentally be on the wrong branch. Check that you are in the right place and as you work keep an eye on changed files in the Git pane.

Recall that there are three steps to moving saved work from your working directory to GitHub, represented by the git commands: add, commit, and push.

In RStudio, to add, you simply click the checkbox for each file you have modified in the “staged” column on the left of the Git pane. To commit, you just click on the commit button under the Git tab. Entering a commit message is mandatory; choose a meaningful description of the code changes. Finally, to push changes to GitHub, click on the push button, which is represented by an upward pointing arrow. You can combine multiple commits into one “push”.

It is not considered good practice to commit too enough because all the commits are entered into the commit history and it’s hard to find what you need if you commit your work every five minutes. (As you’re starting out, though, I wouldn’t be concerned about this. It’s more important to use the commands frequently to gain experience.)

The Repeated Amend chapter of Happy Git with R describes one approach to dealing with the how-often-should-I-commit dilemma.

Step 6: Submit a pull request

Now you are able to see the branch you have created on the GitHub page. The next step is to submit a pull request and the process is very similar to the process described in the GitHub only walkthrough from the first version of, beginning with step 6.

Step 7: Merge the pull request

If you submitted a PR to another project, you are not the one who will be merging the pull request, so there’s nothing for you to do here.

If it is your project, and it is your job to do so, be aware that there are many methods to merge a request. The most direct simple and direct is to merge the PR on GitHub. This method works well for merging fixed typos and the like. If you want to be able to test code, you may want to check out the PR locally, test it, and perhaps even make edits to it before merging.

Best practices in this area are evolving. My current recommendation is to use the usethis package, which makes complex tasks very simple. “How to edit a pull request locally” explains how to do so.

Another great resource is “Explore and extend a pull request” in Happy Git with R. This chapter describes two official GitHub versions of merging a pull request, as well as a workflow in development using git2r.

6.7 2nd-nth PR on another repo with branching

After the first pull request, the process changes a little. We no longer need to fork and clone the repo. What we do need to do though is make sure that our local copy of the repository is up to date with the GitHub version. There is some other cleanup we need to do, so after the first pull request, we’ll replace steps #1 - #3 above with the following:

Step 1: Sync

How you sync depends on whether you are syncing with your own repo (“origin”) or someone else’s repo (“upstream”). This should be done at the beginning of every work session.

Your repo

Switch to the main branch (important!), then click the pull button (down arrow) in the Git pane in RStudio. Or you can type the following in the Terminal:

$ git checkout main
$ git pull

There are no reminders that you’re behind, so it’s up to you. Make it a habit.

Someone else’s repo

If you’re working on someone else’s repo, make sure you’ve configured an upstream remote.

Then do the following to update your fork:

$ git fetch upstream

$ git checkout main

$ git merge upstream/main

Source: Syncing a fork

Note that these commands bring in changes directly from the upstream repo. If you are working on a branch and want to sync, check out that branch rather than main and fetch / merge. Switching branches can be performed by clicking on the appropriate branch in RStudio instead of git checkout.

Step 2: Delete the old branch

If your previous pull request was merged, it’s good practice to delete the associated branch since the upstream already contains all the changes you have made. To fully delete a branch you will need to 1) delete the local branch, 2) delete the remote branch, 3) stop tracking the branch:

  1. One way to delete the remote branch is to do so on GitHub. Navigate to the closed pull request on the upstream repo. If your branch has been merged, the pull request dialogue will display the following message: “You’re all set—the <branchname> branch can be safely deleted.” Simply click on the Delete branch button next to the message.

If you prefer to work in the terminal, you can delete the remote branch with:

$ git push origin --delete <branchname>

  1. To delete the local branch, switch to the main branch in RStudio and then type the following in the terminal:

$ git branch -d branchname

  1. Take note that git doesn’t stop tracking the remote branch even though it’s gone in both places! To stop tracking deleted branches use the following:

$ git fetch -p

Otherwise you will still see the deleted branches listed in RStudio’s Git pane, and they will still appear when you look at all of your branches with

$ git branch -a (* = checked out branch)

Speaking of which, be aware that the Git pane doesn’t tend to update in real time, so you’ll likely still see branches listed that have been deleted. Be careful not to switch to them, or you will inadvertently recreate them. (Deleted branches have a habit of coming back.) Clicking on main (even though you’re already on main) appears to trigger an update of the dropdown list. If that doesn’t work, switching out of the project and back in will do so, if you want to be sure that the branches you deleted are really gone.

Step 3: Update your fork on GitHub

Skip this step if there’s no upstream repo.

Yes, it’s odd, but once you’ve forked and cloned the project repo, the copy on GitHub becomes fairly irrelevant. However it’s not a bad idea to keep it up to date, if for no other reason than it’s disturbing to see messages like the following in your Git pane:

Thankfully, your fork on GitHub can be updated easily by clicking the green up arrow or entering git push in the Terminal.

Steps 4-7: See above

Now we’re ready to repeat the branch, work, commit, push, submit a pull request workflow. To do so, follow Steps 4-7 above.

6.8 Fixing mistakes

Fixing things generally involves returning to an earlier point in git history. To do so we need a way of referring to the point we want to return to. There are multiple ways of referring to the past. Most of the examples below use HEAD (the last commit) or HEAD~1 (the parent of the last commit, a.k.a. the 2nd to last commit).

If you are contributing to someone else’s repo and things on your end get hopelessly messed up, the easiest way out is to start over. First be sure to keep a local copy of the files you need in a new folder, then delete your fork on GitHub, delete the local clone, fork again on GitHub, clone again to get a fresh local copy, and add the files you need back into the project. This is a variation of the “Burn it all down” described in Happy Git with R.

6.8.1 Forgot to branch

  • if you didn’t commit anything yet:
    Just create the new branch and your work will be moved there, as changes in the working directory do not belong to a branch until they are committed.

  • if you committed but didn’t push to GitHub:
    Undo the last commit with git reset --soft HEAD~1 (i.e. return to the parent commit). --soft means your changes won’t be erased. Then create the new branch. (Your work will move to it, see above.)

  • if you committed and pushed to GitHub:
    Undo the last commit with git reset --soft HEAD~1, push to GitHub (you will need to use git push --force since the push will cause origin to lose commits, then create the new branch. (Your work will move to it, see above.)
    Note: Be very, very careful with --force as it is a dangerous tool. However, since the stakes are lower if you are force pushing to a fork, as would be the case if you are contributing to someone else’s repo, it’s ok to use it in this particular case.

6.8.2 Undoing stuff

  • Undo the last commit: git reset --soft HEAD~1
    (Fun fact: “How do I undo the most recent local commits in Git?” has the second highest number of votes of any question on Stack Overflow, and over 8 million views.)

  • Undo changes since the last commit: git reset --hard HEAD

  • Undo changes in one file since the last commit:
    git checkout -- [filename] SO link

  • Undo deleted branch: Look for the SHA (hash) returned when you deleted the branch. Then:
    git checkout -b <branch-name> <SHA>
    SO link

  • Remove all new, untracked files: git clean -f

  • Remove all new, untracked files, including in new subdirectories: git clean -f -d

  • Revert to a previous commit Using Git — how to go back to a previous commit

6.8.3 Random

  • Make local the same as origin:
    git fetch origin
    git reset --hard origin/main SO link

  • Add back a file that was deleted but still exists on another branch:
    git checkout otherbranch myfile.txt SO link

  • Get rid of a file on GitHub that was added to .gitignore but is still there:
    git rm --cached [filename]
    (then commit and push changes to GitHub) SO link

  • Completely remove a folder from git history (use with caution) SO link

6.9 Troubleshooting

6.9.1 Deleting a branch isn’t working

  • Make sure you’re not on the branch you’re trying to delete.

  • Note that if you try to delete a branch that hasn’t been fully merged, you’ll get a warning, or perhaps an error depending on what’s transpired. It’s possible that it just thinks it hasn’t been merged even though it has, since you’re not up to date. This can be remedied with git pull. In other cases, you’ll need to follow the instructions to use -D instead of -d, for example, if you decide to abandon and delete a branch without submitting a pull request.

  • If you have trouble getting rid of branches, rest assured, that you’re not alone. How do I delete a Git branch locally and remotely? is the third most asked question on StackOverflow!

6.10 Other resources

Getting Help

If you’re lost, these might help.

  • GitHub Guides: This is a phenomenal collection of short articles from GitHub to help you learn about the fundamentals around their product. They are so great, we have already listed their Hello World article. Here are some other important ones:

  • GitHub Help: This is the yellow-pages of GitHub. Ask a question and it will try to push you in the right direction. Get it?

Branching out

GitHub is super social. Learn how to git involved!

  • Open Source Guide: Info on how to contribute to open source projects. Great links to the GitHub skills involved as well as good GitHub etiquette to adopt.

  • Forking Projects: Quick read from GitHub on how to fork a repository so you can contribute to it.

  • Mastering Issues: On what Issues are in GitHub and how they can help get things done.

  • Our Page on Contributing: You can contribute to with your new-found GitHub skills! Checkout our page on how to contribute through pull requests and/or issues.

More Resources

To hit the ground running, checkout GitHub Learning Lab. This application will teach you how to use GitHub with hands-on courses using actual repos. Its the perfect way to understand what using GitHub looks like.

For the nerds in the room

  • Git For Ages 4 And Up: There’s a lot going on under the hood. This talk will help explain how it all works…with kids toys!

  • Make pretty git logs: Always remember (A DOG). Also, this alias command is nice to have around:

    • git config --global alias.adog "log --all --decorate --oneline --graph"
  • add and commit with one command: Another (even more) helpful alias command:

    • git config --global alias.add-commit '!git add -A && git commit'
  • Git Aware Prompt: An excellent add-on to the Terminal that informs you which branch you have checked out. Someone also made an even spiffier version where it will inform you of your git status using helpful emojis.

  • Contributing with git2r, on the Population Genetics in R provides helpful information on using git commands within R through the git2r package. In particular it explains how to create a GITHUB_PAT and then set the credential parameter in certain functions to find the PAT. (Note though that the site was created in 2015 and as of February 2019 has not been updated.)

  • Want a little reading as well?: Resources to learn Git is a simple site split into two main sections: Learn by reading and Learn by doing. Take your pick.