
Collaborative Paper Writing using Git and Dropbox / Google Drive

Distributed version control software (git, mercurial, …) provides a safe and convenient means for people to collaborate on projects, including the writing of academic papers using LaTeX.  Although there are various ways of setting up an online repository, perhaps the fastest way to start is by using Dropbox or Google Drive (or equivalent) to share a repository for a specific project.

Warning: If two people ‘push’ to the repository at the same time then corruption may occur. Therefore, pushes should be infrequent and/or coordinated via email.  (Although ending up in this situation is best avoided, it is possible to recover by recreating the shared repository from the local repositories.)

Note: ‘Long options’ on the command line begin with two hyphens next to each other (--). If a command on this page appears to contain a single long dash instead, type it as two hyphens.

The following instructions are valid for Linux or Mac, using a recent version of git (tested with git version 1.7.11.1).

Installing and Configuring Git

Installing git is straightforward; just go to the download page.  To configure git, at a minimum, run:

git config --global user.name "Jonathan Manton"
git config --global user.email "my.email.address@gmail.com"

If you want to learn more about git, view the free online book.

Optional

To change the default ‘merge’ program to vimdiff, run:

git config --global merge.tool vimdiff

To automatically ignore certain files globally, create the file (any filename will do) ~/.gitignore_global and place in it something like:

*.o
*.so
*.bak

Next, run:

git config --global core.excludesfile ~/.gitignore_global

Note that it is “excludesfile” and not “excludefiles”! You will not receive an error if you type the wrong name because all git config --global does is insert entries into the file ~/.gitconfig (this file can be edited by hand if you prefer).
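Since a misspelt key is accepted silently, it can be worth reading the setting back and checking that it has an effect. The following is only a sketch: it points HOME at a throwaway directory (an assumption made so that the real ~/.gitconfig is not touched), and the file names scratch.o and notes.tex are made up for the test.

```shell
#!/bin/sh
# Sketch: confirm that the global ignore file is picked up.
# A throwaway HOME is used so the real ~/.gitconfig is left alone.
set -eu
sandbox=$(mktemp -d)
export HOME="$sandbox"
unset XDG_CONFIG_HOME

# Same steps as above, but inside the sandbox.
printf '%s\n' '*.o' '*.so' '*.bak' > "$HOME/.gitignore_global"
git config --global core.excludesfile "$HOME/.gitignore_global"

# Reading the key back guards against the "excludefiles" misspelling:
# a wrongly named key would leave this lookup empty.
configured=$(git config --global --get core.excludesfile)
echo "core.excludesfile = $configured"

# In a scratch repository, scratch.o should be hidden by the global
# ignore file, while notes.tex should be listed as untracked.
git init -q "$sandbox/demo"
cd "$sandbox/demo"
touch scratch.o notes.tex
untracked=$(git ls-files --others --exclude-standard)
echo "untracked files: $untracked"
```

On your own machine the two lines that matter are the ‘git config --global --get’ lookup and the ‘git ls-files’ listing; the rest is scaffolding.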

Creating the Shared Repository

(Only the person creating the shared repository need read this section.) Create a new directory in Dropbox or Google Drive, and change directory into it:

cd ~/Dropbox
mkdir NewPaper.git
cd NewPaper.git

Initialise the git repository:

git init --bare

Move to where you would like to have your local copy of the paper (not on Dropbox!) then clone the repository and change directory into it:

cd ~/Desktop
git clone ~/Dropbox/NewPaper.git
cd NewPaper

Do not worry about the warning that we have cloned an empty repository. We are about to populate it before sharing with our collaborators.
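If you would like to rehearse these steps before touching your real Dropbox folder, something like the following can be used. It is a sketch only: the temporary directories under $work are stand-ins for ~/Dropbox and ~/Desktop, and the two checks at the end are an optional extra.

```shell
#!/bin/sh
# Sketch: the same init/clone steps in a temporary directory,
# followed by two sanity checks.
set -eu
work=$(mktemp -d)
mkdir -p "$work/Dropbox/NewPaper.git" "$work/Desktop"

# Create the shared (bare) repository.
cd "$work/Dropbox/NewPaper.git"
git init -q --bare

# Clone it to where the paper will actually be edited.
cd "$work/Desktop"
git clone -q "$work/Dropbox/NewPaper.git"   # warns about an empty repository
cd NewPaper

# The shared copy should be bare (history only, no working files) ...
is_bare=$(cd "$work/Dropbox/NewPaper.git" && git rev-parse --is-bare-repository)
# ... and the clone should remember where it came from.
origin_url=$(git config --get remote.origin.url)
echo "shared repository is bare: $is_bare"
echo "origin points to: $origin_url"
```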

Create a .gitignore file with the following contents:

*.pdf
*.bbl
*.aux
*.log

Create skeleton files for the project (e.g. a tex file and a bib file).  Run pdflatex and check everything works.  Then:

git status
[check that the files produced by pdflatex are ignored (modify .gitignore if this is not the case) and the files that you want are showing up as Untracked files.]
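The same check can be made without reading through the ‘git status’ output by asking git for the two lists directly. This is a sketch in a scratch directory; the file names paper.tex and refs.bib are assumptions, and the empty files created with ‘touch’ merely stand in for what pdflatex and bibtex would produce.

```shell
#!/bin/sh
# Sketch: list which files git will pick up and which it will ignore.
set -eu
work=$(mktemp -d)
git init -q "$work/NewPaper"
cd "$work/NewPaper"

# The .gitignore from above.
printf '%s\n' '*.pdf' '*.bbl' '*.aux' '*.log' > .gitignore

# Stand-ins for the skeleton files and for the files a LaTeX run leaves behind.
touch paper.tex refs.bib paper.pdf paper.bbl paper.aux paper.log

# Untracked files that are NOT ignored: these are what 'git add .' will add.
untracked=$(git ls-files --others --exclude-standard)
# Untracked files that match an ignore pattern.
ignored=$(git ls-files --others --ignored --exclude-standard)

echo "Will be added:"
echo "$untracked"
echo "Ignored:"
echo "$ignored"
```

If a generated file shows up in the first list, add a pattern for it to .gitignore and run the check again.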

Add the files to git then commit:

git add .
git commit
[An editor will open up; type in a message describing this commit, e.g. “Skeleton files”.]

Finally, these changes need to be ‘pushed’ to the shared repository:

git push origin master

The directory NewPaper.git on Dropbox or Google Drive should now be shared with your collaborators so that it shows up on their computers.
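Before sharing the directory, it may be reassuring to confirm that the push really arrived. A rough sketch, run in temporary stand-ins for the Dropbox and Desktop folders: the author details, the file contents and the use of ‘-m’ in place of the editor are assumptions made so that the script runs unattended, and the two ‘git symbolic-ref’ lines are there only because newer versions of git may not call the first branch master.

```shell
#!/bin/sh
# Sketch: create, populate and push, then compare the two repositories.
set -eu
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

work=$(mktemp -d)
shared="$work/Dropbox/NewPaper.git"
mkdir -p "$shared" "$work/Desktop"

# Shared repository; pin its default branch to master to match the text.
(cd "$shared" && git init -q --bare && git symbolic-ref HEAD refs/heads/master)

# Local repository, with the same branch name.
cd "$work/Desktop"
git clone -q "$shared" NewPaper
cd NewPaper
git symbolic-ref HEAD refs/heads/master

# Skeleton files, commit, push.
printf '%s\n' '*.pdf' '*.bbl' '*.aux' '*.log' > .gitignore
echo '% skeleton' > paper.tex
git add .
git commit -q -m "Skeleton files"
git push -q origin master

# The newest commit should now be the same on both sides.
local_head=$(git rev-parse HEAD)
shared_head=$(cd "$shared" && git rev-parse master)
echo "local : $local_head"
echo "shared: $shared_head"
```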

Creating a Local Repository

Each person will work in their own local repository.  The workflow is described in the next section. This section states how to create a local repository.  (The person who created the shared repository, as described in the previous section, already has a local repository, namely ~/Desktop/NewPaper, and can skip this section.)

cd ~/Desktop
git clone ~/Dropbox/NewPaper.git
cd NewPaper

Collaborative Workflow

The following is the same for both the person who created the repository as above, and for the other collaborators who have shared access to the Dropbox / Google Drive folder containing the repository.

Henceforth, it is assumed there is a directory called ~/Desktop/NewPaper that contains the local git repository.  (Because this local repository was created using ‘git clone’, it knows the location of the shared repository.)

The basic philosophy is that the shared repository ~/Dropbox/NewPaper.git should always contain a version that compiles correctly. Therefore, each collaborator should work in their own local repository, and only ‘push’ changes to the shared ~/Dropbox/NewPaper.git repository that are ready for sharing. (This is known as the “centralised workflow”. Alternative workflows may be preferable in more complex situations.) Furthermore, since two people pushing at the same time can corrupt the shared repository (a consequence of using Dropbox instead of a proper git repository), pushes should be infrequent and/or coordinated by email.
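Should the shared repository nevertheless become damaged, one way of recreating it from a local repository is sketched below, using only ‘git init --bare’ and ‘git push’ as before, plus ‘git fsck’ to check integrity. The damage is simulated by deleting the shared directory; the paths, author details and branch pinning are assumptions made for the sake of a runnable example, and in practice the push should come from whichever collaborator has the most recent history.

```shell
#!/bin/sh
# Sketch: check the shared repository, then rebuild it from a local clone.
set -eu
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

work=$(mktemp -d)
shared="$work/Dropbox/NewPaper.git"
mkdir -p "$shared"
(cd "$shared" && git init -q --bare && git symbolic-ref HEAD refs/heads/master)

# A local repository with one commit that has been pushed.
git clone -q "$shared" "$work/NewPaper"
cd "$work/NewPaper"
git symbolic-ref HEAD refs/heads/master
echo 'Draft' > paper.tex
git add paper.tex
git commit -q -m "Skeleton files"
git push -q origin master

# Integrity check: git fsck reports missing or corrupt objects and
# exits with a non-zero status if it finds any.
(cd "$shared" && git fsck)

# Simulate a damaged shared repository by throwing it away ...
rm -rf "$shared"

# ... then recreate it at the same path and push the local history again.
mkdir -p "$shared"
(cd "$shared" && git init -q --bare && git symbolic-ref HEAD refs/heads/master)
git push -q origin master

recovered=$(cd "$shared" && git rev-parse master)
echo "shared repository restored at commit $recovered"
```

Because the new repository sits at the same path, the other collaborators' clones still point at it and need no changes.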

First check everything is ok:

cd ~/Desktop/NewPaper
ls
git status
git log

If you haven’t touched the repository for a while (and you have not made any modifications since the last commit), it is always a good idea to get the latest changes from the shared repository:

git pull
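If you prefer to see what is about to arrive before merging it, ‘git fetch’ followed by ‘git log’ will list the new commits without changing your files. The sketch below plays both roles in temporary directories; the collaborators "alice" and "bob", the file contents and the commit messages are all made up for illustration.

```shell
#!/bin/sh
# Sketch: preview incoming commits with fetch + log before pulling.
set -eu
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

work=$(mktemp -d)
shared="$work/Dropbox/NewPaper.git"
mkdir -p "$shared"
(cd "$shared" && git init -q --bare && git symbolic-ref HEAD refs/heads/master)

# Alice creates the first version and pushes it.
git clone -q "$shared" "$work/alice"
cd "$work/alice"
git symbolic-ref HEAD refs/heads/master
echo 'Introduction' > paper.tex
git add paper.tex
git commit -q -m "Skeleton files"
git push -q origin master

# Bob clones; afterwards Alice pushes one more commit.
git clone -q "$shared" "$work/bob"
echo 'Related work' >> paper.tex
git commit -q -a -m "Add related work"
git push -q origin master

# Bob has not touched the paper for a while: fetch, look, then pull.
cd "$work/bob"
git fetch -q origin
incoming=$(git log --oneline HEAD..origin/master)
echo "Commits waiting to be pulled:"
echo "$incoming"
git pull -q
```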

Then you can start editing files freely, and running pdflatex to compile.  It is a good idea to frequently commit, to allow you to go back to an earlier version should something go wrong.  Committing does not affect the shared repository.

git commit -a
[In the text editor which will open up, enter a description of the changes you made.]

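If you have created a new file (e.g. my_figure.pdf that will be included in the LaTeX document) that you wish to add to the repository, then just prior to committing (e.g. after you have included the figure in the LaTeX file and run pdflatex to check it all works), add it explicitly. Because the .gitignore file created earlier ignores *.pdf, the -f option is needed for a PDF figure (for a file that is not ignored, a plain ‘git add’ suffices):

git add -f my_figure.pdf
git commit -a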
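One thing to watch for: the .gitignore file suggested earlier ignores every *.pdf, and git declines to add a path that matches an ignore pattern unless told to with -f. A small sketch of the behaviour in a scratch repository (the empty files stand in for a real figure and for the compiled paper):

```shell
#!/bin/sh
# Sketch: adding a figure whose name matches an ignore pattern.
set -eu
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

work=$(mktemp -d)
git init -q "$work/NewPaper"
cd "$work/NewPaper"
printf '%s\n' '*.pdf' '*.bbl' '*.aux' '*.log' > .gitignore
touch my_figure.pdf paper.pdf

# A plain 'git add' of an ignored file is refused.
if git add my_figure.pdf 2>/dev/null; then
    echo "added without -f"
else
    echo "plain 'git add' refused: my_figure.pdf matches *.pdf in .gitignore"
fi

# Forcing the add tracks this one file; paper.pdf stays ignored.
git add -f my_figure.pdf
git add .gitignore
git commit -q -m "Add figure"

tracked=$(git ls-files)
echo "tracked files:"
echo "$tracked"
```

Once a file is tracked, later changes to it are picked up by ‘git commit -a’ as usual, even though its name still matches the ignore pattern.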

The interesting bit comes when you go to push your version to the shared repository.  You can try:

git push

If you are lucky, it will work.  Otherwise, someone pushed before you, and there are effectively two branches (versions) that must be merged.  Provided you have been working on a different part of the latex file from your collaborators, the changes can be merged automatically.  Try:

git pull

Note that here, we are basically using ‘git pull’ as a shortcut for ‘git fetch’ followed by ‘git merge’.  If it worked, we are done, otherwise there are two options. Either edit the files that have conflicts (the output of ‘git pull’ will state which files they are) and fix them manually, or run ‘git mergetool’. At this point, it is best to consult the documentation on basic merge conflicts.
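To see what a conflict looks like before meeting one in a real paper, the situation can be staged in temporary directories. In this sketch two made-up collaborators, "alice" and "bob", change the same line; all paths, names and messages are assumptions. The options ‘--no-rebase’ and ‘--no-edit’ are added because newer versions of git may otherwise ask how to reconcile diverging histories or open an editor for the merge message.

```shell
#!/bin/sh
# Sketch: a rejected push, a conflicting pull, and a manual resolution.
set -eu
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

work=$(mktemp -d)
shared="$work/Dropbox/NewPaper.git"
mkdir -p "$shared"
(cd "$shared" && git init -q --bare && git symbolic-ref HEAD refs/heads/master)

# Alice creates the paper; Bob clones it.
git clone -q "$shared" "$work/alice"
cd "$work/alice"
git symbolic-ref HEAD refs/heads/master
echo 'Title: draft' > paper.tex
git add paper.tex
git commit -q -m "Skeleton files"
git push -q origin master
git clone -q "$shared" "$work/bob"

# Both edit the same line; Alice pushes first.
echo 'Title: chosen by Alice' > paper.tex
git commit -q -a -m "Alice edits the title"
git push -q origin master

cd "$work/bob"
echo 'Title: chosen by Bob' > paper.tex
git commit -q -a -m "Bob edits the title"
if git push -q origin master 2>/dev/null; then
    echo "push accepted"
else
    echo "push rejected: the shared repository has commits Bob lacks"
fi

# pull = fetch + merge; here the merge stops with a conflict.
conflicted=""
if git pull -q --no-rebase --no-edit origin master >/dev/null 2>&1; then
    echo "merged automatically"
else
    conflicted=$(git diff --name-only --diff-filter=U)
    echo "conflicts in: $conflicted"
fi

# Resolve by hand (here: simply write the agreed text), then commit and push.
echo 'Title: agreed by Alice and Bob' > paper.tex
git add paper.tex
git commit -q -m "Merge Alice's title change"
git push -q origin master
```

In a real conflict the file will contain both versions between <<<<<<<, ======= and >>>>>>> markers; edit it until the markers are gone and the text is what you want, then ‘git add’ and ‘git commit’ as above.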

In essence then, the workflow is cyclic: ‘git pull’ followed by the editing of files and one or more ‘git commit -a’, then finally ‘git push’ (followed by ‘git pull’ and ‘git push’ again if a merge was required).  At any point, ‘git status’ and ‘git log’ can be run.

  1. L.
    March 3, 2013 at 1:28 pm

    Warning: If two people ‘push’ to the repository at the same time then corruption may occur. Therefore, pushes should be infrequent and/or coordinated via email. (Although ending up in this situation is best avoided, it is possible to recover by recreating the shared repository from the local repositories.)

    You may want to use a proper git hosting platform, such as github (or bitbucket, if you want free private repositories). If you can afford a cheap hosting server, you can even set up a git host there (gitolite).

    The problem with using a shared directory is that, as you state, you may corrupt a repository if there is concurrent access. Dropbox makes it worse, because it is asynchronous (you save to the dropbox folder, then it gets updated in the server, and then it gets downloaded by the client).

    • March 3, 2013 at 2:12 pm

      Thanks for stressing this important point. In the intended context, where pushes might number one or two per week, and the material being pushed is private, it is simplest to acknowledge the risk, and mitigate it by coordinating via email. (This is essentially the same philosophy that says premature optimisation is the root of all evil.) Put another way, academics already use Drop Box to write collaborative papers, yet rarely have access to a private git host. Since coordination via email is required anyway (if two academics simultaneously open the paper on Drop Box then start making changes and saving, a mess will occur), using git does not introduce any additional disadvantages in this context yet for a very small amount of extra effort, brings a number of advantages.

  2. March 12, 2013 at 8:52 am

    Are there any software programs that will do the heavy lifting automatically? While this is useful, it seems incredibly cumbersome for anyone unfamiliar with git. Either way, thank you for outlining this information, it is very informative. 🙂

  3. October 24, 2014 at 1:57 pm

    I don’t understand this. Why wouldn’t the functionality of Git, including branches and pull requests, be how to manage collaboration? This means you need Git on a server somewhere on the Internet. Dropbox is good for some things but using it as the single Git repository is a bit ridiculous.

    • October 25, 2014 at 9:14 am

      The context is the following. Some academics (e.g., physicists, mathematicians, engineers) use LaTeX to write papers. Dropbox has become common for the sharing of the .tex file. Such academics generally do not have the desire to host and manage a git server (especially, not wishing to have to create accounts for each new collaborator on each new paper they write), nor is there any need for branches. The number of people actually editing the document is often essentially just one; one person does most of the writing while the other co-authors periodically check and make suggestions. (Occasionally, a co-author may edit the document directly, but this will be coordinated by email.) With a minimum of overhead, is it possible to go one better than sharing a .tex file using Dropbox? Personally, I like the safety and convenience of version control – accidentally deleting parts of a document while editing will be detected when doing a “git diff” before committing, and sometimes a part of a document is removed intentionally but later it is decided it is better in than out. A lightweight solution for a lightweight problem. (One could argue git is overkill, but the basic commands are simple enough to learn.)
      So in summary, it is simply the basic version control aspects that are of interest; generally there is no need for simultaneous editing and branches etc. In other words, ‘collaborative’ paper writing is generally not as ‘collaborative’ as, say, joint software development.
