Python Git: Learning about Git, Git Repositories and GitPython

Python is a popular, high-level programming language. The language is meant to be simple and readable, on both a small and a large scale. Python 3.0, the first release of the current major version, came out in 2008. It is not backwards compatible with earlier versions and introduced several major new features. Python supports multiple programming paradigms, including object-oriented, structured and functional programming, and, through extensions, aspect-oriented and logic programming. The language manages memory automatically with a garbage collector, and it also supports Unicode. One of Python’s strengths is that it lets you do more with less code than languages like C. Python programming is about finding a single obvious way to carry out a task, instead of choosing between many equivalent ways, as is common in Perl. This makes Python an easy language to learn, even for beginners. You can take our Python programming course to get started with the language.

In this tutorial, we’re going to take a look at Git, Git repositories and GitPython, a Python library that lets you work with Git repositories. You need to be familiar with the basics of Python to follow along.

What is Git?

Git is a distributed version control system. It lets you create and manage Git repositories. The software was developed by Linus Torvalds in 2005. While originally intended for Linux, it has been ported to other major operating systems, like Windows and OS X. Git itself was originally written in C, and you can drive it from Python as well as from other major programming languages like Java, Ruby and C.

The purpose of Git is to manage a set of files that belong to a project. As the project is developed, its files change over time. Git tracks these changes and stores them in a repository, a data structure that keeps the full history of the project in a form that is quick to search and retrieve. If you dislike a change or a set of changes made in the project, you can use Git to roll back those changes.

For example, if you were working on a project in Python, Git would record a snapshot of your source code each time you commit your work. If you don’t like your recent changes, you can use Git to revert to an earlier state of the project.

What is a Git Repository?

A Git repository holds the history of a set of files, and is itself stored in a subdirectory (.git) alongside the files of the project. Git does not require a central repository that is considered to be the main repository, as centralized version control systems do. Instead, every copy of the repository is a complete repository in its own right, containing the snapshots of the project that have been committed so far, each identified by its own name. You can learn more about Git basics in this course.
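To make the layout concrete, here is a minimal sketch that searches upward from a directory until it finds a .git entry, which marks the top of a repository’s working files. The function name find_repository_root is our own invention for this illustration; it uses only the Python standard library and is not part of Git or GitPython.

```python
from pathlib import Path


def find_repository_root(start):
    """Return the closest directory at or above `start` that contains `.git`.

    Returns None when no enclosing directory has a `.git` entry.
    (Hypothetical helper written for this tutorial.)
    """
    current = Path(start).resolve()
    # Check the starting directory first, then each parent up to the root.
    for directory in [current, *current.parents]:
        if (directory / ".git").exists():
            return directory
    return None
```

Calling find_repository_root(".") from anywhere inside a project should return the directory that holds the .git subdirectory, or None if you are not inside a repository.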

A user can copy (clone) a repository and switch between different versions of the project using Git. Because no central repository is required, Git is called a “distributed” version control system.
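If you already have GitPython installed, cloning can be sketched in a couple of lines. The wrapper name clone_repository is hypothetical; Repo.clone_from is the GitPython call that does the work. The import sits inside the function only so that this snippet can be loaded on a machine without GitPython.

```python
def clone_repository(source, destination):
    """Clone the repository at `source` (a path or URL) into `destination`."""
    # GitPython is a third-party package and needs the git executable.
    from git import Repo

    # The clone is a complete repository with its own copy of the history.
    return Repo.clone_from(source, destination)
```

The returned object is a Repo for the new copy, so you can keep working with it straight away.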

What a repository actually stores is a set of commit objects and a set of references to those commit objects. These references are known as heads. The commit objects are the core of the repository: they mirror the state of your project at a point in time, and you use them to revert to an earlier state. Every commit object has a unique SHA-1 name that identifies it. It also contains references to its parent commit objects.
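To see where these unique names come from, here is a minimal sketch of how Git derives an object’s name from its content. The function name git_object_id is hypothetical. It follows Git’s classic SHA-1 scheme, in which a short header (the object type, the content length and a zero byte) is hashed together with the content itself.

```python
import hashlib


def git_object_id(object_type, content):
    """Compute the SHA-1 name Git assigns to an object.

    `object_type` is a string such as "blob" or "commit";
    `content` is the object's raw bytes.
    (Hypothetical helper written for this tutorial.)
    """
    # Git hashes a header of the form "<type> <size>\0" followed by the content.
    header = f"{object_type} {len(content)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# The same bytes always produce the same 40-character name.
print(git_object_id("blob", b"hello world\n"))
```

You can compare the result with what Git itself reports by piping the same text into git hash-object --stdin. Because the name depends only on the content, two identical objects always share one name.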

A repository typically starts with a head called master (newer setups may call it main), and it can contain several heads. The currently active head is written in uppercase letters, as HEAD, while “head” in lowercase letters refers to any head in the repository.
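Git records which head is active in a small text file named HEAD inside the .git directory. The following sketch reads that file directly. The function name read_active_head is hypothetical, and the code assumes an ordinary repository where .git is a directory.

```python
from pathlib import Path


def read_active_head(repository_dir):
    """Return the name of the active head, or None if HEAD is detached.

    (Hypothetical helper; assumes `.git` is a directory in `repository_dir`.)
    """
    head_file = Path(repository_dir) / ".git" / "HEAD"
    content = head_file.read_text(encoding="utf-8").strip()
    prefix = "ref: refs/heads/"
    if content.startswith(prefix):
        # Symbolic reference: HEAD points at a named head such as "master".
        return content[len(prefix):]
    # Otherwise HEAD holds a commit name directly (a "detached" HEAD).
    return None
```

In a freshly created repository this should return the name of the default head, even before the first commit has been made.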

Git for Python: GitPython

You can use Git with Python through the GitPython library. The GitPython library lets you interact with Git repositories at both a high level and a low level. You can install the latest version of the software by typing:

pip install GitPython

Alternatively, you can download it directly from here. To learn more about the structure of Python and Python libraries, we recommend you sign up for this beginner’s Python course.

Creating a Repository

You can use Git commands directly to create a repository:

mkdir directoryname
cd directoryname
git init

This creates an empty repository in the specified directory. You can then add files to it.
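The same steps can be sketched with GitPython instead of the command line. The wrapper name create_repository is hypothetical; Repo.init is the GitPython call that corresponds to git init, and by default it also creates the directory if it does not exist yet. The import sits inside the function only so that this snippet can be loaded on a machine without GitPython.

```python
def create_repository(directory):
    """Create `directory` if needed and initialize an empty repository in it."""
    # GitPython is a third-party package and needs the git executable.
    from git import Repo

    # Equivalent to: mkdir directory; cd directory; git init
    return Repo.init(directory)
```

For example, repo = create_repository("directoryname") gives you a Repo object, and repo.git_dir tells you where the .git subdirectory was created.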

Using GitPython

GitPython lets you create objects that give you access to your repositories. You can use its object model to find commit objects, tree objects and blob objects. GitPython also has other features, like letting you archive a repository as a tar or gzip file, return stats and show log information.

We’ll show you a few basic commands that will help you create objects using GitPython. Please note that these commands are in no way comprehensive – you will need a thorough understanding of the Git software and Python to use GitPython to its fullest capacity. The architecture of the machine you’re on, available system resources as well as the network bandwidth you have access to will also influence how well you can utilize GitPython.

Initializing a Repository Object:

from git import Repo
repo = Repo("/path 1/path 2 /path 3")

This creates a Repo object that represents the repository at the given directory path. You can use the Repo object to find commit objects, trees and blobs. To list the commit objects present in your repository, type the following command:

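# repo.commits() belongs to early GitPython releases; current releases use iter_commits()
list(repo.iter_commits(max_count=10))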

This returns a list of commit objects (up to 10). You can control which branch the commits are read from by passing extra arguments. You can further retrieve tree objects and blob objects with GitPython.
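As a rough sketch of how commits, trees and blobs fit together, the following function lists recent commits along with the files each snapshot contains. The function name describe_recent_commits is hypothetical, and the code assumes a current GitPython release and a repository that already has at least one commit. The import sits inside the function only so that this snippet can be loaded on a machine without GitPython.

```python
def describe_recent_commits(repository_dir, limit=10):
    """Return (short name, summary, file paths) for up to `limit` commits."""
    # GitPython is a third-party package and needs the git executable.
    from git import Repo

    repo = Repo(repository_dir)
    described = []
    # iter_commits() starts at the active head and walks back through parents.
    for commit in repo.iter_commits(max_count=limit):
        # commit.tree is the snapshot; traverse() visits nested trees and blobs.
        paths = sorted(
            item.path for item in commit.tree.traverse() if item.type == "blob"
        )
        described.append((commit.hexsha[:7], commit.summary, paths))
    return described
```

Each tuple holds the first seven characters of the commit’s SHA-1 name, the first line of its message, and the paths of the blobs (files) in that commit’s tree, with the newest commit first.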

If you want a list of all possible usable commands, check out the GitPython source code here. To learn more about writing your own Python programs, you can take this course. And if you want to do something more fun than GitPython, try writing your own games in Python, with the help of this course.