Fossil Book

Documentation
Login

Why source control

What is source control?

A source control system is software that manages the files in a project. A project (like this book or a software application) usually has a number of files. These files in turn are managed by organizing them in directories and sub-directories. At any particular time this set of files in their present edited state make up a working copy of the project at the latest version. If other people are working the same project you would give them your current working copy to start with. As they find and fix problems, their working copy will become different from the one that you have (it will be in a different version). For you to be able to continue where they left off, you will need some mechanism to update your working copy of the files to the latest version available in the team. As soon as you have updated your files, this new version of the project goes through the same cycle again. Most likely the current version will be identified with a code, this is why books have editions and software has versions.

Software developers on large projects with multiple developers could see this cycle and realized they needed a tool to control the changes. With multiple developers sometimes the same file would be edited by two different people changing it in different ways and records of what got changed would be lost. It was hard to bring out a new release of the software and be sure that all the work of all team members to fix bugs and write enhancements were included.

A tool called Source Code Control System (SCCS) was developed at Bell Labs in 1972 to track changes in files. It would remember each of the changes made to a file and store comments about why this was done. It also limited who could edit the file so conflicting edits would not be done.

This was important but developers could see more was needed. They needed to be able to save the state of all the files in a project and give it a name (i.e., Release 3.14). As software projects mature you will have a released version of the software being used as well as bug reports written against it, while the next release of the software is being developed adding new features. The source control system would have to handle what are now called branches. One branch name for example "Version 1" is released but continues to have fixes added to create Version 1.1, 1.2, etc. At the same time you also have another branch which named "Version 2" with new features added under construction.

In 1986 the open source Concurrent Version Control System (CVS) was developed. This system could label groups of files and allow multiple branches (i.e. versions) simultaneously. There have been many other systems developed since then, some open source and some proprietary.

Fossil, which was originally released in 2006, is an easy to install version control system that also includes a ticketing system, a wiki and self hosted web server.

Why version the files?

Why do you want to use a source control system? You won't be able to create files, delete files, or move files between directories at random. Making changes in your code becomes a check list of steps that must be followed carefully.

With all those hassles, why do it? One of the most horrible feelings as a developer is the "It worked yesterday" syndrome. That is, you had code that worked just fine and now it doesn't. If you work on only one document, it is conceivable that you saved a copy under a different name (perhaps with a date) that you could go back to. But if you did not, you doubtless feel helpless at your inability to get back to working code. With a source control system and careful adherence to procedures you can just go back in time and get yesterday's code, or the code from one hour ago, or from last month. After you have done this, starting from known good code, you can figure out what happened.

Having a source control system also gives you the freedom to experiment, "let's try that radical new technique", and if it doesn't work then it's easy to go back to the previous state.

The rest of this book is a user manual for the Fossil version control system that does code management and much much more. It runs on multiple OS's and is free and open source software (FOSS). It is simple to install as it has only one executable and the repositories it creates are a single file that is easy to back up and are usually only 50% the size of the original source.

How to get it

If this has interested you then you can get a copy of the Fossil executable here: Download. There are links to Linux, Mac, and Windows executable on this page. The source code is also available if you want or need to compile it yourself. This web site, containing the content you are reading, is self-hosted by Fossil (see section Multiple User).

Source control description

This next section is useful if you have not used source control systems before. I will define some of the vocabulary and explain the basic ideas of source control.

Check out systems

When describing the older source control systems, like SCCS, I said it managed the changes for a single file and also prevented multiple people from working on the same file at the same time. This is representative of a whole class of source control systems. In these you have the idea of "checking-out" a file so you can edit it, while becoming the current "owner". At the same time, while other people using the system can see who is working on the file, they are prevented from touching it. They can get a read-only copy so they can say build software but only the "owner" can edit it. When done editing the "owner" checks it back in, after which anyone else could work on on it. At the same time the system has recorded who had it and the changes made to it.

This system works well in small groups with real time communication. A common problem is that a file is checked out by some one else and you have to make a change in it. In a small group setting, just a shout over the cube wall will solve the problem.

Merge systems

In systems represented by CVS or Subversion the barrier is not getting a file to work on but putting it back under version control. In these systems you pull the source code files to a working directory in your area. Then you edit these files, making necessary changes. When done you commit or check them back into the repository. At this point they are back under version control and the system knows the changes from the last version to this version.

This gets around the problem mentioned above when others are blocked from working on a file. You now have the opposite problem in that more than one person can edit the same file and make changes. This is handled by the check-in process. There only one person at a time may check in a file. That being the case, the system checks the file and if there are changes in the repository file that are NOT in the one to be checked in the check in process stops. The system will ask if the user wants to merge these changes into his copy, after correcting manually for areas of overlapping change. Once that is done the new version of the file can be checked in.

This type of system is used on large projects like the Linux kernel or other systems where you have a large number of geographically distributed contributors.

Distributed systems

The representatives of two major systems we have described thus far are centralized. That is there is only one repository on a single server. When you get a copy of the files or check in files, it all goes to just one place. These work and can support many, many users. A distributed system is an extension of this where it allows the repositories to be duplicated and has mechanisms to synchronize them.

With a single server, users of the repository must be able to connect to it to get updates and to check in new code. If you are not connected to the server you are stuck and cannot continue working. Distributed systems allow you to have your own copy of the repository to continue working and, when back in communication, to synchronize with the server. This is very useful where people take their work home and cannot access the company network. Each person can have a copy of the repository, continue working, and re-sync upon return to the office.

Common terms

The following is a list of terms I will use when talking about version control or Fossil.