Alexander Makarov

About The Author

Alexander Makarov is a professional Web developer in Russia. He is also the person behind RMCreative, a Russian blog dedicated to Web developers, designers and everyone interested in how the Web is built.

SVN Server Admin Issue: Fix It!

A few months ago, Anton Isaykin in collaboration with the company 2comrades discovered a serious security problem that is quite typical of big projects (we do not name names here). To test it, they obtained the file structures and even the source code of about 3320 Russian websites and some major English-language websites. Serious problems like this aren't supposed to exist nowadays. Every serious or visible exploit is found and fixed quickly. But here we will show you something simple and ordinary yet quite dangerous.

What was found is not actually a vulnerability, because the behavior is documented. What we really wanted to show is that major websites and even unique services are affected (SM can't list them, sorry). That shows again that bad developer habits are the most dangerous vulnerability we can imagine.

What Is It?

Almost every developer has used or is using a version control system such as SVN. SVN is an advanced tool for managing source code and is used by teams consisting of anywhere from two to hundreds of developers. In its architecture, SVN stores some meta data in a hidden sub-directory (called .svn) of every directory. One of the files in there, named entries, is a list of all of the files and directories contained in the folder where .svn is located.

[Image: alistapart.com source code]

It also has a link to the repository itself, developer log-ins, file sizes and dates. That's a problem right there, isn't it? So, if a project was developed using SVN, we could go to draftcopy.ru/.svn/entries and see the project's root file structure, with all of this information.

And we could go even further. In the same .svn directory are some text-base directories containing the latest versions of all project files. Moreover, these files carry the non-standard extension .svn-base (for example, index.php.svn-base). So, the files are not executed by PHP, Ruby, Python or Perl; the server sends their source code to the browser as is!

http://draftcopy.ru/.svn/text-base/index.php.svn-base
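To check one of your own websites, the two addresses described above can be built from the site root and a file name. This is a minimal sketch, not the authors' tool; the function name `svn_metadata_urls` is hypothetical, and it only covers files in the root directory.

```python
from urllib.parse import urljoin


def svn_metadata_urls(site_root, filename):
    """Build the .svn URLs for a file that lives in the site's root directory.

    Returns a tuple: (URL of the entries file, URL of the file's text-base copy).
    """
    # Make sure the base ends with a slash so urljoin appends to it.
    base = site_root if site_root.endswith("/") else site_root + "/"
    entries_url = urljoin(base, ".svn/entries")
    text_base_url = urljoin(base, ".svn/text-base/" + filename + ".svn-base")
    return entries_url, text_base_url


if __name__ == "__main__":
    for url in svn_metadata_urls("http://draftcopy.ru", "index.php"):
        print(url)
```

If either address returns real content on your server, the metadata is publicly readable.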

We should note that not all websites use SVN this way. We were not able to get the source code in every case.

When we realized that this problem had persisted for almost nine years, we decided to create a crawler to check websites with Russian top-level domains and major .com websites to collect some statistics. But before we report the results, let's go over how to prevent such a thing from happening to your own project.

How To Defend Yourself

You can solve the problem in different ways. The simplest solution is to deny HTTP access to SVN meta data directories in the Web server configuration.

Solution for nginx

location ~ /\.svn/ {
    deny all;
}

nginx has no global location support, so we have to apply this solution to every server section. For the solution to stick, you have to apply it before any other locations with regular expressions. In most cases, you can use the first location.
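One detail of the location pattern deserves attention: in a regular expression, a bare dot matches any character, while an escaped dot (\.) matches only a literal dot. The sketch below uses Python's re module as a stand-in for the Web server's regex matching (an assumption made for illustration; the request paths are hypothetical) to show the difference.

```python
import re

# The same pattern with a bare dot and with the dot escaped.
BARE_DOT = re.compile(r"/.svn/")
ESCAPED_DOT = re.compile(r"/\.svn/")


def matches(pattern, path):
    """Return True if the pattern is found anywhere in the request path."""
    return pattern.search(path) is not None


# Hypothetical request paths, for illustration only.
paths = [
    "/.svn/entries",
    "/img/.svn/text-base/logo.png.svn-base",
    "/docs/xsvn/readme.html",
    "/index.php",
]

if __name__ == "__main__":
    for path in paths:
        print(path, matches(BARE_DOT, path), matches(ESCAPED_DOT, path))
```

Both patterns block the .svn paths, but only the bare-dot version also blocks an unrelated directory such as /docs/xsvn/.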

Solution for Apache

# Apache 2.2 access-control syntax; on Apache 2.4 the equivalent is "Require all denied".
<Directory ~ ".*\.svn">
    Order allow,deny
    Deny from all
    Satisfy All
</Directory>

Apache is simpler. Just add the lines above to httpd.conf, which will secure all of your projects.
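After changing the server configuration, it is worth confirming that the metadata is really unreachable. The following is a rough sketch under stated assumptions: the helper names `http_status` and `svn_entries_blocked` are hypothetical, and a 403 or 404 response is treated as "blocked". Servers that answer 200 for missing files would need a stricter check.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status code of a GET request to url.

    Connection failures (urllib.error.URLError) are not handled here
    and propagate to the caller.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx and 5xx responses.
        return err.code


def svn_entries_blocked(site_root, get_status=http_status):
    """Return True if <site_root>/.svn/entries answers with 403 or 404."""
    url = site_root.rstrip("/") + "/.svn/entries"
    return get_status(url) in (403, 404)


# Example call against a placeholder host (not run automatically):
#   svn_entries_blocked("http://www.example.com")
```

The `get_status` parameter exists so that the check can be exercised without network access, for example with a stub that returns a fixed status code.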

Solution Using SVN

Blocking access in the Web server is the cure, but every doctor will tell you that prevention is easier and cheaper than treatment. The best solution, then, is to keep .svn out of your Web root altogether. To do this, deploy with svn export, which writes a clean copy of the project without any .svn directories. It's a common good practice, but apparently many developers do not follow it.
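To see whether a deployed copy still carries SVN meta data, you can walk the Web root and list every .svn directory in it. This is a small illustrative sketch; the function name `find_svn_dirs` and the example path are hypothetical.

```python
import os


def find_svn_dirs(web_root):
    """Return a sorted list of all .svn directories found under web_root."""
    found = []
    for current_dir, subdirs, _files in os.walk(web_root):
        if ".svn" in subdirs:
            found.append(os.path.join(current_dir, ".svn"))
            # Do not descend into the metadata directory itself.
            subdirs.remove(".svn")
    return sorted(found)


# Example call with a placeholder path (not run automatically):
#   for path in find_svn_dirs("/var/www/example"):
#       print(path)
```

An empty result means the tree contains no .svn directories, as would be expected from a copy produced by svn export.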

[Image: svn console]

Let The Robot Do The Work

As we said, we decided to check the Russian-language Internet for this problem. We established proxy servers, developed a crawler and got a list of .ru domains.

The first crawler version ran for two weeks, fetching websites one by one in a single thread. When it finished, we had found about 3000 affected websites and collected about 100 GB of source code. The problem was that the crawler downloaded every resource, including images and JavaScript, even if the HTTP response code was 500 rather than 200. Also, some servers responded with a 200 code even when the files were not actually there.

The second version was faster. It was multi-threaded and launched from two servers. Also it could work with HTTP response codes and check meta data syntax. This time, the entire .ru zone was covered in four days. Next, we wanted to check .com, but that would have taken about two years with our resources (there are more than 700,000,000 .com domains, compared to only 2,000,000 .ru domains).
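The two improvements described above (checking the response code and the meta data syntax, and running checks in parallel) can be sketched roughly as follows. This is not the authors' crawler. The layout test is an assumption: it expects either the older XML form containing a wc-entries element, or the plain-text form that starts with a format number followed by an entry of kind "dir". The `fetch` callable and the function names are hypothetical and supplied by the caller.

```python
from concurrent.futures import ThreadPoolExecutor


def looks_like_svn_entries(status_code, body):
    """Heuristic: does this response look like a real .svn/entries file?"""
    if status_code != 200:
        return False
    text = body.lstrip()
    # Assumed older layout: an XML document with a wc-entries element.
    if text.startswith("<?xml") and "wc-entries" in text:
        return True
    # Assumed plain-text layout: format number, empty name line, then "dir".
    lines = body.split("\n")
    return (
        len(lines) > 2
        and lines[0].strip().isdigit()
        and lines[2].strip() == "dir"
    )


def scan(domains, fetch, workers=8):
    """Check many domains in parallel.

    fetch(url) must return a (status_code, body_text) tuple.
    Returns a dict mapping each domain to True (exposed) or False.
    """
    def check(domain):
        status, body = fetch("http://" + domain + "/.svn/entries")
        return domain, looks_like_svn_entries(status, body)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(check, domains))
```

Validating the body is what filters out servers that answer 200 with an error page instead of the requested file.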

[Image: apache.org source code]

So, we partnered with a good C developer, Andrey Saterenko, who implemented a really fast daemon that could do the job 200% faster. Unfortunately, the summer ended and we had jobs to do, so we decided not to check the entire .com zone. Instead, we picked the top websites based on Alexa statistics and threw in some famous websites that we really like.

We had to alert the developers involved in all of these affected projects before we published this article. We first sent letters to the major Russian services, such as Yandex, Rambler.ru, Mail.ru, Opera.com, Rbk.ru, 003.ru, Bolero.ru and habrahabr.ru. Then the remaining 3000+ Russian websites received their letters. After that, we sent emails to the top .com websites.

Some Numbers

  • Domains checked: 2,253,388.
  • Projects affected: 3332.

[Image: php.net source code]

We have no detailed statistics on how many projects have been fixed since our report. Perhaps we'll publish that information in two weeks. We received replies from six major Russian projects. One .com project sent us thanks. We got an email from Wikimedia, and Rasmus Lerdorf of PHP.net emailed us (immediately!). Both projects are open source, so their source code couldn't be "stolen," but we emailed them just in case. Ten projects ignored our emails, and four fixed the problem without replying.

Fun fact: approximately 10 websites with "hack" or "secure" in their domain name are actually not secure.

Credentials

All of the source code was printed and then burned. Don't ask us to sell it or publish it. We don't have it anymore. Please check if your favorite website is affected. If it is, write a letter to its support team, with a link to this article. If this article has helped you find and fix the problem, please send an email to sam [at] rmcreative [dot] ru. We'll be glad to read it.

About the authors

The original Russian article was written by Anton Isaykin in collaboration with the company 2comrades that specializes in Web project analytics, development and support.

Anton Isaykin is a professional PHP/Python developer in Russia who specializes in high-load projects and architecture. The article was translated and adapted by Alexander Makarov, a professional Russian Web developer, who is behind RMCreative, a Russian blog dedicated to Web developers, designers and everyone interested in how the Web is built.

Smashing Editorial (al)
