A Response to "List of Possible Weaknesses in Systems to Circumvent Internet Censorship"
By Paul Baranowski (paul at paulbaranowski.org)
November 11th, 2002
This paper is a response to "List of Possible Weaknesses in Systems to Circumvent Internet Censorship" (LOPWISTCIC) by Bennett Haselton. That paper has started a much-needed discussion of the possible attacks against anti-censorship technology. This response addresses the arguments presented in the version of LOPWISTCIC dated 11/7/2002 and hopes to continue the discussion.
This author would like to emphatically thank Mr. Haselton for his paper. There are many people who share his concerns. To date there have been few, if any, papers that describe the counter-arguments to circumvention technology so precisely, though there has been plenty of discussion about them. This response hopes to reshape the current thinking about circumvention technologies.
LOPWISTCIC is arranged as follows: general arguments and attacks against circumvention technology, followed by an overview of current systems and arguments as to the threats against them. The first part of the article tries to establish a set of principles that are then used to attack the current crop of anti-censorship technology. Thus the second half of the article relies almost entirely on the first half holding up to scrutiny. This paper debunks the arguments made in the first half of the original paper, and thus, for brevity, the responses to the second half are omitted.
All non-italicized statements are
the words of Paul Baranowski, and the statements in italic are the
words of Bennett Haselton. A horizontal line is placed between each
argument/counter-argument.
List of Possible Weaknesses in Systems to Circumvent Internet Censorship
Bennett Haselton
First draft completed 11/7/2002; continually updated
A wide variety of systems, including programs by the names of Triangle Boy, Peek-A-Booty, Six/Four, and CGIProxy, have been proposed for circumventing Internet censorship in countries such as China and Saudi Arabia, with no clear winner emerging as the single best anti-censorship solution.
One reason is that there hasn't been much
discussion about how well these
systems would hold up in
response to various types of attacks that could be mounted by the censors.
The worst thing that could happen
would be for an
anti-censorship system to be widely deployed, with volunteers all over the world
running software to assist in the effort and people in China and other
censored countries using the software
every day to beat
censorship, when suddenly the censors find a flaw that can undermine and block
the whole system.
One of the arenas of battle that we enter in the anti-censorship wars is the popular awareness of censorship itself. Many people are not aware of the censorship occurring in places like China and Saudi Arabia. The more people know about it, the more pressure will be on all parties to do something. Thus if there are thousands of people volunteering their machines all over the world to the cause of anti-censorship, this implies that all of those people will be aware that Internet censorship is occurring. This is a victory for the side of free speech, no matter what eventually happens to the network. Finding a flaw in a deployed system also works in favor of raising the issue to the general public. A flaw like this would most likely be widely publicized in the media, and would continue to mount pressure against censorship. We must also keep in mind that if a flaw is discovered and exploited by a particular country, it does not render the anti-censorship software useless. The software would still work in all the other countries that have not taken advantage of the weakness.
Even if the flaw is publicized, it still remains for these other countries to exploit it – something that most likely requires resources (such as educated computer scientists) working on the problem. This taxes the censoring party in manpower and money.
This is a key point that will be expanded upon later.
First let us confront the hype: out of 33 million Internet users, there has been a single case of someone being arrested after downloading dissident information. This was not the only charge, however; he was also in contact with other dissidents overseas, which is how he must have been noticed in the first place. (For details see: http://www.inq7.net/inf/2002/aug/09/inf_3-1.htm)
The various censor-resistant
systems have tens of thousands of users per day, and none of these
users have yet been arrested. In authoritarian countries like China, if
the government doesn’t like you, they arrest you and throw you in jail.
But they first have to not like you. Simply knowing that two computers
had a connection between each other does not give them enough reason to
not like you when there are so many more important things to deal with.
With that said, I think that all possible precautions should be taken with anti-censorship systems in order to protect the user. However, this is mostly a matter of user education. Each user must be informed of the risks and be willing to accept them before using the system. It is very similar to signing up for the military: when you sign up, you know the consequences and you fully accept them. You sign up because the values you hold in your heart are more important than the risk.
Plus, if the traffic is detectable, the
censors can also trace it to the sites outside their country which are
helping defeat Internet censorship,
and add those sites to a
permanent blacklist. Even if those sites later upgrade to a more secure,
undetectable version of the software,
they will still be blacklisted, and it may be prohibitive for them to move to a new location to get
around the blacklist.
The more sites a country blacklists, the more pressure it puts on itself. With each site that is blocked, the former users of that site get a little more unsettled. Perhaps a little less commerce flows between the two countries. The more sites that have dual uses, such as a business site running a proxy server, the more economic pressure the country feels by blocking them. From this comes the first law of anti-censorship technology: the more the anti-censorship technology is tied to commerce, the harder it is to block. Making connections with SSL, which online commerce also depends on, is an example of this principle. An example of extending this law would be to get circumvention software running on as many business sites as possible.
So, it is a high priority to think of
possible attacks against a system before the system is deployed.
The exact opposite of this statement is the strategy of Dynaweb, a company that distributes proxy IP addresses to whoever visits their web page. Their strategy is to try something until it is blocked by China, then try a new method until that is blocked, and so on until their solution is complex enough that China will no longer block it. This is the second law of anti-censorship technology: a circumvention technology will be blocked up to the limit of resources the enemy is willing to invest in blocking the technology. This is related to the third law of anti-censorship technology: the amount of effort an enemy will put into defeating a system is proportional to the number of users of that system. It should be noted that none of Dynaweb’s users have encountered any sort of retribution.
General Issues
The Censorware Designers Can Always Make the Last Move
When designing a circumvention system, it's not enough to design it so that it can get around the existing censorship methods already in place. It should also not be possible for the censors to defeat the circumvention system by making an easy change to their censorship architecture.
As a trivial example, suppose that a server like Anonymizer.com allows
users in China to retrieve banned pages. (The real Anonymizer site is
blocked, of course, but suppose someone installs similar software on
their own server, which the Chinese censors don't know about.) But
since the Chinese block users from downloading the pages containing the
string "Falun Gong", the software replaces characters in an HTML
document with their HTML character code equivalents, so that "Falun
Gong" might be replaced with "F&#97;lun G&#111;ng", which a Web
browser will display as "Falun Gong". This will temporarily defeat
China's keyword filtering. But if this method becomes widely known (as
it would have to be, in order to be useful to a significant
proportion of Internet users in China), the Chinese censors can modify
their software to take HTML character codes into account when scanning
for banned strings.
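The character-code trick and the censor's countermeasure can be illustrated with a minimal Python sketch. The function names, the banned-term list, and the choice of which letters to encode are all hypothetical; only `html.unescape` is a real standard-library call. It is meant to show why the countermeasure is cheap, not how any real filter is built.

```python
import html

BANNED_TERMS = ["Falun Gong"]  # hypothetical keyword list

def obfuscate(text, every=2):
    """Replace every `every`-th letter with its numeric HTML character code."""
    out = []
    for i, ch in enumerate(text):
        if ch.isalpha() and i % every == 1:
            out.append(f"&#{ord(ch)};")
        else:
            out.append(ch)
    return "".join(out)

def naive_filter(page):
    """Keyword scan over the raw bytes of the page, as sent on the wire."""
    return any(term in page for term in BANNED_TERMS)

def updated_filter(page):
    """Same scan, but character codes are decoded first, as a browser would."""
    decoded = html.unescape(page)
    return any(term in decoded for term in BANNED_TERMS)

page = obfuscate("News about Falun Gong")
print(page)
print("naive filter blocks:", naive_filter(page))
print("updated filter blocks:", updated_filter(page))
```

The updated filter differs from the naive one by a single decoding step, which is the sense in which this particular obfuscation invites an easy last move by the censor.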
The authors of the circumvention system
might be prepared to engage in
an "arms race" with the censors,
where each time the censors find a way to detect and block the last
version of their software, the authors release a new version that avoids the
old weakness. But if the circumvention
scheme depends on
volunteers all over the world running
the software on their
computers or their Web sites (as all of the existing proposed systems do),
then each time a new version is released, all volunteers will have to
upgrade, which may take more ongoing
time and effort than they want to give. If software is required on the end-user's computer inside the
censored country, then they will have to upgrade as well -- and if the
circumvention system that they were
using before is now blocked and
made obsolete, they may not even
be able to obtain the upgrade
without a lot of effort. Worst of all, each time the censors get ahead in
the arms race by finding a way to detect and trace the circumvention
traffic, they can detect and punish anyone within their country that they
catch using the software -- and those are the users who pay the
biggest price for the "arms race".
Why is it "bad" for an
anti-censorship application to suddenly stop working? The argument
given proposes that if someone uses the software and it is subsequently
blocked, it is worse than never using the software in the first place.
Yet for the amount of time that the system was working, the individual
was getting uncensored information – and this is absolutely priceless.
Not only that, they now have experience with the software and if the
program becomes available again they can use it with ease because they
already have experience with a previous version.
A separate argument is made above saying that being able to block a circumvention technique also allows detection of who is using the technique. This supposition is false. Blocking a technique and detecting and logging the users of that technique are entirely different processes. For example, Dynaweb used an SSL certificate to encrypt traffic to its web site. China responded by revoking that certificate, thus temporarily blocking their server from within China. It would be an entirely separate process to keep track of those people who were trying to access Dynaweb’s server. Blocking the traffic in no way requires record-keeping. Logging any such network activity would require vast amounts of data storage and data mining capabilities and would provide very little bang for the buck for the censoring party.
When deploying a circumvention system, you
can never be completely certain
that there is no way to
detect and block it. But for the reasons described above, it's irresponsible
to deploy a system where you know of an easy way to detect it and it's
only a matter of time before the censors figure it out. The best you
can do is release a circumvention system that doesn't have any known
fatal weaknesses.
The fatal weakness in the above
argument is the line "…it’s only a matter of time before the censors
figure it out." This assumption turns out to be false. The proof is by
contradiction, so we assume the original statement to be true. This
implies that the censor will always be able to, at some time in the
future, upgrade their censor system to defeat any new anti-censorship
technology that comes along, which in turn implies they always have the
resources (money, manpower, etc) to deal with every new anti-censorship
technology. With this reasoning, 1,000,000 different anti-censorship
systems could be deployed every year and a country like Cuba would be
able to keep up and block them all. No censor has resources on that scale, so the original statement cannot hold in general.
In reality, the outcome depends on which side is willing to invest more. If new anti-censorship systems appear faster than China can block them, then anti-censorship wins. Otherwise, censorship wins. A good analogy would be the Cold War with the Soviet Union. The U.S. was able to outspend the U.S.S.R. on its military, which eventually led to a victory for the United States.
The next statement, "it's irresponsible to deploy a system where you know of an easy way to detect it", rests on two assumptions: 1) detection is equivalent to arrest, and 2) if it is easy to detect, it will be detected and blocked. Again, these assumptions are wrong for the same reason: we must take into account the resources of the opponent. If there is a weakness that is never exploited, it is not a weakness.
The "Human Shield" Fallacy
This goes something like, "We built an
anti-censorship system that hides
secret traffic in ICQ messages.
The Chinese won't dare to block ICQ -- it's a valuable tool that
increases international understanding and friendship among nations, and
besides, blocking it would violate RFC 9,234,436." Even if the Chinese
censors aren't blocking ICQ now, if ICQ became the most popular means of
circumventing their government's
censorship, it would likely be blocked very quickly.
Here the author implies the third law of anti-censorship: the amount of effort an enemy will put into defeating a system is proportional to the number of users of that system. However, in many cases the first law works against the censor here: the more a service is used by businesses, the harder it is to block. Blocking the service has the negative consequence of stifling the economy. We saw this happen in China when they blocked Google. Many businesses complained to the government because they relied on Google for their day-to-day activities. Access to the site was quickly restored.
The only protocols that the Chinese would probably never block, at least
not without rendering the Internet essentially useless for the Chinese,
would be Web traffic and email. The Chinese government must believe
that Internet access provides some benefit to their country, or they
wouldn't allow it at all, and blocking Web traffic or email would have
a staggering impact. But any other protocol would probably get blocked
very easily if it were widely used as a means of covertly sneaking
around the firewall.
Assuming That Censors Lack the Resources To Monitor All Traffic Effectively
First of all, this only applies to "monitoring" algorithms that are
processor-intensive or require traffic to be stored somewhere where it
can be analyzed. Simply blocking access to a Web site at the Great
Chinese Firewall is trivial, and the Chinese can block as many sites as
they want. But if they wanted to, for example, block pages that a user
visits 90% of the time right after being denied access to a blocked
site, that would require some storage of usage history patterns.
Moore's Law says that the amount of
computing power you can buy for a fixed cost doubles every 18 months.
The number of Internet users in any given country can't grow that
fast (it would quickly exceed the total number of people in the
country), so the amount of computing power available to monitor any
individual user's Internet traffic will essentially double every 18 months as
well. In order for a circumvention
system to stand the
test of time, it should take into account the potential increase in
governments' power to monitor traffic.
While processing power doubles every 18 months, Nielsen’s Law states that available bandwidth grows at 50% per annum, which compounds to roughly 1.8 times over the same 18 months. The number of users plays no part in this, as the amount of bandwidth used increases with the amount available. Thus the volume of traffic to be inspected grows nearly as fast as the computing power available to inspect it, and the censor gains far less monitoring capacity over time than Moore's Law alone would suggest.
In the meantime, even if a country can't monitor all traffic effectively,
it could decide to only monitor, say, 5% of its users at any given
time. If the censors can spot circumvention traffic and trace it
back to specific users within their country, then each user would have
an unacceptably high 1-in-20 chance of being caught each time they used
the circumvention protocol. Even if the act of circumvention could not
be traced back to a specific user within the country, the circumvention
site outside the country would be permanently blacklisted so that it
could not be used in the future.
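The 1-in-20 figure can be turned into a cumulative risk with a short calculation. The sketch below assumes that each use of the circumvention protocol is sampled independently with the same probability, which the original text does not state, and it models only detection, not arrest. The function name is hypothetical.

```python
def chance_detected(sample_rate, uses):
    """Probability of being sampled at least once in `uses` independent uses."""
    return 1.0 - (1.0 - sample_rate) ** uses

for uses in (1, 5, 14, 50):
    print(uses, "uses:", round(chance_detected(0.05, uses), 3))
```

Under that independence assumption the cumulative chance of being sampled passes one half at the fourteenth use. Whether being sampled leads to any consequence depends on what the censor does with the data, which is the question of resources argued elsewhere in this response.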
Also, different countries have different abilities to monitor and censor the
Internet traffic over their networks. China uses centralized
filtering at their national border, blocking traffic to specific sites,
and only
in late 2002 did they begin more fine-grained blocking, such as
blocking traffic containing certain keywords. Many Chinese users also
access the Internet through public cyber cafes, where violations would
be virtually impossible to trace back to a specific individual.
(Although as of October 2002, the Chinese are now requiring the use of ID cards
to sign on to the Internet in licensed Internet cafes, so that an attempt
to access a blocked site can be traced back to the individual user --
but unlicensed cafes still thrive and would not be bound by the new
rule.) Saudi Arabia, on the other hand, uses a network of proxy
servers supplied by SmartFilter, which allows more flexible blocking of
Web access (specific keywords, URLs, search patterns, default blocking
of sites accessed by IP address, etc.), and any suspicious activity can
be traced back to an individual user's ISP account. The Saudi filtering
system is also distributed across multiple proxy servers, allowing for
more sophisticated, processor-intensive analysis of users' usage
patterns.
So it would seem that since China's
filtering is much less sophisticated than Saudi Arabia's, a system could
be deployed that would be secure enough to go undetected in China but
not in Saudi Arabia. The problem
is that once the system is
released and gains a reputation for helping to circumvent Internet
censorship in China, it would be very hard to stop users in Saudi Arabia
from using the same system if it were at all possible for them to
obtain it. Once you open the floodgates,
users are unlikely to
understand the nuances of why a system would be safe to use in one network
architecture but not another.
Users are likely to misunderstand the safety of a particular system only when they are not told where it is safe to use. This is a user education problem, and the general statement above is a red herring.
Traffic-Flow Analysis
If the censors can track the history of
Internet accesses by user or by IP address, one thing they can do is
watch what site a person usually connects to, immediately after being
denied access to a blocked site. Whatever it is that the user does to
get around a block -- whether visiting
a particular Web site, or
sending email to a certain address, or going on a chat network -- it will
look suspicious if they always do it right after being blocked from
something.
The technique proposed has many technical assumptions built into it about how the circumvention technology will work. The first assumption is that the user will be visiting a certain web site for their circumvention needs. As the author himself pointed out before, the circumvention technique may not be via the web, but possibly another service such as ICQ. The second assumption is that a user will be the one requesting web pages, and not an automated script. The proposed system would generate thousands of false positives, which somehow must all be filtered through by hand. The third assumption is that only banned web pages will be fetched via the circumvention technique. Most current anti-censorship systems route ALL data through the anti-censorship software, thus making the proposed technique powerless. Finally, the author provides no hard data on the browsing habits of users to support his assumption that they would browse to a circumvention web page immediately after getting blocked. Without this key data, the rest of this section is rendered moot.
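To make the disputed heuristic concrete, here is a minimal Python sketch of what "watch what a user connects to right after a block" might look like. Everything in it is an assumption for illustration: the log format, the function names, the thresholds, and the example hostnames. It is not a description of any deployed censorship system.

```python
from collections import Counter, defaultdict

def next_after_block(events):
    """events: (user, destination, was_blocked) tuples in time order.
    Returns {user: Counter of destinations reached right after a block}."""
    followers = defaultdict(Counter)
    pending = set()  # users whose most recent request was blocked
    for user, dest, blocked in events:
        if user in pending and not blocked:
            followers[user][dest] += 1
        if blocked:
            pending.add(user)
        else:
            pending.discard(user)
    return followers

def suspects(followers, min_hits=3, min_share=0.9):
    """Flag users who nearly always go to the same place after a block."""
    flagged = []
    for user, counts in followers.items():
        dest, hits = counts.most_common(1)[0]
        if hits >= min_hits and hits / sum(counts.values()) >= min_share:
            flagged.append((user, dest))
    return flagged

log = [
    ("u1", "banned.example", True), ("u1", "proxy.example", False),
    ("u1", "banned.example", True), ("u1", "proxy.example", False),
    ("u1", "banned.example", True), ("u1", "proxy.example", False),
    ("u2", "tunnel.example", False), ("u2", "tunnel.example", False),
    ("u3", "banned.example", True), ("u3", "search.example", False),
]
print(suspects(next_after_block(log)))
```

In this toy log only `u1` is flagged. User `u2`, who sends all traffic through the circumvention software, never triggers a block and is invisible to the heuristic. A user who habitually returns to a search engine after every block would be flagged in exactly the same way as `u1`, which is the false-positive problem raised above.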
The safest way to defeat this detection would be to educate users -- tell
them not to always go to a circumventor site immediately after being
denied access to a blocked site. Unfortunately, it's notoriously
difficult to get software users to follow any guidelines that are not
actually enforced by the software. It would be better to display a
warning that the user always sees when using the circumvention program.
If the circumvention method is to connect to a "circumventor" Web site
where the user types in the URL of a page that they want to see, then
the page itself could contain a warning: "Do not always visit this page
right after being denied access to a blocked site". Of course, by that
point it's too late, since the connection to the circumventor site has
already been made. If the circumvention method uses software on the
user's side (which connects to a server running somewhere outside the
censored regime), this is safer because the software itself can be
configured to display the message before the connection is initiated.
Also, the more outside circumventor servers that the user knows about (or, the more circumventor servers are being stored by their client software), the fewer times the user will need to visit each one. If the user simply knows about a list of several circumventor Web sites, then unfortunately they will probably connect to the same one each time they want to view a blocked page, just out of habit. If they are using client-side software, though, the software can be configured to properly rotate through the available servers. But in both cases, the real problem with multiple circumventor servers is that if it's hard enough to make sure each user knows about at least one unblocked server, it's even harder to make sure each user knows about several of them. The more users you distribute each server location to, the greater the chance that the censors will find it, block it, and then track down anybody who attempts to connect to it.
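Client-side rotation of the kind described here could look roughly like the following Python sketch. The class name, its methods, and the example hostnames are hypothetical; the sketch only shows one simple way to spread connections evenly, by shuffling the known servers and using each once before any is reused.

```python
import random

class ServerRotation:
    """Hands out known circumventor servers so that none is favored."""

    def __init__(self, servers, seed=None):
        self._servers = list(servers)
        self._rng = random.Random(seed)
        self._queue = []  # servers not yet used in the current cycle

    def next_server(self):
        if not self._servers:
            raise LookupError("no unblocked servers are known")
        if not self._queue:
            # Start a new cycle in a fresh random order.
            self._queue = self._servers[:]
            self._rng.shuffle(self._queue)
        return self._queue.pop()

    def mark_blocked(self, server):
        """Forget a server that has stopped working."""
        self._servers = [s for s in self._servers if s != server]
        self._queue = [s for s in self._queue if s != server]

rotation = ServerRotation(["a.example", "b.example", "c.example"], seed=1)
picks = [rotation.next_server() for _ in range(6)]
print(picks)
```

Over six connections each of the three servers is used exactly twice, in an order that changes from cycle to cycle. This removes the habit problem, though it does nothing about the harder problem of distributing several server locations to each user in the first place.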
Using Steganography To Hide Data Inside "Noise"
In any communication channel, "noise" can be considered the extra data
being transmitted that isn't relevant to the information being sent.
The most common example is the static "noise" on a radio communication
line, but the random graininess in an image could be considered "noise"
as well. In fact, steganography is usually discussed in terms of hiding
data inside an image by changing the least significant bits
representing the color of each pixel. For example, if
you have an image measuring 100 x 100 pixels, and the image is saved in
24-bit color so that each pixel has a red, green, and blue value
represented by an 8-bit number, then you could alter each of the 30,000
"least significant" bits to store a 3,750-byte message, without
drastically changing the appearance of the image.
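Least-significant-bit embedding of this kind can be sketched in a few lines of Python. The sketch treats the image as a flat sequence of 8-bit color values rather than a real image file, and the function names and the synthetic carrier are assumptions made for illustration.

```python
def embed(pixels, message):
    """Store each bit of `message` in the lowest bit of successive color values."""
    bits = [(byte >> (7 - i)) & 1 for byte in message for i in range(8)]
    if len(bits) > len(pixels):
        raise ValueError("message too long for this carrier")
    out = bytearray(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit  # clear the lowest bit, then set it
    return bytes(out)

def extract(pixels, length):
    """Read `length` bytes back out of the lowest bits."""
    out = bytearray()
    for b in range(length):
        value = 0
        for i in range(8):
            value = (value << 1) | (pixels[b * 8 + i] & 1)
        out.append(value)
    return bytes(out)

WIDTH, HEIGHT, CHANNELS = 100, 100, 3
carrier = bytes((i * 7) % 256 for i in range(WIDTH * HEIGHT * CHANNELS))
capacity_bytes = len(carrier) // 8  # one hidden bit per color value

stego = embed(carrier, b"hello")
print("capacity in bytes:", capacity_bytes)
print("recovered:", extract(stego, 5))
```

Each color value changes by at most one step out of 256, which is why the altered image is hard to tell apart from the original by eye.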
The problem with using these schemes to
transmit information through a
censored Web proxy, is that once
this method becomes widely known,
the censors can simply change
random color bits in each downloaded
image. (This would
probably not be feasible at the level of the Chinese firewall, because the
censoring software would have to reassemble the packets representing each image,
then obtain an internal representation
of the image in terms
of its pixels and colors, change the pixels, convert the image back
into raw bytes, and send them out again. But it would be feasible for a
censoring proxy such as the SmartFilter
proxies used in Saudi
Arabia.) The censors wouldn't change enough pixels to annoy normal users,
but enough to defeat any encoding
scheme that transmitted
information using the least-significant
color bits. (Many users
probably don't care much about crisp image quality unless they're
downloading pornography, which most censorious regimes are blocking
anyway.)
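The proposed countermeasure can be sketched as follows, again on a flat sequence of color values rather than a real image format. The function names, the 25% flip rate, and the fixed random seed are assumptions chosen for illustration; the scheme attacked is the plain least-significant-bit encoding with no error correction.

```python
import random

def write_lsbs(carrier, bits):
    """Plain LSB embedding: one hidden bit per color value."""
    out = bytearray(carrier)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit
    return bytes(out)

def read_lsbs(carrier, count):
    return [carrier[i] & 1 for i in range(count)]

def censor_noise(carrier, rate, rng):
    """Flip the lowest bit of roughly a fraction `rate` of the color values."""
    return bytes(b ^ 1 if rng.random() < rate else b for b in carrier)

rng = random.Random(2002)
image = bytes(rng.randrange(256) for _ in range(4000))
secret_bits = [rng.randrange(2) for _ in range(4000)]

stego = write_lsbs(image, secret_bits)
noisy = censor_noise(stego, rate=0.25, rng=rng)

received = read_lsbs(noisy, len(secret_bits))
errors = sum(a != b for a, b in zip(secret_bits, received))
largest_change = max(abs(a - b) for a, b in zip(stego, noisy))
print("corrupted hidden bits:", errors, "of", len(secret_bits))
print("largest change to any color value:", largest_change)
```

The hidden message is corrupted throughout, while no color value moves by more than one step. This is the attack in its most favorable form for the censor, since it assumes an encoding with no redundancy.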
There are a number of reasons that "simply" changing random color bits in each image would not be a feasible answer to the steganography problem:
1) TCP/IP breaks. Each packet sent on the Internet has a checksum associated with it which validates whether the data received is actually the data sent. Once this "noise" is introduced, the checksum will no longer match the data, and the TCP connection will eventually fail. The censors could potentially get around this problem by recomputing the checksum of each modified packet, but the problem is more general than that. Some programs perform application-level checksums on images, which would break as a result of modifying the image in this way.
2) This severely invokes the first law of anti-censorship. Any business that relied on a certain photo quality would have a huge problem. Hospitals and other institutions require images to be of a certain resolution and quality. Would the people in government want to bet their lives on the hope that flipping some bits in an image will not lead to an accidental death? How would you decide the threshold of the noise introduced in each image? Legitimate uses of steganography might also encounter problems. Hospitals may encode patient information inside of each image, and introducing this noise could potentially corrupt that data. This may sound like a contradiction with the previous point, but in this case the hospitals can choose the amount of "noise" they introduce into each picture, not an outside force that has no awareness of the particular application being used.
3) Comparing two images for equality becomes non-trivial.
4) The firewall would need a different technique for every type of steganography out there: there are systems for MP3s, document files, IP packets, and many more forms of media.
5) Most decent steganography systems nowadays are resistant to noise introduced into the media, up to and beyond the point of human perception. This means the amount of noise needed to effectively wipe out a stego-message alters the original data so much that the modifications are easily noticeable. Defeating steganography systems is hard and not likely the route any censoring party would take to detect dissidents.
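The checksum argument in point 1 can be illustrated with a short Python sketch. It computes the 16-bit ones' complement Internet checksum over a payload only, which is a simplification: a real TCP checksum also covers the TCP header and a pseudo-header. The use of SHA-256 as the application-level check is an assumption for illustration, since the text does not say which checks applications use.

```python
import hashlib

def internet_checksum(data):
    """16-bit ones' complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

payload = bytes(range(32))  # stands in for part of an image
digest_from_sender = hashlib.sha256(payload).hexdigest()
checksum_from_sender = internet_checksum(payload)

tampered = bytearray(payload)
tampered[5] ^= 1  # the censor flips one low-order bit
tampered = bytes(tampered)

# A middlebox that rewrites the data can also recompute the transport checksum.
checksum_after_rewrite = internet_checksum(tampered)

print("transport checksum changed:", checksum_after_rewrite != checksum_from_sender)
print("application digest still valid:",
      hashlib.sha256(tampered).hexdigest() == digest_from_sender)
```

The transport-level mismatch is something the censor can repair by recomputing the checksum, but a digest that the sender published or embedded separately no longer matches the altered data, and the censor has no general way to fix that.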
The snipped part of the paper is a separate discussion and deals with very specific issues regarding
current anti-censorship software. For the sake of brevity the response
to the second half of the original paper will not be included here. The
intent is to stay focused on the broader issues before getting bogged
down in the technical details of specific applications. Since most of
the issues in the second section rely on the assumptions and arguments
made in the first section, many readers will be able to apply the laws
of anti-censorship to reason through the problems with the arguments
made in the second section of the original paper. If you are interested
in a response to the second section, please email your request to paul at paulbaranowski.org.
Conclusion
This paper hopes to debunk many of the common myths and assumptions surrounding anti-censorship
technology. We have seen that anti-censorship does not deal with purely
technical issues; rather it is tied very directly to other factors such
as economic, social, and political elements. All of these factors must
be considered when developing a global strategy.
We have seen that the fears of
arrest in China are overblown and that no one in China has been
arrested on the lone charge of downloading banned information.
Simply blocking the use of a particular technology is so much easier
than tracking down the users of the technology that detection is not
performed. The paralyzing fear that grips us when we think that an
application we might develop could get a user thrown in jail has been
exaggerated by the hype and the media. That fear comes from not
knowing the hard facts behind these cases. Now that we have seen that
no one has been arrested for using circumvention software alone, we see
that that fear is unrealistic and overblown. Any trepidation we might
have can be regulated mainly through user education.
The assumption made in many
encryption books – that the enemy has infinite resources – does not
apply in the real world. The enemy does not have infinite resources.
This assumption, which seemed to be lying below the surface in the LOPWISTCIC
paper, is only meant to be made with regard to encryption and
steganographic algorithms. It does not take into account larger
strategies. The censorship war will be won by the side which is willing
to invest more resources.
The popular awareness of censorship is a major battlefront in this war. Credibility is established in part
by the number of people that support a particular cause, and the number
of people supporting this cause increases with the knowledge of the
problem. The greater the popular support, the more likely
financial resources will be devoted to the problem.
This paper introduced the laws of anti-censorship:
1) The difficulty of blocking an anti-censorship technology is proportional to its ties to commerce.
2) A circumvention technology will be blocked up to the limit of resources the enemy is willing to invest in blocking the technology.
3) The amount of effort an enemy will put into defeating a system is proportional to the number of users of that system.
The second and third laws of
anti-censorship imply that the best way to win the war is to have as
many different systems as possible, thus maximally dividing the
resources of the opponent. A heterogeneous approach would be the most
resistant to attack, very similar to having different strains of crop
to avoid a particular disease destroying all of your food supply.
Heterogeneous in this case means using a different technology for each
application so that each one would require a different countermeasure.
For example, this approach could use different transport mechanisms in
each application such as AIM, IRC, HTTP, SSH, and email; and use a
different encryption or steganographic system for each one. Whereas in
most other applications interoperability is viewed as a good thing, in
the world of anti-censorship, software diversity is a good thing. A
multi-pronged attack forces the opponent to fight defensively on many
fronts.
As Edmund Burke said, "All that’s
necessary for the forces of evil to win in the world is for enough good
men to do nothing." If we are opposed to oppression, we must act. We
cannot sit back in fear and hope that everything will work out.
We can win this war, but only if we make the commitment necessary to
make it happen. There is still time to do something, but our window of
opportunity shrinks every day. We must act now.