Improve Your Anonymity by Modifying Your HTTP Headers

June 6, 2008 – 2:33 pm

Using TOR or proxies just isn’t enough, because a peek at your HTTP headers will partially reveal your identity.
Between the HTTP headers, the connection itself, and scripts running in the browser, a website can see these details about every visitor:

- IP
- Remote Port
- Host
- Browser (User Agent)
- Accepted Language
- Cookies Enabled/Disabled
- Javascript Enabled/Disabled
- Screen Resolution
- Operating System
- Java Enabled/Disabled
- Anti-Aliased Fonts Enabled/Disabled
- Color Depth
- Number of Colors
- Pages in Browser History
- Locale

For this post we will be focusing on three of them: the browser (User-Agent), the accepted language, and the locale.

Browser - User Agent

Every browser has its own User-Agent. The User-Agent tells websites which browser you are using.
Some websites render pages according to the client browser (to provide cross-browser compatibility).
Another important detail that the User-Agent provides is the language of the browser. The Firefox 3.0 (English version) User-Agent looks like this:
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9) Gecko/2008052906 Firefox/3.0

If you were to install the Spanish version of Firefox, the User-Agent would look like this:
Mozilla/5.0 (Windows; U; Windows NT 5.1; es-ES; rv:1.9) Gecko/2008052906 Firefox/3.0

Accepted Language

The Accept-Language header reflects your default input language, which is set in the Windows regional settings:
Control Panel -> Regional and Language Options -> Languages -> Details -> Default Input Language

Again, if this option is set to Spanish, the header that will be sent to the website is “Accept-Language: es-ES”

Locale

This value is also taken from the Windows regional settings:
Control Panel -> Regional and Language Options -> Location
It indicates the country you live in.

All of these headers will reveal your real location (if set correctly).
If you are using a proxy, match the details listed in the HTTP headers to the details of the proxy. Otherwise they will partially reveal your identity and expose the fact that you are using a proxy (even if the proxy is completely anonymous).

Here is an example of an obvious use of a proxy:

IP    80.201.243.108
Remote Port    1981
Host    108.243-201-80.adsl-static.isp.belgacom.be
Browser (User Agent)    Mozilla/5.0 (Windows; U; Windows NT 5.1; pt-BR; rv:1.9) Gecko/2008052906 Firefox/3.0
Accepted Language    pt-BR
Cookies     Enabled
Country    —
Javascript     Enabled
Screen Resolution     1024 x 768
Operating System     Win32
Java     Disabled
Anti-Aliased Fonts     Disabled
Color Depth    32 Bits
Number of Colors    4294967296
Pages in Browser History     16
Locale    pt-BR

Clearly, either the visitor has installed everything in Portuguese, or he uses a proxy to hide his IP but leaves the other regional details intact.
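
A website could automate this kind of check. Here is a minimal Python sketch of the idea, applied to the example above. The function names are made up for illustration, and guessing the country from the hostname's top-level domain is a simplifying assumption (a real check would more likely use an IP geolocation database).

```python
import re

def ua_locale(user_agent):
    """Return the locale token (e.g. 'pt-BR') from a Firefox-style User-Agent, or None."""
    match = re.search(r";\s*([a-z]{2}-[A-Z]{2})\s*;", user_agent)
    return match.group(1) if match else None

def region_of(tag):
    """Return the region part of a language tag ('pt-BR' -> 'BR'), or None."""
    if not tag:
        return None
    first = tag.split(",")[0].split(";")[0].strip()  # first entry, without any q-value
    parts = first.split("-")
    return parts[1].upper() if len(parts) == 2 else None

def host_country(hostname):
    """Assumption: a two-letter top-level domain is the country of the IP's owner."""
    tld = hostname.rsplit(".", 1)[-1]
    return tld.upper() if len(tld) == 2 and tld.isalpha() else None

def looks_like_proxy(hostname, user_agent, accept_language):
    """True when the regional details point to a different country than the host."""
    country = host_country(hostname)
    if country is None:
        return False  # a generic domain such as .com tells us nothing
    regions = {region_of(ua_locale(user_agent)), region_of(accept_language)}
    regions.discard(None)
    return bool(regions) and country not in regions
```

Called with the Belgian hostname and the pt-BR User-Agent and Accepted Language from the example, looks_like_proxy returns True. It is only a heuristic: people do travel and do install browsers in other languages.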

If you are using Firefox, use the “Modify Headers” or the “Tamper Data” extensions to edit/hide your HTTP Headers.
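
The same idea applies if you fetch pages from a script instead of a browser. The following is a rough Python sketch, not what the extensions above do internally: it builds a request whose User-Agent and Accept-Language both carry the locale of the proxy's country. The helper name build_request and the example URL are made up, and the User-Agent template is simply the Firefox 3.0 string shown earlier.

```python
from urllib.request import Request

# Firefox 3.0 User-Agent from this post, with the locale left as a placeholder.
UA_TEMPLATE = ("Mozilla/5.0 (Windows; U; Windows NT 5.1; %s; rv:1.9) "
               "Gecko/2008052906 Firefox/3.0")

def build_request(url, locale):
    """Build (but do not send) a request whose headers agree on one locale."""
    headers = {
        "User-Agent": UA_TEMPLATE % locale,
        "Accept-Language": locale,
    }
    return Request(url, headers=headers)

# Example: going out through a Belgian proxy, so claim a Belgian locale.
request = build_request("http://www.example.com/", "nl-BE")
```

The request is only constructed here. Sending it through a proxy is a separate step.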

Retrieving Meta Data from Documents and Pictures Online

June 5, 2008 – 10:10 pm

Almost every file carries some MetaData along with it.
For example, Microsoft has a technology called OLE (Object Linking and Embedding). This technology, among other things, holds many details (MetaData) regarding the identity of the author.
Details such as:
* Your name
* Your initials
* Your company or organization name
* The name of your computer
* The name of the network server or hard disk where you saved the document
* Other file properties and summary information
* Non-visible portions of embedded OLE objects
* The names of previous document authors
* Document revisions
* Document versions
* Template information
* Hidden text or cells
* Personalized views
* Comments

If you wish to remove these details from your Microsoft documents, I suggest reading this article.

These details can be retrieved from online documents to reveal the identity of a document’s author.
Like OLE, other file types have their own standards that also hold information about the computer and the author of the file.

In order to retrieve that kind of information from files online, I recommend using Serversniff File-Info.
It has the ability to retrieve information from over 100 file types.

- Images: .gif, .jpg, .raw-formats, .png, .tiff and many many others
- Video: .avi, .mpg, .mov and many others
- Audio: .mp3, .wav, .ogg and many others
- Documents: .doc, .xls, .ppt, .pdf and many others
- Multimedia: .swf and many others
- Archives: .zip, .rar, .gz, .tar and many others

MetaData can also be retrieved from pictures we find online.
For example, the JPEG file format holds these details in its MetaData (which is called EXIF):
* model of the digital camera
* time and date the picture was taken
* distance the camera was focused at
* location information (GPS - Only if the camera has a GPS antenna) where the picture was taken
* small preview image (thumbnail) of the picture
* firmware version, serial numbers, name and version of the image manipulation program.

If you want to view these details of a specific photo, you can either use Serversniff file info, or an application called ExifViewer

To remove MetaData from the photos you have taken, you can use jhead, which can manipulate that MetaData.
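
To give an idea of what such a tool has to do, here is a simplified Python sketch that drops the APP1 segments (where EXIF is stored) from a JPEG. This is not how jhead is implemented. It assumes a well-formed file in which every segment before the image data carries a length field, and the function name strip_exif is made up.

```python
import struct

def strip_exif(jpeg_bytes):
    """Return a copy of the JPEG data without its APP1 (EXIF) segments."""
    if jpeg_bytes[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG file (missing start-of-image marker)")
    out = bytearray(b"\xff\xd8")
    pos = 2
    while pos + 4 <= len(jpeg_bytes):
        if jpeg_bytes[pos] != 0xFF:
            raise ValueError("unexpected byte where a segment marker should be")
        marker = jpeg_bytes[pos + 1]
        if marker == 0xDA:
            # Start of scan: compressed image data follows, copy the rest untouched.
            out += jpeg_bytes[pos:]
            return bytes(out)
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        if marker != 0xE1:  # keep every segment except APP1
            out += jpeg_bytes[pos:pos + 2 + length]
        pos += 2 + length
    out += jpeg_bytes[pos:]
    return bytes(out)
```

For real photos, use a tested tool. The sketch only shows that the MetaData sits in its own segment and can be cut out without touching the picture itself.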

Visit Websites Anonymously

June 2, 2008 – 4:04 pm

Nowadays almost every search engine keeps cached pages.

It looks like this:

[Screenshot: Google Cache]

That means that whenever the search engine crawls a web page, it keeps a copy of it and allows the users of the search engine to visit the page from its database, rather than from the original website.

Viewing cached versions of pages will keep you anonymous: the webmaster will not be able to see that someone loaded a page from his website, because you are actually viewing a copy that is hosted on the search engine’s server.

But be careful, Google and other search engines don’t keep cached versions of Pictures, Javascript or Flash. So if you view a web page that has one of these, it will attempt to load it from the original server. This will reveal your information for the webmaster to see.
Also, every link beyond the actual cached page you visited is addressed to the original website.

If you want to visit a website anonymously using the cached version of the page, remember to disable automatic picture loading, Flash and Javascript. You can do it manually in your browser settings.
Firefox has a special extension for doing just that, named Passive Cache. Use it.

Another website you can use for viewing earlier versions of a website is the Internet Archive, which also doesn’t cache pictures, Javascript and Flash, but offers multiple cached versions.

WANTED: Content Contributors

May 30, 2008 – 1:59 pm

We are welcoming contributors to help us improve anonwatch.com

If you wish to help, please contact us at anonwatch \\@\\ anonwatch.com

TrackMeNot - A Firefox Extension

May 29, 2008 – 8:56 pm

I recently came across TrackMeNot, a Firefox extension that helps users stay anonymous.

TrackMeNot, or TMN, helps you conceal your web searches by performing false searches in the background, so that no one will be able to pick out your actual web searches.

Usually these anti-surveillance applications use a static dictionary of about a thousand words and send a query to a search engine at fixed intervals.
They are easy to spot, because a search engine can mark the specific words in the dictionary and use the fixed timing as another identifier of an ‘anti-surveillance’ application.

TMN is smarter than all other related applications I have seen.
It uses a dynamic dictionary, so every time TMN performs a search it collects all the words from the result page and uses them as a new dictionary for the next search query.
That way, even if someone tries to mark all the words in your dictionary in order to ignore them and reveal your actual searches, he won’t be able to as long as you use TMN.

Another smart mechanism that TMN uses is random timing. This mechanism prevents search engines from identifying the false queries by randomizing the time delay between each search query it sends.
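
The two mechanisms can be sketched in a few lines of Python. This is not TMN's actual code, only an illustration of the idea as described above. The class name DecoySearcher, the delay bounds and the rule that a "word" is four or more letters are all assumptions of mine.

```python
import random
import re

class DecoySearcher:
    """Toy model of a decoy-query generator with a dynamic dictionary."""

    def __init__(self, seed_words, rng=None, min_delay=5.0, max_delay=60.0):
        self.words = list(seed_words)  # must not be empty
        self.rng = rng if rng is not None else random.Random()
        self.min_delay = min_delay
        self.max_delay = max_delay

    def next_query(self, max_terms=3):
        """Pick a few random words from the current dictionary."""
        count = self.rng.randint(1, min(max_terms, len(self.words)))
        return " ".join(self.rng.sample(self.words, count))

    def learn_from(self, result_text):
        """Replace the dictionary with the words found on the result page."""
        found = sorted(set(re.findall(r"[a-z]{4,}", result_text.lower())))
        if found:
            self.words = found

    def next_delay(self):
        """Random pause (in seconds) before the next decoy query."""
        return self.rng.uniform(self.min_delay, self.max_delay)
```

Each round would call next_query, send the query, feed the result page to learn_from, and then sleep for next_delay seconds. Sending the query is left out on purpose.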

You can read more about it here

Use it. It’s simple and clever.

Everything there is to know about Proxy Servers

May 26, 2008 – 5:47 pm

Proxy servers are divided into three kinds:

HTTP Proxy:

These proxy servers are the most common ones.
There are four kinds of HTTP Proxy servers:

Transparent - The proxy server transfers the visitor’s request to the target website, but reveals the true identity of the visitor by adding an HTTP header to the web request, called X_FORWARDED_FOR or VIA_PROXY, which contains the IP address of the original visitor.
Transparent proxy servers are usually used by companies that want to allow their employees to access the Internet but also want to cut down bandwidth by caching pages inside the proxy server, so users reload previously visited pages from the proxy server rather than from the website itself. Clearly, the company does not wish to provide its employees with anonymous surfing, which is why the proxy server reveals the identity of the original visitor.

Anonymous - This proxy server also adds the X_FORWARDED_FOR or VIA_PROXY field to the HTTP headers but leaves it empty, so the identity of the original visitor stays anonymous, but the fact that the visitor used a proxy server is revealed.

High Anonymity - This proxy server acts as the original visitor and adds no fields to the HTTP Headers.

Elite - This kind is exactly like “High Anonymity”, except that it works over SSL and provides encryption between the visitor and the proxy (but not between the proxy and the target website).
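
From the website's side, telling the first three kinds apart comes down to looking at the request headers. Here is a small Python sketch of that classification, following the descriptions above. The function name is made up, and treating "-" and "_" in header names as equivalent is my assumption, since servers and scripts spell these names differently.

```python
def classify_proxy(headers):
    """Classify a request by the proxy fields it carries, as described above."""
    # Compare names case-insensitively and treat '_' and '-' as the same character.
    normalized = {name.lower().replace("_", "-"): (value or "").strip()
                  for name, value in headers.items()}
    proxy_fields = ("x-forwarded-for", "via-proxy")
    present = [name for name in proxy_fields if name in normalized]
    if not present:
        # No proxy field at all: a high anonymity proxy, or no proxy.
        return "high anonymity or no proxy"
    if any(normalized[name] for name in present):
        # The field carries a value (the visitor's IP address).
        return "transparent"
    # The field is there but empty.
    return "anonymous"
```

Note that a high anonymity proxy and a direct visitor look the same to this check, which is exactly the point of that kind of proxy.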

There are millions of HTTP Proxy servers, mainly because they are commonly installed on zombie computers that are part of a Botnet. You can find them very easily in hundreds of forums.

Common ports of HTTP Proxy servers are: 80, 8080, 3128.

CGI Proxy Script (CGI/Perl):

These proxy servers transfer the visitor request to the target website using a CGI Script.
It looks like this:

[Screenshots: Web Script Proxy, CGI Proxy Script]

Once we fill in the website we want to visit into the form, our request will be tunneled through the script to that website.
The URL will look like this:
http://www.bpcd.net/cgi-bin/nph-proxy.cgi/010110A/http/www.google.com/
As you can see, the first part of the URL (up to nph-proxy.cgi) is the CGI proxy script, and what follows is the website we wanted to visit.
The script will also replace every link in the visited website, so that every link we follow inside the website will also be transmitted through the proxy script.
Another plus is that the CGI script also provides cookie management and therefore enables us to visit websites that require authentication.

TIP: You can chain CGI proxy scripts like this to gain even more anonymity.
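
The URL format is regular enough to build by hand. Here is a short Python sketch that produces URLs shaped like the example above, including a chained one. The function name is made up, and the 010110A segment is copied from the example URL and treated as an opaque value, since its meaning depends on the script's settings.

```python
from urllib.parse import urlsplit

def cgi_proxy_url(script_url, target_url, flags="010110A"):
    """Build script/flags/scheme/host/path, the shape of the example URL above."""
    parts = urlsplit(target_url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return "%s/%s/%s/%s%s" % (script_url.rstrip("/"), flags,
                              parts.scheme, parts.netloc, path)

SCRIPT = "http://www.bpcd.net/cgi-bin/nph-proxy.cgi"

single = cgi_proxy_url(SCRIPT, "http://www.google.com/")

# Chaining: the target of the outer script is itself a proxied URL.
# The second script address is a placeholder, not a real proxy.
chained = cgi_proxy_url("http://proxy.example.org/cgi-bin/nph-proxy.cgi", single)
```

Here single comes out identical to the example URL above, and chained simply nests that URL inside a second script.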

Even though using a CGI Proxy script does not reveal our original identity, it does have one weakness.
The weakness is that we cannot send a Referer (notice the “Hide referer information” option in the screenshot), because doing so would reveal the fact that we are using a CGI proxy.

CGI proxy scripts are very easy to find. You can use the following query on Google:
(filetype:cgi OR filetype:pl) AND (intitle:"start using cgiproxy" OR intitle:"start using cgi proxy")

CGI Proxy scripts are very stable due to the fact that they are hosted on web servers (rather than on zombie computers, like HTTP proxy servers), but there are only a few of them. Google finds about a hundred.
CGI Proxy scripts are commonly used by people who want to bypass web content filters (at universities, at work, etc.). These filters work by blocking specific domains or IP addresses. By using a CGI proxy script we can bypass the content filter, because we access the forbidden domain/IP through the CGI script, which is hosted on a non-blocked domain/IP.

SOCKS Proxy Servers:

A SOCKS Proxy is very similar to an HTTP Proxy. The main difference is that SOCKS transfers data between the client and the server without interpreting it, while an HTTP Proxy does interpret the data.
The other difference is that SOCKS can transfer any kind of data between a client and a server, while an HTTP Proxy can only transfer HTTP data.
The most common port of a SOCKS Proxy is 1080.
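
To show how little a SOCKS proxy needs to know, here is a Python sketch of the first bytes a client sends. It assumes SOCKS version 5 (this post does not single out a version) with no authentication, and the function names are made up. After these bytes the proxy just relays whatever follows, HTTP or not.

```python
import struct

def socks5_greeting():
    """Version 5, one authentication method offered: 0x00 (no authentication)."""
    return bytes([0x05, 0x01, 0x00])

def socks5_connect_request(host, port):
    """Ask the proxy to open a TCP connection to host:port, sent as a domain name."""
    name = host.encode("ascii")
    if not 0 < len(name) <= 255:
        raise ValueError("host name must be 1 to 255 bytes long")
    header = bytes([
        0x05,       # SOCKS version
        0x01,       # command: CONNECT
        0x00,       # reserved
        0x03,       # address type: domain name
        len(name),  # length of the domain name
    ])
    return header + name + struct.pack(">H", port)  # port in network byte order
```

Nothing in the request says what kind of traffic will follow. That is the difference from an HTTP Proxy, which has to read and understand the request.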

Internet Explorer Cache - A Short Review

May 24, 2008 – 11:02 am

When we visit websites using Internet Explorer, every page we see, every picture, every media file and every script we load is saved in a particular directory, so that we can load it directly from our computer instead of reloading it from the website.
This cache can be found in the following path:
C:\Documents and Settings\<username>\Local Settings\Temporary Internet Files\Content.IE5
The entire cache is managed by the file Index.dat, which helps Internet Explorer find these cached pages faster.

The purpose of the cache is to maximize speed by loading the previously visited pages from the cache rather than reloading them online using the website.

Index.dat is usually located at the following path:
C:\Documents and Settings\<username>\Local Settings\Temporary Internet Files\Content.IE5\index.dat

You can view it very easily using Super Winspy (Free).

It looks something like this:
[Screenshot: Winspy Example]

In order to delete the information found in Index.dat, you simply need to go to:
Tools -> Internet Options -> General -> Browsing History -> Delete -> Temporary Internet Files

TOR in depth

May 20, 2008 – 8:57 pm

Background

TOR (The Onion Router) is a network of Nodes, which can be used to tunnel your connection anonymously.
It’s free, and it’s an open-source project, so everyone can use it.

The TOR network consists of two main groups, TOR servers and TOR users.
Any TOR user can be a TOR server, so no one can ‘own’ all the servers.

TOR creates chains of 3 nodes each and changes those chains every few minutes. The chains are created randomly from a Directory of nodes managed by a few main TOR Nodes.

TOR opens a connection between the client and the first node (usually referred to as an EntryNode), then it sends a signal to open a connection to the second node, and then to the third node (usually referred to as an ExitNode).

So when you surf to a website, you are doing so through three different nodes that are located around the world.

You -> Node1 -> Node2 -> Node3 -> www.google.com

TOR relies on Telescoping encryption, which means every node can only see the information it needs to see, rather than having the ability to decrypt the entire data transmitted between the nodes.
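
The layering can be demonstrated with a toy example in Python. This is not TOR's real cryptography: the cipher below is a made-up, insecure XOR stream used only to show the principle, and the three keys stand in for the keys the client has agreed on with each node.

```python
import hashlib

def keystream_xor(key, data):
    """Toy cipher: XOR the data with a SHA-256 based keystream. Not secure."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def wrap(message, keys):
    """Client side: add one layer per node, the first node's layer outermost."""
    for key in reversed(keys):
        message = keystream_xor(key, message)
    return message

def peel(message, key):
    """Node side: remove only the layer that belongs to this node's key."""
    return keystream_xor(key, message)

keys = [b"key-node1", b"key-node2", b"key-node3"]
cell = wrap(b"GET / HTTP/1.0", keys)

after_node1 = peel(cell, keys[0])         # still unreadable
after_node2 = peel(after_node1, keys[1])  # still unreadable
after_node3 = peel(after_node2, keys[2])  # the original request
```

Only after the third node has removed its layer does the request become readable, which is why the first two nodes cannot see what was asked for.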

Security and Privacy Simulations

Node1
So the owner of the first node can see who the actual person that wants to surf anonymously is, but cannot see the data because it’s encrypted - so Node1 cannot see what I asked for, but can see who I am.

Node2
The owner of the second node can’t really see anything. He doesn’t know who asked for the data, because all of the requests are tunneled through the first node (which gives no information about the original person that requested the information), and he can’t see the data either, because it’s encrypted.

Node3
The owner of the third node doesn’t know who asked for the information because it’s tunneled through the second node, but he can see the data we have requested because the third node can decrypt it.
The third node has the key to decrypt the data because it has to communicate with the website we requested (which generally doesn’t use any type of encryption).

Let’s take this a step further.
Node1 + Node2
If I am the owner of the first two nodes, it doesn’t do me any good, because again, I can see the person that wants to be anonymous, but I can’t see any of the data because neither Node1 nor Node2 can decrypt it.

Node2 + Node3
Owning Node2 and Node3 isn’t ideal either. I will be able to see the data, but owning Node2 won’t help me with anything.

Node1 + Node3
Owning Node1 and Node3 can be more helpful if I am an attacker who wishes to see the identity of a TOR user.
Node1 knows the identity of the person that requests the information, Node3 knows how to decrypt the data.
Getting the whole picture is still very hard.
Let’s say ‘John’ is connected as follows:
Node1 -> Node5 -> Node3
Now, John wants to surf anonymously to Google. We see that Node1 gets a request from John, but we don’t know what John wants. Then we see that someone (we can’t tell for sure that it’s John) requested Google’s homepage through Node3. Connecting the dots and saying that John is the person that requested Google’s homepage is almost impossible, considering the fact that there are currently hundreds of thousands of users. The attacker can’t know for sure that the John who surfed through Node1 is really the same person that requested a certain page through Node3.
The only way to know both the user’s identity and the requested data is to use a technique that relies on timing: calculate the average time it takes for someone to connect through Node1 to an unknown second node and then to Node3, and assume that whoever fits that average time is the same user on both nodes.
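
In code, such a timing match could look roughly like the Python sketch below. It is a simplification under stated assumptions: the attacker has timestamped logs from both nodes, the clocks agree, and the expected delay and tolerance values are invented for the example. The function name is made up as well.

```python
def correlate(entry_events, exit_events, expected_delay, tolerance):
    """Pair users seen at Node1 with requests seen at Node3 by timing alone.

    entry_events: list of (user, time_seen_at_node1)
    exit_events:  list of (request, time_seen_at_node3)
    """
    matches = []
    for user, t_in in entry_events:
        for request, t_out in exit_events:
            # A match is an exit request that appears about expected_delay
            # seconds after the user's traffic entered the first node.
            if abs((t_out - t_in) - expected_delay) <= tolerance:
                matches.append((user, request))
    return matches

entry_log = [("john", 10.0), ("alice", 12.5)]
exit_log = [("www.google.com", 10.4), ("www.example.org", 12.9), ("other", 20.0)]

guesses = correlate(entry_log, exit_log, expected_delay=0.4, tolerance=0.1)
```

With these made-up logs, john is paired with www.google.com and alice with www.example.org. With hundreds of thousands of real users, many requests fall inside the same time window, so the result is a guess and not a proof.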

Needless to say, if the attacker owns all three nodes, he can see the user’s identity and all of the data he requested. But it is very unlikely that one person will control all three nodes, because the chains are randomly chosen by the TOR client and there is a reasonable number of TOR nodes (approximately 1800).
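
To put a rough number on "very unlikely", here is a back-of-the-envelope calculation in Python. It assumes the nodes of a chain are picked uniformly at random and without repetition from 1800 nodes, which is a simplification. The attacker's node counts are invented for the example.

```python
from fractions import Fraction

def chance_of_owning(positions, owned, total=1800):
    """Chance that `positions` distinct, uniformly chosen nodes are all the attacker's."""
    chance = Fraction(1)
    for i in range(positions):
        chance *= Fraction(owned - i, total - i)
    return chance

# An attacker running 10 of the 1800 nodes:
all_three = chance_of_owning(3, 10)       # owns Node1, Node2 and Node3
entry_and_exit = chance_of_owning(2, 10)  # owns Node1 and Node3 only

print(float(all_three))       # roughly 1 in 8 million
print(float(entry_and_exit))  # roughly 1 in 36,000
```

Since the chain changes every few minutes, the attacker gets many tries, but the odds per chain stay small as long as he owns only a small share of the nodes.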

Now let’s take it a step further. Let’s say someone hacked your computer and uses a packet sniffer to watch every packet that leaves your computer. Thanks to the encryption that TOR uses, the only thing the attacker can see is the identity of the first node: none of the data, and not even the identity of the ExitNode.

TOR also has a solution to a well-known privacy issue that usually doesn’t get any attention from other anonymization applications: DNS servers have information on any user that made a DNS request to them. This means that if I wanted to visit google.com anonymously and used any type of anonymizer, Google wouldn’t be able to learn my identity by checking the logs of the web server, but they would be able to learn it from the logs of their DNS server, because every time you type an address in the navigation bar, a DNS query is sent in the background to Google’s DNS servers (using your non-anonymous identity).
TOR solves this by using an application called Privoxy that allows DNS queries to be tunneled anonymously through the TOR network.

As you can see, eavesdropping on TOR users is quite difficult.

Vulnerabilities

As far as I know, TOR has only two vulnerabilities:

  1. The first technique enables a website owner to learn the true identity of a visitor that uses TOR by using a special Java applet that sends an ICMP packet to the website (TOR does not support ICMP tunneling), which unveils the true identity of the visitor.
  2. The second method requires the attacker to own an ExitNode. By owning the ExitNode, the attacker can inject Javascript code or a Java applet into the requested pages that it transmits back to the original TOR user, which helps him unveil the identity of the TOR user.

Speed Issues

Unlike other applications that offer anonymous surfing, TOR is quite slow, due to the fact that everything is encrypted and transmitted through 3 servers around the world.
While using TOR you should expect an average of ~20-35kbps, depending on the location and bandwidth of the nodes you surf through.

torproject.org offers a package called Vidalia Bundle, that includes Tor + Privoxy + Vidalia (graphical interface for Tor) + Torbutton (An extension for firefox).