Friday, March 26, 2010

HTTP Headers Part I

When analyzing web traffic, it is important to know what you are reading. With botnets and malware waiting almost anywhere a user goes on the internet, knowing how to read HTTP traffic may make the difference between catching an infection and turning your network over to some botnet author with acne, pale skin, and too much RBBAC (Red Bull Blood Alcohol Content).

What I plan to put here now is just some basic information involving HTTP Headers. Time permitting and memory working, I am going to try to go a little deeper and cover other important aspects of this traffic, using Wireshark screenshots. I think I would eventually like to expand this to include other web traffic protocols such as HTTPS, SSL/TLS, DNS, etc.

For anyone who may not know:
HTTP stands for Hypertext Transfer Protocol. According to the RFC, this is an "application-level protocol for distributed, collaborative, hypermedia information systems." It has been around since at least 1990 and its current version is 1.1. HTTP 1.1 is defined rather extensively in RFC 2616 [1], so I have no plan to summarize the entire RFC; I am just hitting what I think are the important or misunderstood parts of the header as related to intrusion analysis.

HTTP communication involves a client and a server. A server in this instance does not have to be an actual Web/Domain/Mail server, but is any system that will respond, typically on ports 80 or 8080, to HTTP requests. Likewise, the client is any system that can send HTTP requests and handle HTTP responses.


HEADERS:
There are four basic types of HTTP Headers: general-header, request-header, response-header, and entity-header.

Fields:
Connection - This field specifies options for a particular connection that MUST NOT be passed further by proxies (these are hop-by-hop options). This field should NOT include end-to-end fields (Cache-Control is a great example given in the RFC). End-to-end headers are those headers that are necessary for the client/server communication, or that describe client/server communication and would be useless if they stopped anywhere in between. The reason that Cache-Control is a great example is that this field tells the client handler how long to keep a page cached, if at all, in addition to a few other optional parameters. The client would have to communicate all the way back to the server for a refresh. In the same sense, the server sends the Cache-Control header all the way down to the client, so having this data parsed and dropped by proxies would make the Cache-Control field useless. As another note on this field, and one that is covered by the RFC, the "close" option for this header is important in that it signals that the connection will be closed after the response completes.
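To make the hop-by-hop idea a little more concrete, here is a minimal Python sketch of what a proxy might do before forwarding headers. The function names (strip_hop_by_hop, wants_close) and the list-of-tuples header format are my own assumptions for illustration; the set of hop-by-hop header names is the one listed in RFC 2616, section 13.5.1.

```python
# Hop-by-hop headers named in RFC 2616, section 13.5.1.
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailers", "transfer-encoding", "upgrade",
}


def connection_tokens(headers):
    """Collect the lowercase tokens from every Connection header.

    `headers` is assumed to be a list of (name, value) tuples.
    """
    tokens = set()
    for name, value in headers:
        if name.lower() == "connection":
            tokens.update(t.strip().lower() for t in value.split(",") if t.strip())
    return tokens


def strip_hop_by_hop(headers):
    """Return only the headers a proxy would pass along (hypothetical helper)."""
    # Any header named as a token in Connection is hop-by-hop as well.
    drop = HOP_BY_HOP | connection_tokens(headers)
    return [(n, v) for n, v in headers if n.lower() not in drop]


def wants_close(headers):
    """True when the sender signals the connection closes after the response."""
    return "close" in connection_tokens(headers)
```

In this sketch Cache-Control survives the proxy because it is end-to-end, while Connection itself (and anything it names) is dropped.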

Content-Encoding - This represents what type of coding (compression) is used on the data being transferred. The most common token I have seen is the gzip token, which indicates data compressed with GZIP. RFC 2616 itself defines four values for this field: gzip, compress, deflate, and identity. More can be used, even private schemes unknown to anyone but a bad actor, as this field's tokens are only encouraged ("SHOULD") to be registered with IANA.
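As a rough illustration, this is how an analyst might undo the coding on a captured body using only the Python standard library. The helper name decode_body is hypothetical, and the sketch assumes you have already carved the raw body bytes out of the stream. The "compress" coding is not handled here, so it and any unknown or private token raise an error, which is itself worth a second look.

```python
import gzip
import zlib


def decode_body(body, content_encoding):
    """Undo a single Content-Encoding on raw body bytes (hypothetical helper)."""
    coding = (content_encoding or "identity").strip().lower()
    if coding in ("gzip", "x-gzip"):
        return gzip.decompress(body)
    if coding == "deflate":
        # RFC 2616 describes "deflate" as zlib-wrapped data, but some servers
        # send a raw deflate stream, so fall back to that on failure.
        try:
            return zlib.decompress(body)
        except zlib.error:
            return zlib.decompress(body, -zlib.MAX_WBITS)
    if coding == "identity":
        return body
    # Unhandled or unregistered token: surface it instead of guessing.
    raise ValueError(f"unhandled Content-Encoding: {coding!r}")
```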

Content-Type - This is one of my favorite header fields. This field, normally, will tell you what type of data to expect in that portion of the traffic. For example: image/jpeg would indicate a JPEG (picture) file is in the same stream of traffic. This also means that the start of the data should contain one of the JPEG file headers, JFIF for example, and not a file header from a different type, such as MZ for a Windows executable.
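The check described above can be sketched in a few lines of Python. The MAGIC table below is a small, deliberately incomplete set of well-known file signatures, and the function names are my own for illustration; a real analysis tool would carry a much larger table.

```python
# A few well-known file signatures, keyed by media type (not exhaustive).
MAGIC = {
    "image/jpeg": [b"\xff\xd8\xff"],
    "image/png": [b"\x89PNG\r\n\x1a\n"],
    "image/gif": [b"GIF87a", b"GIF89a"],
    "application/pdf": [b"%PDF-"],
}


def content_type_mismatch(content_type, body):
    """Compare the declared Content-Type against the first bytes of the body.

    Returns True on a mismatch, False on a match, and None when the declared
    type is not in the MAGIC table (nothing to compare against).
    """
    media_type = content_type.split(";", 1)[0].strip().lower()
    signatures = MAGIC.get(media_type)
    if signatures is None:
        return None
    return not any(body.startswith(sig) for sig in signatures)


def looks_like_windows_executable(body):
    """True when the body starts with the MZ header of a Windows executable."""
    return body.startswith(b"MZ")
```

A body that claims image/jpeg but starts with MZ would come back as a mismatch and as a likely executable, which is exactly the kind of traffic worth pulling aside.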

Content-Location - Another interesting field. This typically states where the requested resource is located. For example, if a client is requesting a pdf file from someserver.com, this field may contain the absolute or relative URI to the resource; in this case it could be: http://someserver.com/someFile/BadGuy/bad.pdf. The interesting thing here is that I have seen exploits that will drop a temporary pdf file on the client machine and then use subsequent traffic to call that file...and the Content-Location will have the location as a folder on the client machine (c:\temp\bad.pdf). Additionally, this can also be used for re-directs, which I have seen a LOT with fake anti-virus issues. A connection between your box and IPA 1 may exist, a script may run or a button may be clicked that initiates a GET request for the malware, and the Content-Location field will have something other than the server that was initially connected to.
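Both of the oddities above (a local client path, or a host other than the one you connected to) are easy to flag mechanically. This is only a sketch: the function name, the flag strings, and the idea of passing in the request's host separately are all assumptions of mine, and it ignores ports in the request host.

```python
import re
from urllib.parse import urlsplit

# Matches a Windows drive path such as c:\temp\bad.pdf or C:/temp/bad.pdf.
_WINDOWS_PATH = re.compile(r"^[A-Za-z]:[\\/]")


def content_location_flags(request_host, content_location):
    """Return a list of reasons a Content-Location value looks suspicious."""
    value = content_location.strip()
    # A path on the client's own disk has no business in this header.
    if _WINDOWS_PATH.match(value) or value.lower().startswith("file:"):
        return ["local-path"]
    flags = []
    # Relative URIs have no hostname, so they are never flagged here.
    host = urlsplit(value).hostname
    if host is not None and host.lower() != request_host.lower():
        flags.append("different-host")
    return flags
```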

Referer - Yes...it's spelled incorrectly. Furthermore, it can be programmatically set to a bad URI. This field is used in a request-header to "document" the location of where the Request-URI came from. Basically, it's "give me this object" (Request-URI) that I found the address to at "this site." Clicking on a link on the MLB home page that leads to the main Detroit Tigers page would contain the Request-URI of "www.detroittigers.com" and a Referer of "www.mlb.com." This field is used to create track-back links. The security concern here that every analyst should be aware of is: this field is not always accurate and could have been programmatically changed. Because this field can be used to create a list for optimized caching, it can be programmatically changed in order to have the URI "refreshed" from a bad actor.

Accept-Language - This indicates the language that the client would like the requested resources to be formatted in. This is an Internationalization (I18N) compatibility issue, but it does produce something interesting for an analyst. If I see that a requester's Accept-Language token is set to "en-ca" and I know that Canada always tries to infiltrate my network, I would be more inclined to include this traffic in deeper research, eh. If nothing else, it would allow me to select multiple items for aggregation prior to analysis.
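For the aggregation idea, a small Python sketch might look like the following. It assumes you have already pulled the raw Accept-Language values out of your capture into a list of strings; both function names are made up for this example, and the q-value handling is simplified (a missing q defaults to 1, an unparsable one is treated as 0).

```python
from collections import Counter


def parse_accept_language(value):
    """Return the language tags from one header value, most preferred first."""
    ranked = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        params = params.strip()
        q = 1.0  # a missing q-value means full preference
        if params.lower().startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0
        ranked.append((tag.strip().lower(), q))
    # sort() is stable, so equal q-values keep their original order.
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return [tag for tag, _ in ranked]


def count_primary_languages(header_values):
    """Count how often each tag is a requester's most preferred language."""
    counts = Counter()
    for value in header_values:
        tags = parse_accept_language(value)
        if tags:
            counts[tags[0]] += 1
    return counts
```

Calling count_primary_languages on a day's worth of headers gives a quick tally to sort on before digging into individual streams.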


References:
[1] http://www.w3.org/Protocols/rfc2616/rfc2616.html

Tuesday, March 23, 2010

Call-By-Value/Call-By-Reference

How many times do programmers, from the fledgling novice to the expert, face the question of Call-By-Value/Call-By-Reference? This question comes in the form of someone wanting to better understand, in the form of deciding which to use in a particular program, or in the form of the default method of parameter passing for a particular language.

Now, tonight I really have no desire to compare languages and their particular method(s) of parameter passing. What I plan to do here is extend an answer that I recently passed on to someone else. So what follows is just a 'copy and paste' of an email I sent earlier...so I hope I didn't mistype anything! :-)

--from email--

Call-by-value and call-by-reference can be a little confusing, especially when you consider that they are implemented differently in different languages. One note of caution before going further is that in some languages, passing a pointer is still "technically" a call-by-value parameter until you actually dereference it in the function; likewise, passing the actual address ( &addOfMyVar ) gives you just the address until you dereference it to get at its value. So the caution here is that, when working with pointers, be careful to dereference/reference as needed. Not taking care of this could/would produce undefined/undesired results or errors.


Generally, I can't think of a normal situation where you would explicitly return by reference. To export data from a subprogram/function/etc, you would normally do one of two things:
Assign a variable the value of the return:
- someVar = myFunction(int x, int y, int z);
or, you would pass-by-reference those variables that you need changed in the caller's scope to the submodule:
- myFunction (int x as ref, int y as ref, int z as ref);


That all said:
Given the example below, the full output should be:
3 2 1 // first write of submodule
4 5 6 // second write of submodule
6 2 3 // final write of main, after submodule has returned.

Example:

Main
    Declare X, Y, Z As Integer
    Set X = 1
    Set Y = 2
    Set Z = 3
    Call Display (Z, Y, X)
    Write X + " " + Y + " " + Z
End Program

Subprogram Display ( Integer Num1, Integer Num2, Integer Num3 As Ref)
    Write Num1 + " " + Num2 + " " + Num3
    Set Num1 = 4
    Set Num2 = 5
    Set Num3 = 6
    Write Num1 + " " + Num2 + " " + Num3
End Subprogram

Output Explained:
The reason for this:
- 3 2 1 // line 1 of output
  - You are calling Display(3, 2, 1)
  - The subprogram will first display these values as they are passed.
  - 3 and 2 are both copies, and that is what is being displayed: the local (scoped to the subprogram) copies, Num1 (from 'Z') and Num2 (from 'Y').
  - 1 is being passed as a reference, so Num3 will 'basically' be the same address location (and value, by default) as the variable passed...in this case Num3 address == 'X' address. Changes to Num3 will in actuality be changes to 'X'.

- 4 5 6 // line 2 of output
  - You are still in Display(3, 2, 1)
  - You have already displayed the initial values passed
  - The subprogram changes the local values* of Num1, Num2, and Num3 to 4, 5, 6 respectively.
  - * in this case, Num3 is in actuality 'X', so 'X' is what is being changed, both in the local scope of the subprogram and in the scope of Main.
  - You display the local Num1, Num2, Num3 (see comment above) and exit

- 6 2 3 // final output after subprogram returns
  - At this point, the variables used in the subprogram are now out of scope
  - Main is using the variables X, Y, and Z
  - Because X was passed by reference to the subprogram, where it was changed, the local (actually global in this case) value for X will have been changed. Y and Z were passed as copies, so they still hold 2 and 3.

--end email--
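For anyone who wants to run the example from the email, here is a rough Python translation. One big caveat: Python has no 'As Ref' parameters, so this sketch fakes one with a tiny mutable box class (Ref, which is my own invention for this post and not anything built in). Plain integers passed to a function behave like the copies in the email, while changes made through the box are visible to the caller.

```python
class Ref:
    """A tiny mutable box standing in for a by-reference parameter."""

    def __init__(self, value):
        self.value = value


def display(num1, num2, num3):
    """Mirror of the Display subprogram; num3 is expected to be a Ref."""
    lines = [f"{num1} {num2} {num3.value}"]  # first write of submodule
    num1 = 4         # rebinds the local name only; the caller's Z is untouched
    num2 = 5         # same for the caller's Y
    num3.value = 6   # writes through the box, so the caller's X changes
    lines.append(f"{num1} {num2} {num3.value}")  # second write of submodule
    return lines


def main():
    x, y, z = Ref(1), 2, 3
    lines = display(z, y, x)           # Call Display (Z, Y, X)
    lines.append(f"{x.value} {y} {z}")  # final write of main
    return lines


if __name__ == "__main__":
    print("\n".join(main()))
```

Running it prints the same three lines the email walks through: 3 2 1, then 4 5 6, then 6 2 3.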

Saturday, March 20, 2010

Moving Forward on Reversing (Certification)

Sometimes I surprise myself and find that I really do know a little about some arbitrary topic. In this instance, I proved to myself that I really do know a little about Reverse Engineering Malware. Although this is something I partially (deobfuscating code, code analysis, etc.) do on a regular basis, I questioned my level of overall skill.

So, how did I surprise myself? I was given a practice exam (by someone who had some faith in me) for the GIAC Reverse Engineering Malware certification, and I decided to take it last night while at work. I intentionally took the practice exam cold and with no notes in order to gauge where I was on the topics.

Results:
Unfortunately, I did fail the exam. A 70% is required and I received a ~69%. The areas that I messed up on all involved either specific debuggers or some command line tools. I took this test in a little over an hour.

Why am I happy about this? I haven't had any real time in quite a while to play with debuggers, so getting questions regarding those wrong was expected, and an easy fix! Some time spent with some different tools and some review of some notes...no problem there! The command line tools issue is also an easy fix: I just need to go back to basics on some things and use all the command line tools I can, when I can, so that I stay fresh on their availability and options.

Path Forward on Reversing: The important thing to me about taking this test is that I took it cold and quick, and only failed by one question. What this means is that with a little bit of review and practice on forgotten tools and techniques, I should be able to sit the GREM and pass with a very good score! The GREM, GCIH, GSNA (or maybe CEH), and MCSE are the remaining certs that I want to get. I am not worried about the GCIH, and now it seems like I will do very well on the GREM also, so this should be a productive year for me as far as certifications!

Friday, March 19, 2010

A brand new OGRE

I thought I would mention here that OGRE3D has a new, stable release. OGRE is an open source 3D graphics engine. I have written about OGRE before and I believe it to be an awesome tool that is well supported and has many great third-party enhancements. An interesting point here is that my last posting on here was five days before the most recent OGRE release (Feb 28, 2010).

From www.ogre3d.org:
"We’re very pleased to announce that OGRE 1.7.0 (Cthugha) has now been released, this version is now considered to be our stable release branch."

If you develop ANYTHING for 3D applications and you haven't checked out OGRE (no pun intended), I think you are seriously missing something.

I had a project last year that involved using OGRE to create a driving simulator that incorporated a collision-avoidance algorithm. The project got to about 80% of where I wanted it, I got a great grade on it, and I plan to extend it. The continued support of OGRE and its newest release mean that, after I finish this semester, it should be relatively painless to get back on this project. In short, I am just excited to find that OGRE3D is alive and well, and moving forward!