Tuesday, February 23, 2010

Combining the pcap files quickly

<>
An update to this post, based on the last comment that I received:
I agree that Pcapjoiner and some other tools can do this quickly, as well as add some other functions.

I do like the fact, though, that there are tools bundled with Wireshark that allow for the quick combination of just a few targeted pcaps. Basically, it is a way to get down and dirty with the analysis of one or more connections.
This has given me a GREAT idea for two/three/four more blog posts that I would love to do:
1) building my own interface to do these merges, and other massaging that might be helpful. I am already picturing a ton of ways to go with this...maybe best to keep it simple...but it could be a fun project for myself...adding the ability to run some stats, filters, create some xml/xhtml/html output in addition to output usable by tcpdump, wireshark, and ngrep.
2) to play around with the options of mergecap from the command line and try to add some filters by piping to/from ngrep or tcpdump. I think this should work just fine, and would allow for a larger number of files to be processed easily by mergecap in the dump.
3) a perl script that I can just drag a group of files onto for the merge. Perl's CPAN modules provide some excellent support for network traffic.
4) a perl script to strip out whatever I want from a fully captured session: the webpage, a pic, the VoIP call, etc. This one might be a little harder....but sounds REALLY fun to me.

dw
<>



The other day I finally became fed up with the process of using the Wireshark GUI on Windows to combine more than two PCAP files. I think some folks I know would give me a "Gibbs' Slap" if they knew just how many times I used the GUI to combine 15+ captures. (If you don't know what a "Gibbs' slap" is, you REALLY need to start watching the original NCIS, and NO, not the lousy NCIS: Los Angeles.)
Unless things have changed (and I admit to not trying recently), it is generally easy on *nix to pass a directory of *.pcap files to the mergecap utility that ships with Wireshark, since the shell expands the wildcard, combining ALL the PCAP files into a specified output file. However, and I know this is [NOT] a shock to most people, it is not always as easy to do this same thing on the Windows command line (which I was stuck using for this). Of course, I have 20-30 more years before I am a cmd line ninja, so there may be a very easy way to do this...but I don't know it and my friend Google couldn't find it. This left me with a huge hole in my life as I REALLY wanted a better and FASTER way to do this.

Before beginning this walk down my pcap-crazy mental train track, just a quick recap of how to use mergecap.exe:

Usage: mergecap [options] -w <outfile>|- <infile> ...

So if I just want to merge some pcap files from a desktop folder into a file called merged.pcap:

"C:\Program Files\Wireshark\mergecap" -w merged.pcap "c:\users\UnixUsersAreCooler\Desktop\Some Pcap Files\1.pcap" "c:\users\UnixUsersAreCooler\Desktop\Some Pcap Files\2.pcap" "c:\users\UnixUsersAreCooler\Desktop\Some Pcap Files\3.pcap"

This will combine the 1.pcap, 2.pcap, and 3.pcap files into the newly created merged.pcap. (The quotes are needed around any path that contains a space, including the path to mergecap itself.) However, in case it went unnoticed, that is a LOT of typing to combine three files. Isn't there an easier way?

The Choices:
1) Write a GUI that lets me quickly select multiple files, creates the command line string for mergecap with these files, and executes the command. Great! Except, do I really want to create a GUI to do this?
2) Write a command line program to parse a specified folder for all pcaps, create the command string, then execute.
3) Create a script or bat file to do what I want, when I want.
4) Give up and begin a life of cheap booze and cheaper women.

The answers:
1) Nope. Little bit too lazy to spend the time to create the GUI that will make me spend more time navigating directory structures and selecting n files.
2) Nope. Lazy...see number 1 above.
3) This sounds like the way to go.
4) Might work, but then wife and kids might become irritated with such a choice. Back to number 3.
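
For comparison, here is a minimal sketch of what option 2 could look like, written in Python rather than as a batch file. It assumes Wireshark is installed in the default C:\Program Files\Wireshark folder, and the names build_merge_command and merge_folder are made up for this illustration. It gathers every .pcap in a folder, skips an earlier merged.pcap so a rerun does not feed the output back into itself, and hands the list to mergecap:

```python
import glob
import os
import subprocess

# Assumed default install location; adjust for your machine.
MERGECAP = r"C:\Program Files\Wireshark\mergecap.exe"


def build_merge_command(folder, output="merged.pcap", mergecap=MERGECAP):
    """Return the mergecap argument list for every .pcap file in `folder`."""
    out_path = os.path.join(folder, output)
    inputs = sorted(
        path
        for path in glob.glob(os.path.join(folder, "*.pcap"))
        # Skip the result of an earlier run so it is not merged into itself.
        if os.path.abspath(path) != os.path.abspath(out_path)
    )
    if not inputs:
        raise ValueError("no .pcap files found in " + folder)
    return [mergecap, "-w", out_path] + inputs


def merge_folder(folder, output="merged.pcap"):
    """Run mergecap; passing a list avoids quoting trouble with spaces."""
    subprocess.run(build_merge_command(folder, output), check=True)
```

Calling merge_folder(r"c:\users\UnixUsersAreCooler\Desktop\Some Pcap Files") would then write merged.pcap into that same folder.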

So now that it is decided that I am lazy, and can't chase women or whiskey, it's on to the scripting. There are multiple options here as well, but I kept it simple, dug up some examples, tweaked them for me, and went back to watching Office Space.

The batch file:

combine.bat:

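rem Delayed expansion is what lets !myfiles! grow on each pass of the loop.
setlocal enabledelayedexpansion
set myfiles=
rem Append every .pcap in the current folder (names with spaces are not handled).
for %%f in (*.pcap) do set myfiles=!myfiles! %%f
Cmd /V:on /c "c:\Program Files\Wireshark\mergecap.exe" -w temp.pcap %myfiles%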


What does this mean and where does it go? I created a folder on my desktop for the pcaps I want to merge; the bat file goes here. To run this, I could double-click the file, but I prefer to see it in action. With that in mind, I open up a command prompt in the folder where the bat file is stored, and then execute:
$>combine.bat

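The important thing I want to point out here is the "Cmd /V:on /c ...". What this does:
- CMD /V:on re-calls cmd.exe from the system32 directory with delayed environment variable expansion turned on (/V:on). Note that this applies only to the new cmd.exe that runs mergecap, not to the session that runs the 'for' loop.
- /c means "run the following command and then exit." In this situation, mergecap is not on the PATH, so I need to call it by its full path, passing the remainder of the string as the arguments.
- myfiles is a space-separated list of all .pcap files in the directory where the batch file is run. It is built up one name at a time with !myfiles!, which only works when delayed expansion is enabled for the cmd session running the 'for' loop (for example with "setlocal enabledelayedexpansion" at the top of the batch file, or by starting the prompt itself with cmd /V:on). Without it, !myfiles! is left as literal text instead of being expanded, and the merge fails.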
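
To make the difference between %myfiles% and !myfiles! concrete, here is a toy model in Python. It is only an illustration of the rule (percent variables are substituted once when the whole 'for' line is parsed, bang variables each time the 'set' runs), not a description of how cmd.exe is implemented, and build_list is a made-up name:

```python
def build_list(files, delayed):
    """Toy model of: for %%f in (...) do set myfiles=<myfiles> %%f"""
    env = {"myfiles": ""}
    # %myfiles% is substituted once, when the whole `for` line is parsed,
    # so every pass of the loop sees the value from before the loop started.
    parsed_once = env["myfiles"]
    for name in files:
        # !myfiles! is looked up again each time the `set` actually runs.
        current = env["myfiles"] if delayed else parsed_once
        env["myfiles"] = current + " " + name
    return env["myfiles"]


captures = ["1.pcap", "2.pcap", "3.pcap"]
print(repr(build_list(captures, delayed=False)))  # ' 3.pcap'
print(repr(build_list(captures, delayed=True)))   # ' 1.pcap 2.pcap 3.pcap'
```

The first call shows why a plain %myfiles% inside the loop ends up holding only the last file name; the second is what the batch file relies on.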

This entry took longer to create than the batch file, but I hope it helps some angry analyst somewhere.

15 comments:

  1. Cheers for this, really helped!

    However, I have a similar problem and Google cannot help there either...

    I want to run the -z io,phs stats on a range of files, with of course a combined output... Any suggestions?

    I have about 500 files :) all quite large beasts so I can't really merge them all as tshark complains...

    I tried letting tshark read them all sequentially but it only reads the last file you give it with the -r argument...

    Any help would be great!

  2. Just a note that you may need to turn on delayed expansion:

    setlocal enabledelayedexpansion

    Or else the !myfiles! won't evaluate properly.

  3. Awesome! I didn't think anyone was reading this. :-)
    @Jake: I would have to play with this some. Unfortunately, I have been doing more sys admin and software engineering the last 5 months, so packet analysis has only been for my free time (which hasn't existed this last month). That all said, I am definitely interested in your question and I hope to have some time this weekend to play around with it.
    @Anonymous: I believe you are right. However, I wrote this post a while ago and haven't had much time to play around with analysis in a bit, as I wrote above. From what I remember, the /V option worked for me in some situations (on an XP box, I think), but I had to turn on delayed expansion in Vista.
    @both: Thank you for the comments. I started this thing as a log for me, but the fact that others are reading it is somewhat cool. :-)

  4. THANK YOU BOTH, works fine on xp & win7 (with
    setlocal enabledelayedexpansion)

  5. Dave, this was very helpful. I have limited networking experience but am often called to troubleshoot networking in our industrial applications. The batch file worked well for me. I was trying to understand how it worked. I was able to run down everything except the "!" around myfiles in the For loop.
    Could you please point me to a link explaining this?
    Thanks again.

  6. The "!" used in the for loop is used for variable expansion. A good link for an explanation is below. The bottom line is that the "!somename!" syntax allows the expansion of somename at a delayed time (not at the initial load of the bat/script/etc.). So if you use "setlocal enabledelayedexpansion" but do not put the bang operator before and after the variable, it will not work as needed...the bang says "hey, this variable needs to wait until execution to be expanded" (as opposed to during script init/read). Hope that helps!

    http://www.computerhope.com/sethlp.htm#03

  7. Could you please suggest what to put in place of "set myfiles="? It is showing me "no directory".

  8. @Anonymous: Think of "Set MyFiles=" as just a variable declaration. The value gets assigned in the for loop, which collects all pcap files in the current directory. Hope that helps. dw

  9. Hello David!

    I need to merge 106 .pcap files, and I tried using your file, but every time I do I get the following error:

    "mergecap: Can`t open !myfiles!: No such file or directory"

    What do I do now??
    Thank you

    Replies
    1. Without actually seeing the rest of your script, it's a little hard to say exactly what the problem is, although I am leaning towards a syntactical culprit with this one. I would say, though, that while mergecap "should" handle 106 pcap files, I would personally break up the task.
      Some other things to check: from what folder are you running the batch file, and where in relation to this folder is the folder with all of the pcaps?
      If you want to post your whole batch file in a reply, I will take a look at it and see if I can quickly see what might be wrong.
      DW

    2. Anonymous.... You need to include this with the setlocal
      setlocal enabledelayedexpansion

      That will let the !myfiles! concatenate

  10. my modifications inline below: takes two parameters, the output file name and then the filter, bit messy but it helps ;)

    SETLOCAL EnableDelayedExpansion
    set myfiles=
    set param1=%1
    set param2=%2
    IF DEFINED param1 (set OUTFILE=%1) else (set OUTFILE=temp)
    IF DEFINED param2 (set FILTER=%2) else (set FILTER=*)
    for %%f in (%FILTER%.pcap) do set myfiles=!myfiles! %%f
    Cmd /V:on /c "c:\Program Files\Wireshark\mergecap.exe" -w %OUTFILE%.pcap %myfiles%

    ... I was going to add the preprocessing too, which would prefilter with tshark a defined string so you've got an even quicker way to filter and merge your massive captures... after getting 4.8GB of caps from a customer this afternoon ;) after filtering I'm down to only 780MB!

    Replies
    1. Scott,

      I just saw you commented on this. :-( Anyway, I like the changes you made. My initial posting follows me true to form...I create tools/scripts to solve a problem for me (and only me). So when I share them, I tend to forget to add in any sort of error/param checking.

      I think the idea of a prefilter with tshark is awesome! I do wonder about the cost, mostly in terms of wall clock time. It would be interesting to run some tests on a few ranges of files. It just so happens that I have a TON (over 2 TB of compressed data) of pcaps that I need to start doing some work on, so I will be able to test the prefiltering idea. I will edit the original post with any results.

    2. Um, you can just do:
      "c:\Program Files\Wireshark\mergecap.exe" -w temp.pcap file*.pcap

      and avoid all the batch file for loop stuff. HTH

    3. Certainly true that you could just type in the command and wildcard the pcap files. However, the point of this was a simple tool that could be used repetitively, which in the long run saves me time. Of course my script above is just one way, and if the "batch file for loop stuff" is too cumbersome or "strange" for some, they certainly have the option of typing the command and wild-carding it every time. That's what I love about my job...multiple correct solutions for almost any given problem.
