Bash script help - removing certain rows from .csv file


 
# 1  
Old 04-23-2013

Hello Everyone,

I am trying to find a way to take a .csv file with 7 columns and a ton of rows (over 600,000) and remove the entire row if the cell in the fourth column is blank.

Just to give you a little background on why I am doing this (in case there is an easier way): I am pulling information from a PCAP into a .csv file, and I only want to view a row if it lists something in the http.host (fourth column) entry, e.g. google.com. If that entry is blank because the packet has no http.host value, I would like to remove the row. This would seriously cut down the number of rows I have to review to make sure my users are not visiting sites that they should not be.

So far my script looks like this:
Code:
#!/bin/bash

echo -n "What is the name of your PCAP file? "
read in_pcap

echo -n "What is the name of your CSV file? "
read out_csv

tshark -r "$in_pcap" -T fields -e frame.number -e ip.src -e ip.dst -e http.host -e frame.time -e frame.time_relative -E header=y -E separator=, > "$out_csv"

_____
I ran the script on a current PCAP and it works like a charm, getting the information I need from the PCAP file into a .csv file. Unfortunately, I am running into the aforementioned blank-cell situation, because not every entry lists a value in the http.host cell. In fact, of the 600,000+ rows, I am guessing there are only several hundred that I need. So adding to the script above (or creating a new script if need be) to remove every row with a blank entry in the fourth column would be the perfect solution, but I am not sure how to do that. Assuming a loop is the solution, the condition for the loop to stop would be all 7 columns being blank, a.k.a. the row after the last of the 600,000+ entries.

Can anyone help me edit my current script and/or write a new script to loop over (or otherwise remove) the rows with blank entries?

Thanks in advance!

Last edited by Franklin52; 04-24-2013 at 04:15 AM.. Reason: Please use code tags
# 2  
Old 04-23-2013
I don't know "tshark" but having done a google search, I think you should look into using the "-R <read/display filter>" option.

Can't be of more help other than to provide this link:

tshark - The Wireshark Network Analyzer 1.8.0

Also adding this link, which talks about the syntax of a filter:

http://www.wireshark.org/docs/man-pa...rk-filter.html

Last edited by rwuerth; 04-23-2013 at 11:55 AM.. Reason: added link
# 3  
Old 04-23-2013
Code:
tshark -r "$in_pcap" -T fields -e frame.number -e ip.src -e ip.dst -e http.host -e frame.time -e frame.time_relative -E header=y -E separator=, |egrep -v '^[^,]*,[^,]*,[^,]*,,'> "$out_csv"
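For anyone wondering how the filter above works: the pattern matches lines whose first three comma-separated fields are followed immediately by a second comma, i.e. an empty fourth field, and `-v` drops those lines. Here is a minimal sketch on made-up sample rows (the data and the function names `filter_grep` and `filter_awk` are hypothetical, not from the thread), with an awk variant that tests the fourth field directly:

```shell
#!/bin/bash
# Hypothetical sample standing in for the tshark output; only the position
# of the http.host column (the fourth) matters for this sketch.
sample='frame.number,ip.src,ip.dst,http.host,frame.time,frame.time_relative
1,10.0.0.5,10.0.0.1,,10:00:00,0.000
2,10.0.0.5,93.184.216.34,example.com,10:00:01,1.000
3,10.0.0.1,10.0.0.5,,10:00:02,2.000'

# Same pattern as the egrep command above: drop lines where the fourth
# field is empty (three fields, then two commas in a row).
filter_grep() { grep -E -v '^[^,]*,[^,]*,[^,]*,,'; }

# awk alternative: keep only lines whose fourth comma-separated field is
# non-empty. Lines with fewer than four fields are dropped as well.
filter_awk() { awk -F, '$4 != ""'; }

printf '%s\n' "$sample" | filter_grep
printf '%s\n' "$sample" | filter_awk
```

Both variants keep the header line, because its fourth field is the text http.host. Neither understands quoted CSV fields, so they assume no commas appear inside the first four columns.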

# 4  
Old 04-24-2013
Quote:
Originally Posted by Skrynesaver
Code:
tshark -r "$in_pcap" -T fields -e frame.number -e ip.src -e ip.dst -e http.host -e frame.time -e frame.time_relative -E header=y -E separator=, |egrep -v '^[^,]*,[^,]*,[^,]*,,'> "$out_csv"

Thanks a million for this answer!

I went from 600,000+ rows to analyze to fewer than 3,100, and it did everything I wanted it to.

I do have one problem though. If I run this line:
Code:
tshark -r test.pcap -T fields -e frame.number -e ip.src -e ip.dst -e http.host -e frame.time -e frame.time_relative -E header=y -E separator=, |egrep -v '^[^,]*,[^,]*,[^,]*,,'> test.csv

everything works perfectly fine.

However if I run my script mentioned above with the variables $in_pcap and $out_csv, I get the following error message:
./pcapAnalyze: line 9: $out_csv: ambiguous redirect
tshark: Output fields were specified with "-e", but "-Tfields" was not specified.


I wanted to make sure the command ran on its own without the variables, to limit the number of things that could go wrong. After simply replacing the hard-coded .pcap and .csv file names with variables, I get that error message. The only thing I changed was introducing the variables. Why is it doing that?

Thanks in advance!
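
A possible explanation, going only by the error text: bash reports "ambiguous redirect" when an unquoted variable after `>` expands to nothing or to more than one word (for example, a file name containing a space). The script as posted quotes `"$out_csv"`, so this assumes the copy that was actually run has the variable unquoted. The sketch below uses a hypothetical file name and a scratch directory to show the quoted form. The second message comes from tshark itself and suggests the `-T fields` argument in the script file differs from the line typed by hand, so that is worth comparing character by character.

```shell
#!/bin/bash
# Hypothetical output name containing a space, written to a scratch directory.
workdir=$(mktemp -d)
out_csv="$workdir/my capture.csv"

# Guard against an empty answer before running a long pipeline.
if [ -z "$out_csv" ]; then
    echo "No CSV file name given" >&2
    exit 1
fi

# Unquoted: bash splits the value into two words and rejects the redirect
# with "ambiguous redirect" (left commented out so the sketch runs cleanly).
# echo "1,10.0.0.5,10.0.0.1,example.com" > $out_csv

# Quoted: the whole value is treated as a single file name.
echo "1,10.0.0.5,10.0.0.1,example.com" > "$out_csv"

ls "$workdir"
```

The same quoting applies to `"$in_pcap"` after `-r`, since an unquoted name with spaces would be passed to tshark as several separate arguments.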