Remove headers that don't match


 
# 1  
Old 12-04-2012

Good evening,
I need your help, please. I'm new to Unix. I want to remove the first 5 header lines from files of about 100,000 records each, and then create a control file (.ctl) for each one that contains its number of records. Everything seemed to work, but when I tested it in production it didn't work.

Here is the code:
Code:
#! /usr/bin/sh
inicio=$(date +%s)
for cdr in $(ls -lrt ICT-DW*.CSV|awk '{$3 == dsadm;print $9}')
do
  tail -n +6 $cdr > ${cdr}_temp
  mv ${cdr}_temp $cdr
  count="`wc -l $cdr`"
  ctrfile=`echo $cdr|sed s/CSV/ctl/`
  echo $count > $ctrfile
done
final=$(date +%s)
dif=$(( final - inicio ))
echo "it took $dif seconds"

While running this script, I got 2 undesired results:

1. I intended the script to process only the ICT-DW*.CSV files owned by dsadm, because the directory contains files from 2 owners, e.g.:
Code:
-rw-r-----   1 fileload   dstage     31340817 Nov 21 08:13 ICT-DW-20121121_144654.CSV
-rw-r-----   1 fileload   dstage     31414285 Nov 21 08:13 ICT-DW-20121121_144655.CSV
-rw-rw-rw-   1 dsadm    dstage     30805901 Nov 21 05:44 ICT-DW0121121_192367.CSV
-rw-rw-rw-   1 dsadm      dstage     30797086 Nov 21 05:44 ICT-DW-20121121_192368.CSV

2. The header format confuses me. I assumed the header has 5 lines, based on the following output, but in PuTTY, or after changing the screen resolution, it shows more than 5 lines, up to 12:
Code:
$ head ICT-DW-20121122_144811.CSV
AARHDR,ICT6.0.7SP3,PRICING,DW_CDR,CSV,20121122 001052
AARTAB,Stored CDR,20121122 001052
AARCOL,FK_CDR_SOURCE_FILE_NUMBER,CDR_NUMBER,CDR_ORIGIN,RECORD_SEQUENCE_NUMBER,LINK_FIELD,RECORD_TYPE,CALL_START_DATE,CALL_SEGMENT_DIRECTION,CHARGE_DIRECTION,NETWORK_OPERATOR_ID,FRANCRATE_USED,COMP_DATA_UNIT,RATE_UNIT_VOLUME
AARSIZ,12,9,1,15,2,2,YYYYMMDD,8,14,10,28,28,1,8,10,8,4,1,1,YYYYMMDD HH24MISS,20,249,6,14,1,3,5,3,8,4,7,20,8,3,8,4,30,1,8,3,1,3,11,11,8,3,5,2,3,13,13,6,9,13,15,12,13,8,14
AARTYP,NUMBER,NUMBER,CHAR,VARCHAR2,VARCHAR2,VARCHAR2,DATE,VARER,NUMBER,NUMBER,NUMBER,NUMBER,VARCHAR2,NUMBER
,2060024,1001,F,203,01,00,20121120,23513300,35.00,0000003500,03183260533,03212046506,X,,0,,,N,O,20121122 001052,EDFN,,,0,O,STD,COMCE,BSC,COMBUCBG,TAND,57,0406,MOB-COM,SAL,SAL-MOB,MM08,Telefonos de MINUTO,1.00,1,0.000000,0.00,0.00,,0.00,
,2060024,1001,F,203,01,00,20121120,23513300,35.00,0000003500,03183260533,03212046506,X,,0,,,N,O,20121122 001052,EDFN,,,0,O,STD,COMCE,BSC,COMBUCBG,TAND,57,0406,MOB-

According to the header above, the records start at ,2060024,1001,F,203,01,00,20121120, etc. So does the number of header lines vary if you use a different screen resolution? If so, how can I handle that? If not, I think there is another option with awk: for example, if field 1 starts with AAR and the last field ends with a carriage return, then I could remove that record.
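On the header question: `tail -n +6` counts newline-terminated lines in the file, not rows on the screen, so long lines such as AARCOL and AARSIZ wrapping in a narrow terminal should not change the count. The sketch below checks that and also strips the header by content rather than by position. It assumes, based only on the sample above, that every header line starts with `AAR` and no data line at the top of the file does. The function names are made up for illustration.

```shell
#!/bin/sh
# count_aar_lines FILE
# Print how many lines of FILE start with "AAR". This counts real lines,
# so it is independent of terminal width or screen resolution.
count_aar_lines() {
    grep -c '^AAR' "$1"
}

# strip_aar_header FILE
# Print FILE without its leading "AAR..." lines. Once the first
# non-header line is seen, everything after it is printed unchanged,
# even if a later line happens to start with "AAR".
strip_aar_header() {
    awk 'body || !/^AAR/ { body = 1; print }' "$1"
}
```

Running `count_aar_lines` on one of the files should report 5 if the header really is five lines long, whatever the terminal shows.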

Thanks in advance.

Last edited by Scott; 12-05-2012 at 03:19 AM.. Reason: Code tags
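On the ownership problem in the post above: in the awk program `{$3 == dsadm;print $9}`, `dsadm` is an unset awk variable rather than a string, and the result of the comparison is thrown away, so `print $9` runs for every line of the `ls` output. Writing it as `awk '$3 == "dsadm" {print $9}'` would restrict the output to that owner. A sketch that avoids parsing `ls` at all, by letting `find` test the owner, might look like this. The function name `list_owned_csv` is hypothetical, and it assumes the files sit directly in one directory.

```shell
#!/bin/sh
# list_owned_csv OWNER [DIR]
# Print the names of the ICT-DW*.CSV files directly inside DIR
# (default: current directory) that are owned by OWNER.
list_owned_csv() {
    owner=$1
    dir=${2:-.}
    # "! -name . -prune" keeps find from descending into subdirectories.
    # "-user" tests the file owner directly, so no ls output is parsed.
    # The final sed only removes the leading "./" that find prints.
    (cd "$dir" &&
        find . ! -name . -prune -type f -name 'ICT-DW*.CSV' -user "$owner" -print) |
        sed 's|^\./||'
}
```

With the directory listing shown in the first post, `list_owned_csv dsadm` would be expected to list only the two files owned by dsadm.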
# 2  
Old 12-04-2012
Please use [ code ] tags.
# 3  
Old 12-04-2012
Code:
#! /usr/bin/sh
inicio=$(date +%s)
for cdr in $(ls -lrt ICT-DW*.CSV|awk '{$3 == dsadm;print $9}')
do
  tail -n +6 $cdr > ${cdr}_temp
  mv ${cdr}_temp $cdr
  count="`wc -l $cdr`"
  ctrfile=`echo $cdr|sed s/CSV/ctl/`
  echo $count > $ctrfile
done
final=$(date +%s)
dif=$(( final - inicio ))
echo "it took $dif seconds"


Last edited by Scott; 12-05-2012 at 03:18 AM.. Reason: Code tags
# 4  
Old 12-04-2012
Nah do "[\code]" and "[\/code]"

Without the " " and \
# 5  
Old 12-04-2012
OK, I'm sorry, but I'm not sure I understood. I clicked the code tag button and pasted the code:

Code:
#! /usr/bin/sh
inicio=$(date +%s)
for cdr in $(ls -lrt ICT-DW*.CSV|awk '{$3 == dsadm;print $9}')
do
  tail -n +6 $cdr > ${cdr}_temp
  mv ${cdr}_temp $cdr
  count="`wc -l $cdr`"
  ctrfile=`echo $cdr|sed s/CSV/ctl/`
  echo $count > $ctrfile
done
final=$(date +%s)
dif=$(( final - inicio ))
echo "it took $dif seconds"


Last edited by Scott; 12-05-2012 at 03:18 AM.. Reason: Code tags
# 6  
Old 12-04-2012
Hmm... so you know how in HTML you write <head></head>? Replace head with code and the <> with [].
# 7  
Old 12-04-2012
OK, let me briefly tell you what I want. I've written a script called removeheaders, which is:
Code:
#! /usr/bin/sh
inicio=$(date +%s)
for cdr in $(ls -lrt ICT-DW*.CSV|awk '{$3 == dsadm;print $9}')
do
  tail -n +6 $cdr > ${cdr}_temp
  mv ${cdr}_temp $cdr
  count="`wc -l $cdr`"
  ctrfile=`echo $cdr|sed s/CSV/ctl/`
  echo $count > $ctrfile
done
final=$(date +%s)
dif=$(( final - inicio ))
echo "it took $dif seconds"

But I've got 2 problems:
1. I want to process only the ICT-DW*.CSV files whose owner is dsadm, not fileload. Which part of the code do I have to correct?
2. I want to strip the text between AARHDR and NUMBER in each file. Here is the text:
Code:
AARHDR,ICT6.0.7SP3,PRICING,DW_CDR,CSV,20121122 001052
AARTAB,Stored CDR,20121122 001052
AARCOL,FK_CDR_SOURCE_FILE_NUMBER,CDR_NUMBER,CDR_ORIGIN,RECORD_SEQUENCE_NUMBER,LINK_FIELD,RECORD_TYPE,CALL_START_DATE,CALL_SEGMENT_DIRECTION,CHARGE_DIRECTION,NETWORK_OPERATOR_ID,FRANCRATE_USED,COMP_DATA_UNIT,RATE_UNIT_VOLUME
AARSIZ,12,9,1,15,2,2,YYYYMMDD,8,14,10,28,28,1,8,10,8,4,1,1,YYYYMMDD HH24MISS,20,249,6,14,1,3,5,3,8,4,7,20,8,3,8,4,30,1,8,3,1,3,11,11,8,3,5,2,3,13,13,6,9,13,15,12,13,8,14
AARTYP,NUMBER,NUMBER,CHAR,VARCHAR2,VARCHAR2,VARCHAR2,DATE,VARER,NUMBER,NUMBER,NUMBER,NUMBER,VARCHAR2,NUMBER
,2060024,1001,F,203,01,00,20121120,23513300,35.00,0000003500,03183260533,03212046506,X,,0,,,N,O,20121122 001052,EDFN,,,0,O,STD,COMCE,BSC,COMBUCBG,TAND,57,0406,MOB-COM,SAL,SAL-MOB,MM08,Telefonos de MINUTO,1.00,1,0.000000,0.00,0.00,,0.00,

I'd appreciate your help.

Last edited by Scott; 12-05-2012 at 03:17 AM.. Reason: Code tags, please...
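Putting the two problems from this post together, one possible revision of the removeheaders script is sketched below. It is not a tested production fix. It assumes the header is exactly the leading run of lines starting with `AAR`, that the control file should hold only the record count (the original `wc -l $cdr` also writes the file name), and that the shell is POSIX-compatible. The function name `remove_headers` and its arguments are invented for this example.

```shell
#!/bin/sh
# remove_headers OWNER [DIR]
# For every ICT-DW*.CSV file in DIR (default: .) owned by OWNER:
#   1. drop the leading lines that start with "AAR",
#   2. write the remaining line count to a matching .ctl file.
# The body runs in a subshell, so the caller's directory is unchanged.
remove_headers() (
    owner=$1
    cd "${2:-.}" || exit 1
    find . ! -name . -prune -type f -name 'ICT-DW*.CSV' -user "$owner" -print |
    while IFS= read -r cdr; do
        tmp="${cdr}_temp"
        # Replace the file only if awk succeeded; otherwise keep the original.
        awk 'body || !/^AAR/ { body = 1; print }' "$cdr" > "$tmp" &&
            mv "$tmp" "$cdr" || { rm -f "$tmp"; continue; }
        # Reading from stdin makes wc print only the number;
        # tr removes the padding some wc implementations add.
        count=$(wc -l < "$cdr" | tr -d ' ')
        # ${cdr%.CSV} strips the suffix only at the end of the name.
        echo "$count" > "${cdr%.CSV}.ctl"
    done
)
```

It could be called as `remove_headers dsadm .` from the directory holding the files, with the `inicio`/`final` timing lines from the original script wrapped around the call. Note that `mv` leaves each rewritten file owned by whoever runs the script.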