Please help !!!!Problem with data file


 
# 29  
Old 04-07-2007
Quote:
Originally Posted by kumarsaravana_s
i did change it to 10...
You have '11' fields based on your sample file.
Here's the modified script based on YOUR/Kumar's requirement:
Code:
BEGIN {
  FS=OFS="|"

  FLD_max=11

  stderr="cat 2>&1"
}
NF == FLD_max { if (rec != "") print rec; rec=$0; fld=FLD_max;next}

NF < FLD_max {printf("Bad record: [%d] :: [%s]\n", FNR, $0) | stderr; rec=(rec != "") ? rec $0 : $0; fld+=(NF-1);next }
fld >= FLD_max { print rec; rec=""; fld=0 }
END {
  if (rec != "" && split(rec, a, FS) >= FLD_max ) print rec
}


Last edited by vgersh99; 04-07-2007 at 06:38 PM..
# 30  
Old 04-07-2007
If you have Python, here's an alternative:
input sample:
Code:
295468075|GOLDKIST|||||||0|||
295468077|SIEMENS MEDICAL CSG-INV|||||||0|||
295468079|KOSA/AREVA SPCLTY SARL|||||||0|||
295468081|IBM
|||||||0|||
295468083|COMCAST - INVOICING|||||||0||| 
295468081|IBM2
|||||||0||1|
295468081|IBM|TEST|||
||1|||0|
295468084|COMCAST - INVOICING|||T|E|S|T|0|||

code:
Code:
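#!/usr/bin/env python3
# Join each short line with the line that follows it (Python 3 syntax).
with open("file") as f:
    for line in f:
        if line.count('|') < 11:  # assume 11 pipe delimiters per complete record
            # print the current line and the next line joined by a space
            print(line.strip(), next(f, "").strip())
        else:
            print(line.strip())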

output:
Code:
# ./test.py
295468075|GOLDKIST|||||||0|||
295468077|SIEMENS MEDICAL CSG-INV|||||||0|||
295468079|KOSA/AREVA SPCLTY SARL|||||||0|||
295468081|IBM |||||||0|||
295468083|COMCAST - INVOICING|||||||0|||
295468081|IBM2 |||||||0||1|
295468081|IBM|TEST||| ||1|||0|
295468084|COMCAST - INVOICING|||T|E|S|T|0|||
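
The script above joins exactly two lines and inserts a space at the join. As a hedged variation (not the poster's code), the sketch below keeps appending physical lines until the pipe count reaches the expected number, so a record broken into more than two pieces is also reassembled and no extra space is added. The function name `join_split_records` and the parameter `expected_pipes` are made up for illustration, and the sample rows are taken from the input shown above.

```python
def join_split_records(lines, expected_pipes=11):
    """Yield logical records, joining physical lines until each has enough pipes."""
    pending = ""
    for line in lines:
        # Drop only the line terminator so the field contents stay untouched.
        pending += line.rstrip("\r\n")
        if pending.count("|") >= expected_pipes:
            yield pending
            pending = ""
    if pending:
        # Trailing fragment that never reached the expected delimiter count.
        yield pending


sample = [
    "295468075|GOLDKIST|||||||0|||\n",
    "295468081|IBM\n",
    "|||||||0|||\n",
    "295468081|IBM|TEST|||\n",
    "||1|||0|\n",
]
for record in join_split_records(sample):
    print(record)
```

This only works when the break falls before the last delimiter. A fragment such as a bare `254` that follows an already complete line has no delimiter of its own, so it needs the look-ahead approach used in the awk script.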

# 31  
Old 04-09-2007
Hi Vgersh,

I changed the field count to 11, but it is not joining the split column data back into its original column. Instead, it is deleting the entire record.

> vi ds1.awk
"ds1.awk" 16 lines, 375 characters
#!/usr/bin/ksh

BEGIN {
FS=OFS="|"

FLD_max=11

stderr="cat 2>&1"
}
NF == FLD_max { if (rec != "") print rec; rec=$0; fld=FLD_max;next}

NF < FLD_max {printf("Bad record: [%d] :: [%s]\n", FNR, $0) | stderr; rec=(rec != "") ? rec $0 : $0; fld+=(NF-1);next }
fld >= FLD_max { print rec; rec=""; fld=0 }
END {
if (rec != "" && split(rec, a, FS) >= FLD_max ) print rec
}

"ds1.awk" 16 lines, 375 characters
> vi new.txt
"new.txt" 11 lines, 902 characters
9429732|Apr 4 2007 12:51:09:063AM| |CREDCHK |0|421|58|2592|PAR| |
9427428|Apr 4 2007 12:00:00:066AM|7736752|NETEVENT |3146628|937|307|5035| |TP|TP06173598941
9427429|Apr 4 2007 12:00:02:560AM|7736744|NETEVENT |3083574|940|765|1304| |TP|TP06173600979
9427430|Apr 4 2007 12:00:03:613AM|7736759|NETEVENT |3146568|781|307|531| |TP|TP06173582
254
9427431|Apr 4 2007 12:00:04:430AM|7736668|NETEVENT |6000177|712|899|2080| |TP|TP0547906557
9427432|Apr 4 2007 12:00:04:580AM|7736747|NETEVENT |1039574|716|957|2806| |TP|TP06173875607
9427433|Apr 4 2007 12:00:07:723AM|7736751|NETEVENT |1039980|646|596|6982| |TP|TP06173873938
9427434|Apr 4 2007 12:00:07:920AM|2799783|NETEVENT |3018155|510|648|4964| |MD|MD0130328
9427435|Apr 4 2007 12:00:08:290AM|2799781|NETEVENT |3022569|713|248|2027| |MD|MD0125661
9427436|Apr 4 2007 12:00:08:616AM|2799782|NETEVENT |3077955|757|345|1839| |MD|MD015546

"new.txt" 11 lines, 902 characters
> nawk -f ds1.awk new.txt > new1.txt
> vi new1.txt
"new1.txt" 11 lines, 926 characters
9429732|Apr 4 2007 12:51:09:063AM| |CREDCHK |0|421|58|2592|PAR| |
9427428|Apr 4 2007 12:00:00:066AM|7736752|NETEVENT |3146628|937|307|5035| |TP|TP06173598941
9427429|Apr 4 2007 12:00:02:560AM|7736744|NETEVENT |3083574|940|765|1304| |TP|TP06173600979
Bad record: [5] :: [254] /*the record is deleted*/
9427430|Apr 4 2007 12:00:03:613AM|7736759|NETEVENT |3146568|781|307|531| |TP|TP06173582254
9427431|Apr 4 2007 12:00:04:430AM|7736668|NETEVENT |6000177|712|899|2080| |TP|TP0547906557
9427432|Apr 4 2007 12:00:04:580AM|7736747|NETEVENT |1039574|716|957|2806| |TP|TP06173875607
9427433|Apr 4 2007 12:00:07:723AM|7736751|NETEVENT |1039980|646|596|6982| |TP|TP06173873938
9427434|Apr 4 2007 12:00:07:920AM|2799783|NETEVENT |3018155|510|648|4964| |MD|MD0130328
9427435|Apr 4 2007 12:00:08:290AM|2799781|NETEVENT |3022569|713|248|2027| |MD|MD0125661
9427436|Apr 4 2007 12:00:08:616AM|2799782|NETEVENT |3077955|757|345|1839| |MD|MD015546


"new1.txt" 11 lines, 926 characters
# 32  
Old 04-10-2007
Code:
BEGIN {
  FS=OFS="|"

  FLD_max=11

  stderr="cat 1>&2"   # route diagnostics to stderr ("cat 2>&1" would leave them on stdout)
}
(fld + NF-1) > FLD_max {
       if (fld == FLD_max)
          print rec
       else
          printf("Incomplete record: [%d] :: [%s]\n", FNR, rec) | stderr
       rec=$0; fld=NF;next
}
NF < FLD_max {printf("Bad record: [%d] :: [%s]\n", FNR, $0) | stderr; rec=(rec != "") ? rec $0 : $0; fld+=(NF-1);next }
{rec=$0; fld=NF}
END {
  if (rec != "" && split(rec, a, FS) >= FLD_max ) print rec
}
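
For readers more comfortable with Python, the look-ahead logic of the awk script above can be sketched roughly as follows. This is an approximation rather than an exact port: the function name `rejoin_records` is invented, diagnostics are collected in a list instead of being piped to a command, a blank line counts as one empty field (awk would report `NF` as 0), and the sample rows are shortened, hypothetical stand-ins for the real data.

```python
def rejoin_records(lines, max_fields=11, sep="|"):
    """Return (records, diagnostics) after gluing short lines onto the pending record."""
    records, diagnostics = [], []
    rec, fld = "", 0
    for lineno, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        nf = len(line.split(sep))
        if fld + nf - 1 > max_fields:
            # The line cannot belong to the pending record: flush it and start over.
            if fld == max_fields:
                records.append(rec)
            else:
                diagnostics.append(f"Incomplete record: [{lineno}] :: [{rec}]")
            rec, fld = line, nf
        elif nf < max_fields:
            # Short line: treat it as a continuation of the pending record.
            diagnostics.append(f"Bad record: [{lineno}] :: [{line}]")
            rec, fld = rec + line, fld + nf - 1
        else:
            rec, fld = line, nf
    # Emit the last pending record only if it has enough fields.
    if rec and len(rec.split(sep)) >= max_fields:
        records.append(rec)
    return records, diagnostics


rows = [
    "1|a|b|c|d|e|f|g|h|i|TP0617\n",
    "254\n",
    "2|a|b|c|d|e|f|g|h|i|MD01\n",
]
records, diagnostics = rejoin_records(rows)
print(records)
print(diagnostics)
```

A record is held back until the next line shows whether it continues, which is why the trailing `254` can still be appended to a line that already has all 11 fields.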

# 33  
Old 04-13-2007
Hi vgersh,

Thanks a lot for your help, but I'm still getting the same output as before.

"ds1.awk" 16 lines, 375 characters
> vi new.txt
"new.txt" 11 lines, 902 characters
9429732|Apr 4 2007 12:51:09:063AM| |CREDCHK |0|421|58|2592|PAR| |
9427428|Apr 4 2007 12:00:00:066AM|7736752|NETEVENT |3146628|937|307|5035| |TP|TP06173598941
9427429|Apr 4 2007 12:00:02:560AM|7736744|NETEVENT |3083574|940|765|1304| |TP|TP06173600979
9427430|Apr 4 2007 12:00:03:613AM|7736759|NETEVENT |3146568|781|307|531| |TP|TP06173582
254
9427431|Apr 4 2007 12:00:04:430AM|7736668|NETEVENT |6000177|712|899|2080| |TP|TP0547906557
9427432|Apr 4 2007 12:00:04:580AM|7736747|NETEVENT |1039574|716|957|2806| |TP|TP06173875607
9427433|Apr 4 2007 12:00:07:723AM|7736751|NETEVENT |1039980|646|596|6982| |TP|TP06173873938
9427434|Apr 4 2007 12:00:07:920AM|2799783|NETEVENT |3018155|510|648|4964| |MD|MD0130328
9427435|Apr 4 2007 12:00:08:290AM|2799781|NETEVENT |3022569|713|248|2027| |MD|MD0125661
9427436|Apr 4 2007 12:00:08:616AM|2799782|NETEVENT |3077955|757|345|1839| |MD|MD015546

"new.txt" 11 lines, 902 characters
> nawk -f ds1.awk new.txt > new1.txt
> vi new1.txt
"new1.txt" 11 lines, 926 characters
9429732|Apr 4 2007 12:51:09:063AM| |CREDCHK |0|421|58|2592|PAR| |
9427428|Apr 4 2007 12:00:00:066AM|7736752|NETEVENT |3146628|937|307|5035| |TP|TP06173598941
9427429|Apr 4 2007 12:00:02:560AM|7736744|NETEVENT |3083574|940|765|1304| |TP|TP06173600979
Bad record: [5] :: [254] /*the record is deleted*/
9427430|Apr 4 2007 12:00:03:613AM|7736759|NETEVENT |3146568|781|307|531| |TP|TP06173582254
9427431|Apr 4 2007 12:00:04:430AM|7736668|NETEVENT |6000177|712|899|2080| |TP|TP0547906557
9427432|Apr 4 2007 12:00:04:580AM|7736747|NETEVENT |1039574|716|957|2806| |TP|TP06173875607
9427433|Apr 4 2007 12:00:07:723AM|7736751|NETEVENT |1039980|646|596|6982| |TP|TP06173873938
9427434|Apr 4 2007 12:00:07:920AM|2799783|NETEVENT |3018155|510|648|4964| |MD|MD0130328
9427435|Apr 4 2007 12:00:08:290AM|2799781|NETEVENT |3022569|713|248|2027| |MD|MD0125661
9427436|Apr 4 2007 12:00:08:616AM|2799782|NETEVENT |3077955|757|345|1839| |MD|MD015546


"new1.txt" 11 lines, 926 characters
# 34  
Old 04-13-2007
What exactly is wrong with that?
It seems that
Code:
9427430|Apr 4 2007 12:00:03:613AM|7736759|NETEVENT |3146568|781|307|531| |TP|TP06173582
254

has turned into ONE line/record
Code:
9427430|Apr 4 2007 12:00:03:613AM|7736759|NETEVENT |3146568|781|307|531| |TP|TP06173582254

The comment in red is the debug comment that should go to stderr.
If you followed the invocation directions in the earlier post, you would see how to resolve what you're seeing now.
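
To illustrate the point about stderr, here is a small Python sketch (an assumption-laden illustration, not part of the original awk solution) that writes data records to stdout and debug messages to stderr. The helper name `emit` and the sample strings are hypothetical; the in-memory buffers merely stand in for a redirected output file and the terminal.

```python
import contextlib
import io
import sys


def emit(records, diagnostics):
    """Write data records to stdout and debug messages to stderr."""
    for record in records:
        print(record)
    for message in diagnostics:
        print(message, file=sys.stderr)


# Capture both streams separately to show that they do not mix.
out, err = io.StringIO(), io.StringIO()
with contextlib.redirect_stdout(out), contextlib.redirect_stderr(err):
    emit(["1|a|b|TP06173582254"], ["Bad record: [5] :: [254]"])

print("stdout got:", repr(out.getvalue()))
print("stderr got:", repr(err.getvalue()))
```

When a program keeps the two streams apart like this, redirecting only stdout (for example with `> new1.txt`) leaves the "Bad record" lines out of the data file, and `2> bad.log` would collect them separately.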
# 35  
Old 04-13-2007
I'm really sorry, Vgersh. Maybe I was just too tired to spot the difference. But thanks a lot, friend.