Formatting file data to another file (control character related)


 
# 1  
Old 09-22-2012

I have to write a program that reads data from files and reformats it into another file. However, I have run into a strange problem with control characters that I can't understand or solve.

Each line of the source file has this format:
Code:
T_NAME|P_NAME|P_CODE|DOCUMENT_PATH|REG_DATE

Expected formatted output is:
Code:
T_NAME<tab>[T_NAME]
P_NAME<tab>[P_NAME]
P_CODE<tab>[P_CODE]
PATH<tab>[DOCUMENT_PATH]
DATE<tab>[REG_DATE]
<empty line>

Code:
while read line
do
T_NAME=`echo $line| cut -d "|" -f1` 
P_NAME=`echo $line| cut -d "|" -f2` 
P_CODE=`echo $line| cut -d "|" -f3` 
PATH=`echo $line| cut -d "|" -f4` 
DATE=`echo $line| cut -d "|" -f5` 
 
echo "T_NAME\t$T_NAME" >> $target_file
echo "P_NAME\t$P_NAME" >> $target_file
echo "P_CODE\t$P_CODE" >> $target_file
echo "PATH\t$PATH" >> $target_file
echo "DATE\t$DATE" >> $target_file
echo " " >> $target_file
 
done <$source_file

The problem appears when the source file contains a test case like this:
Code:
SPARK|CYBER_DEF|XRS001|abcabc\\nababa\\tc|2012-09-09 12:34:54.005

The file output:
Code:
T_NAME<tab>SPARK
ababa<tab>c
P_NAME<tab>CYBER_DEF
2012-09-09 12:34:54.005
P_CODE<tab>XRS001
PATH<tab>abcabc
DATE


I know something goes wrong when the fields are cut, but I have no idea how to fix it. Can anybody help with this?
I am using the Korn shell (ksh).

Moderator's Comments:
Mod Comment edit by bakunin: please use CODE-tags not only for code but also for screen output, file content and similar text. Thank you.

Last edited by bakunin; 09-22-2012 at 12:17 PM..
# 2  
Old 09-22-2012
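I suppose the problem is the "echo" statement. It is fed the content of the variable "$line" and interprets the backslash sequences in it. "\n" and "\t" in your example are escape sequences for control characters: "\n" is newline, "\t" is tab. Because "read" without "-r" has already reduced each "\\" in the input to a single "\", echo sees "\n" and "\t" and prints a real newline and a real tab, so "cut" receives several lines instead of one.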
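
To make the difference concrete, here is a minimal sketch of printing a value without any escape processing. The sample value is hypothetical (it is the path field as it looks after "read" without "-r"), and the `printf '%s\n'` form is a portable stand-in for echo, not something from the original script:

```shell
# Hypothetical sample value; single quotes keep the backslashes literal.
line='abcabc\nababa\tc'

# printf with a %s format copies its argument verbatim, so the backslash
# sequences stay as two-character text and no newline or tab is produced.
printf '%s\n' "$line"

# In ksh the builtin "print -r" behaves the same way:
#   print -r -- "$line"
```

Whether a plain echo expands "\n" depends on the shell and its settings, which is why the printf form is the safer choice for data that may contain backslashes.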

Anyhow, your implementation isn't ideal anyway, because you use several commands and a pipeline to do what shell expansion could do too, at considerably lower computing cost:

Code:
x="abc|def|ghi|jkl"
echo "${x%%|*}"
echo "${x#*|}"
y="${x#*|}"
x="${x#${y}|}"
echo "${x%%|*}"
echo "${x#*|}"

Change your code accordingly and it should not only work but run faster too.
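
Note that, as posted, the snippet never advances past the first field: "${x#${y}|}" looks for the prefix "def|ghi|jkl|", which does not occur in x, so x stays unchanged. A minimal sketch of the intended technique follows, peeling off one field at a time with "%%" and "#". The function name `split_fields` and the variable `rest` are hypothetical, chosen only for illustration:

```shell
# split_fields (hypothetical helper): print each |-separated field of $1
# on its own line, using only parameter expansion (no cut, no pipeline).
split_fields() {
    rest=$1
    while :; do
        case $rest in
            *'|'*)
                printf '%s\n' "${rest%%|*}"   # text before the first |
                rest=${rest#*|}               # drop that field and its |
                ;;
            *)
                printf '%s\n' "$rest"         # last field, no | left
                break
                ;;
        esac
    done
}

split_fields 'abc|def|ghi|jkl'
```

The key step is reassigning `rest=${rest#*|}` after each field, which is what moves the "cursor" forward through the record.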

An independent observation: DO NOT use backticks any more. They are deprecated, and the shell only understands them for backward compatibility. Use the modern "$(...)" instead, which is a lot more flexible: it can be nested, can be quoted, ...
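
As a small illustration of the nesting point: with "$(...)" an inner substitution needs no special escaping, whereas nested backticks would have to be backslash-escaped. The path below is a made-up example value:

```shell
path='/tmp/reports/2012/summary.txt'   # hypothetical example path

# The inner $(dirname ...) runs first; its output becomes the argument of
# basename. Each level keeps its own double quotes.
parent=$(basename "$(dirname "$path")")

printf '%s\n' "$parent"
```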

Another hint: in case you write for Korn shell you might consider opening a file descriptor once instead of redirecting every single command. Instead of:

Code:
command1 > "$file"
command2 >> "$file"
command3 >> "$file"
...etc.

you can write:

Code:
exec 3> "$file"     # file descriptor 3 to write into $file
# exec 3>> "$file"    # alternatively open it in append mode

print -u3 - "first line"        # -u3 is FD 3
print -u3 - "second line"
print -u3 - "third line"

exec 3>&-                 # close FD 3
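
"print -u3" is a ksh builtin. If the script may also have to run under another POSIX shell, roughly the same effect can be sketched with printf and ">&3". The scratch file created with mktemp below is an assumption standing in for "$file":

```shell
out=$(mktemp)            # hypothetical scratch file standing in for "$file"

exec 3> "$out"           # open FD 3 once, truncating the file
printf '%s\n' "first line"  >&3
printf '%s\n' "second line" >&3
exec 3>&-                # close FD 3

cat "$out"
```

The file is opened a single time by "exec", and every later ">&3" just writes to the descriptor that is already open.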

I hope this helps.

bakunin

Last edited by bakunin; 09-22-2012 at 12:50 PM..
# 3  
Old 09-22-2012
Uhm.
What's wrong with:
Code:
lem@biggy:/tmp$ while IFS="|" read -r T_NAME P_NAME P_CODE _PATH _DATE; do
cat <<END >>outfile
T_NAME  $T_NAME
P_NAME  $P_NAME
P_CODE  $P_CODE
PATH    $_PATH 
DATE    $_DATE

END
    done <infile
lem@biggy:/tmp$ cat outfile
T_NAME  SPARK
P_NAME  CYBER_DEF
P_CODE  XRS001
PATH    abcabc\\nababa\\tc
DATE    2012-09-09 12:34:54.005

lem@biggy:/tmp$

Or, according to your preferences (without -r, so that read strips one level of backslashes):
Code:
lem@biggy:/tmp$ while IFS="|" read T_NAME P_NAME P_CODE _PATH _DATE; do
cat <<END >>outfile
T_NAME  $T_NAME
P_NAME  $P_NAME
P_CODE  $P_CODE
PATH    $_PATH
DATE    $_DATE

END
     done <infile
lem@biggy:/tmp$ cat outfile
T_NAME  SPARK
P_NAME  CYBER_DEF
P_CODE  XRS001
PATH    abcabc\nababa\tc
DATE    2012-09-09 12:34:54.005

lem@biggy:/tmp$
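
For the tab-separated layout requested in the first post, the same "IFS plus read -r" idea can be combined with printf, which emits a real tab for "\t" in its format string while leaving the data itself untouched. This is only a sketch under assumptions: the function name `format_records` and the lower-case variable names are hypothetical, and the names deliberately avoid PATH, since assigning to PATH (as the original script does) replaces the shell's command search path:

```shell
# format_records (hypothetical helper): read |-delimited records on stdin
# and write the labelled, tab-separated layout on stdout.
format_records() {
    # -r keeps backslashes in the data literal.
    while IFS='|' read -r t_name p_name p_code doc_path reg_date; do
        printf 'T_NAME\t%s\n' "$t_name"
        printf 'P_NAME\t%s\n' "$p_name"
        printf 'P_CODE\t%s\n' "$p_code"
        printf 'PATH\t%s\n'   "$doc_path"
        printf 'DATE\t%s\n\n' "$reg_date"   # extra \n gives the empty line
    done
}

printf '%s\n' 'SPARK|CYBER_DEF|XRS001|abcabc\\nababa\\tc|2012-09-09 12:34:54.005' |
    format_records
```

In a real script the function would be called as `format_records < "$source_file" > "$target_file"`, using the variable names from the first post.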

--
Bye
# 4  
Old 09-24-2012
Code:
x="abc|def|ghi|jkl"
echo "${x%%|*}"
echo "${x#*|}"
y="${x#*|}"
x="${x#${y}|}"
echo "${x%%|*}"
echo "${x#*|}"

The above code gives this output:

Code:
abc|def|ghi|jkl

abc|def|ghi|jkl

I don't know whether the code is incorrect or my working environment does not support shell expansion.
Anyway, using file descriptors instead of redirection is new to me.

Thanks to Lem, the second version is what I want.
Thanks to bakunin and Lem for the quick replies.