Shell Programming and Scripting: post 302547008 by filter on Saturday 13th of August 2011 04:16:07 AM
Selecting Specific Columns and Insert the delimiter TAB

Hi,
I am writing a Perl script for the following task.

I have a pipe-delimited data file of about 1.2 million lines in total, including 231 lines of header information and 4 lines of footer information.

For example:
Header Information:
Quote:
START-OF-FILE
FILENAME=fixedincome_bo_euro.out
DATA=bo
REGION=euro
TYPE=out
PROGRAMNAME=getdata
DATEFORMAT=yyyymmdd

... and so on, for 231 lines
Footer Information:
Quote:
END-OF-DATA
DATARECORDS=1221264
TIMEFINISHED=Fri Aug 12 18:57:09 BST 2011
END-OF-FILE
The data looks like this. Each line has around 210 columns and is pipe delimited.
Quote:
TT3069982 Corp|0|198|FSPIN|4.000000| | |FINE SPINNERS|FINE SPIN-CALLED|INDUSTRIAL|Corp|2|FIXED|PERP/CALL|PERPETL PAY,EX-DIV|3|DOMESTIC|EN|GBP|MORTGAGE BACKED|2000000.00|.00|1.0000|1.0000|1.00| |NOT LISTED|100.00000| | |N.A.|N.A.| |100.000000| | | | | | | | | | | | | | | | | | | | | |234953|500000|TT3069982| | | | | | | | | |N.A.| | | | | | | | | | | | |Y|N|N| | | |GB| |Basic Materials|Chemicals|Chemicals-Fibers|N.A.|GB|FSPIN 4 03/29/49|N| |DOMESTIC| |N.A.| | |N| |N|COTT3069982|Fine Spinners|GBP|GBP|N|N|Y|1|N|N|GBP|N|N|Y|19920228|FINE SPINNERS|Anytime| |N.A.| | |N|N|EN|EN|Does Not Apply|20490329|N|42| |Y|N|100.000000|N|20110820|.000000000| |N| | | | |N.A.|N.A.|N.A.|N.A.|N.A.| | | | | | |N|N|N|N| |Grandfathered| |2| | |N.A.|N| | |N| | | | |N| | |20490329| | |N|N|N| | |N|3| | | |N.A.|2| |41|CALENDAR| |N|N|BBG00035Y4Y1|
The output file should contain only specific columns from each data line and should be TAB delimited.
Specific Columns:
Quote:
3 4 5-7 10 11 12 13 15 16-19 20-24 25-26 27 28-32 33 36 37 40 55-58 59 60
61 62 63-66 68 69-72 73 74-75 76 77 78-79 80-86 87 88-94 95 96-99 100 101-103 105-107 109-110 112-123 125-128 130-131 133-135 137 111 124 132 136 187 Only.
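A column list like the one above can be turned into array indices once, before any data is read. The following is only a sketch: expand_spec is a hypothetical helper name, and it assumes the column numbers are 1-based (column 1 is the first field on the line), which the post does not state. The order of the list is kept, so 111, 124, 132, 136 and 187 come out last.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Column list copied from the post (without the trailing "Only.").
my $spec = q{3 4 5-7 10 11 12 13 15 16-19 20-24 25-26 27 28-32 33 36 37 40 55-58 59 60
61 62 63-66 68 69-72 73 74-75 76 77 78-79 80-86 87 88-94 95 96-99 100 101-103 105-107 109-110 112-123 125-128 130-131 133-135 137 111 124 132 136 187};

# expand_spec (hypothetical helper): expands ranges such as "5-7" into 5,6,7
# and converts the assumed 1-based column numbers to 0-based array indices.
sub expand_spec {
    my ($spec) = @_;
    my @idx;
    for my $tok (split ' ', $spec) {          # split ' ' also handles the newline
        if ($tok =~ /^(\d+)-(\d+)$/) {
            push @idx, map { $_ - 1 } $1 .. $2;
        }
        elsif ($tok =~ /^\d+$/) {
            push @idx, $tok - 1;
        }
        else {
            die "Bad column token: $tok\n";
        }
    }
    return @idx;
}

my @cols = expand_spec($spec);
print scalar(@cols), " columns selected\n";
```

Keeping the list as a string means it can be edited in one place if the required columns change.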
So I have started writing the Perl script:

Quote:
#!/usr/bin/perl
$file='fileA';
open(F,$file)|| die ("could not open file $file: $!");
@array = <F>;
close F;
open(OUT,'>','outfile');
print OUT @array[231..$#array-4];
close OUT;
I am using an array slice to eliminate the header and footer information. Please correct me if I am wrong.
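The slice @array[231..$#array-4] does drop the first 231 and the last 4 lines, but it needs the whole 1.2 million line file in memory first. A possible alternative, sketched below, reads line by line and holds back only as many lines as the footer is long. strip_header_footer is a hypothetical helper name; the file names and the counts 231 and 4 are taken from the post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# strip_header_footer (hypothetical helper): copies lines from $in to $out,
# dropping the first $head lines and the last $tail lines, while holding at
# most $tail + 1 lines in memory at any time.
sub strip_header_footer {
    my ($in, $out, $head, $tail) = @_;
    my @held;                       # most recent lines that might be footer
    my $seen = 0;
    while (my $line = <$in>) {
        next if ++$seen <= $head;   # skip the header lines
        push @held, $line;
        # Once more than $tail lines are held, the oldest cannot be footer.
        print {$out} shift @held if @held > $tail;
    }
    # Whatever is left in @held is the footer and is discarded.
    return;
}

# File names as in the post. The guard only keeps the sketch from dying
# when it is run somewhere fileA does not exist.
my ($file, $outfile) = ('fileA', 'outfile');
if (-e $file) {
    open(my $in,  '<', $file)    or die "could not open file $file: $!";
    open(my $out, '>', $outfile) or die "could not open file $outfile: $!";
    strip_header_footer($in, $out, 231, 4);
    close $out or die "could not close $outfile: $!";
    close $in;
}
```

The three-argument open with lexical filehandles and an error check on both files is a common idiom; the original script checks only the input file.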

Now, once I have loaded the file into an array, how do I select the columns listed above and join them with TAB as the delimiter in Perl?

Would it be easier to use hashes or arrays?

Could someone please help me out with this? I would really appreciate your thoughts.
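One way to do the column selection is split, an array slice by index, and join. This is a sketch under assumptions: select_columns is a hypothetical helper name, and the indices are 0-based, so column 3 of the post is index 2. Since the columns are addressed purely by position, a plain array is probably enough here and a hash would not add anything.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# select_columns (hypothetical helper): splits a pipe-delimited line, picks
# the fields at the given 0-based indices, and joins them with TABs.
sub select_columns {
    my ($line, @idx) = @_;
    chomp $line;
    # The pipe must be escaped because split takes a regex; the limit -1
    # keeps trailing empty fields instead of silently dropping them.
    my @fields = split /\|/, $line, -1;
    # Fields missing from a short line become empty strings, not undef.
    return join "\t", map { defined $fields[$_] ? $fields[$_] : '' } @idx;
}

# Columns 3, 4 and 5-7 from the post, written as 0-based indices.
my @idx = (2, 3, 4, 5, 6);

# First few fields of the sample data line from the post.
my $sample = "TT3069982 Corp|0|198|FSPIN|4.000000| | |FINE SPINNERS\n";
print select_columns($sample, @idx), "\n";
```

In a full script the helper would be called once per data line inside the read loop, with the result printed to the output filehandle.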
 
