Delimit file based on character length using awk Post: 302967384


Post 302967384 by RavinderSingh13 on Tuesday 23rd of February 2016 10:12:30 AM

02-23-2016

Moderator

Hello Prathmesh,

Could you please go through the following and let me know if it helps you.

Code:

  
# FS="," is assigned just before the fields file is read, so that file is split on commas.
# OFS="|" is assigned just before main_file is read, so the output pieces are joined with |.
# FS stays "," for main_file as well, which is harmless here because only $0 is used for that file.
awk '
FNR == NR {              # TRUE only while the first file (fields) is being read: FNR is reset for each file, NR (number of records) keeps increasing until the last file is read.
    A[++i] = $1          # Array A, indexed by the incremented counter i, stores the start position ($1, the first field).
    B[i] = $2            # Array B uses the same index (i is not incremented again) and stores the length ($2, the second field).
    next                 # Skip all remaining actions for lines of the fields file.
}
{
    for (j = 1; j <= i; j++) {           # Loop over every start/length pair; i has its final value once the fields file is completely read.
        if (B[j]) {                      # A length is given for this pair.
            P = substr($0, A[j], B[j])   # substr(line, start, length)
        } else {                         # No length, as in the 3rd line of the fields file: take everything from the start position to the end of the line.
            P = substr($0, A[j])
        }
        C = (j > 1) ? C OFS P : P        # Append the piece to C, separated by OFS. Testing j instead of C keeps the separator even when the first piece is empty.
    }
    print C              # Print the delimited line.
    C = ""               # Reset C for the next line.
}' FS="," fields OFS="|" main_file
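
To see the approach on concrete data, here is a minimal sketch. The contents of fields and main_file below are hypothetical (the thread's real files are not shown in this post), and the names tmpdir, start, len, piece and line are chosen only for this illustration. Each line of fields is assumed to hold a start position and a length separated by a comma, with an empty length meaning "up to the end of the line".

```shell
# Hypothetical sample data, written to a temporary directory.
tmpdir=$(mktemp -d)
printf '1,3\n4,5\n9,\n' > "$tmpdir/fields"                     # start,length ; empty length = rest of line
printf 'ABCDEFGHIJKLMNOP\n0123456789\n' > "$tmpdir/main_file"  # fixed-width input lines

out=$(awk '
FNR == NR { start[++n] = $1; len[n] = $2; next }   # first file: remember each start/length pair
{
    line = ""
    for (j = 1; j <= n; j++) {
        # three-argument substr when a length is given, two-argument form otherwise
        piece = len[j] ? substr($0, start[j], len[j]) : substr($0, start[j])
        line = (j > 1) ? line OFS piece : piece
    }
    print line
}' FS="," "$tmpdir/fields" OFS="|" "$tmpdir/main_file")

printf '%s\n' "$out"
rm -rf "$tmpdir"
```

With this sample data the two input lines come out as `ABC|DEFGH|IJKLMNOP` and `012|34567|89`. Note that the third argument of substr is a length, not an end position, so the pair 4,5 selects characters 4 through 8.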

Thanks,
R. Singh



