awk to clean up input file, printing both fields


# 1  
In the f1 file below I am trying to clean it up by removing lines that have _tn_ in them. Next, I remove the characters in $2 before the ninth /. Then I remove the ID_(digits, always 4)_ prefix. Finally, I remove the characters after and including the first _. It is currently doing most of it, but the cut is removing $1, and I'm sure there is a better way. Thank you.

f1
Code:
1112233  /xxxx/xxxx/xxxx/xxxx/yyy_yyyy_yy-yyyy-yyy-yyy_yyyy_yyyy_yyyy_yyyy_yyy_yyy_yyy_000_000/yyy/yyy/ID_1234_000000-Control_z_zzzz_zz_zz_zz_zz_zz_zzz_zz-zzzz-zzz-zzz_zzzz_zzzz_zzz_zzz_zzz_zzz_zzz.txt
1112231  /xxxx/xxxx/xxxx/xxxx/yyy_yyyy_yy-yyyy-yyy-yyy_yyyy_yyyy_yyyy_yyyy_yyy_yyy_yyy_000_000/yyy_tn_yyy/yyy/ID_1234_000000-Control_z_zzzz_zz_zz_zz_zz_zz_zzz_zz-zzzz-zzz-zzz_zzzz_zzzz_zzz_zzz_zzz_zzz_zzz.txt

current
Code:
000000-Control_z_zzzz_zz_zz_zz_zz_zz_zzz_zz-zzzz-zzz-zzz_zzzz_zzzz_zzz_zzz_zzz_zzz_zzz.txt

desired
Code:
1112231  000000-Control

Code:
sed '/_tn_/d' f1 | cut -d/ -f9 | awk '{ gsub(/ID_[0-9][0-9][0-9][0-9]_/, "", $2); print }' | cut -d_ -f1- > out

# 2  
You can include field #1 in cut
Code:
sed '/_tn_/d' f1 | cut -d/ -f1,9
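
To see what that prints and one way the cleanup could be finished, here is a rough sketch. It recreates the two sample lines from post #1 in a temporary file (the variable name f1 is only for illustration), and the last sed step is my own assumption about the remaining work, not something taken from the posts above.

```shell
# Recreate the sample input from post #1 in a temporary file (illustrative only).
f1=$(mktemp)
cat > "$f1" <<'EOF'
1112233  /xxxx/xxxx/xxxx/xxxx/yyy_yyyy_yy-yyyy-yyy-yyy_yyyy_yyyy_yyyy_yyyy_yyy_yyy_yyy_000_000/yyy/yyy/ID_1234_000000-Control_z_zzzz_zz_zz_zz_zz_zz_zzz_zz-zzzz-zzz-zzz_zzzz_zzzz_zzz_zzz_zzz_zzz_zzz.txt
1112231  /xxxx/xxxx/xxxx/xxxx/yyy_yyyy_yy-yyyy-yyy-yyy_yyyy_yyyy_yyyy_yyyy_yyy_yyy_yyy_000_000/yyy_tn_yyy/yyy/ID_1234_000000-Control_z_zzzz_zz_zz_zz_zz_zz_zzz_zz-zzzz-zzz-zzz_zzzz_zzzz_zzz_zzz_zzz_zzz_zzz.txt
EOF

# Drop the _tn_ lines, then keep fields 1 and 9 when splitting on "/".
# cut joins the kept fields with the delimiter, so a "/" remains between them.
step1=$(sed '/_tn_/d' "$f1" | cut -d/ -f1,9)

# Remaining cleanup (an assumption, not from the thread):
# remove "/ID_dddd_" and then everything from the first "_" to the end of the line.
result=$(printf '%s\n' "$step1" | sed 's#/ID_[0-9][0-9][0-9][0-9]_##; s/_.*$//')
printf '%s\n' "$result"

rm -f "$f1"
```

Note that the two blanks after the first field survive, because cut does not treat whitespace as a separator here.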

This User Gave Thanks to MadeInGermany For This Post:
# 3  
This example is wrong (thanks, Rudi). See two posts down for a revised version. I tried to use a regex to simplify the code, but it does not accomplish much.
Using the sample, this code
Code:
awk -F "[/ \-]" '{
               
                  printf("%s %s\n", $1, substr( $(15),1, index($(15),"_") -1 ) ) 
                  
               }' filename

Outputs:
Code:
1112233 Control
1112231 Control


Last edited by jim mcnamara; 09-20-2019 at 08:07 PM..
This User Gave Thanks to jim mcnamara For This Post:
# 4  
Please check your post #1 for specification errors.
Try
Code:
awk '/_tn_/ {next} gsub ("^.*/|_.*$|ID_...._", "", $2)' file

Output:
Code:
1112233 000000-Control
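
For anyone who finds the combined regex hard to read, the three alternatives can be written as separate substitutions. This is only a sketch of the same idea, not RudiC's code; the temporary file and the variable names f1 and out are made up for the example, and the data are the two sample lines from post #1.

```shell
# Recreate the sample input from post #1 in a temporary file (illustrative only).
f1=$(mktemp)
cat > "$f1" <<'EOF'
1112233  /xxxx/xxxx/xxxx/xxxx/yyy_yyyy_yy-yyyy-yyy-yyy_yyyy_yyyy_yyyy_yyyy_yyy_yyy_yyy_000_000/yyy/yyy/ID_1234_000000-Control_z_zzzz_zz_zz_zz_zz_zz_zzz_zz-zzzz-zzz-zzz_zzzz_zzzz_zzz_zzz_zzz_zzz_zzz.txt
1112231  /xxxx/xxxx/xxxx/xxxx/yyy_yyyy_yy-yyyy-yyy-yyy_yyyy_yyyy_yyyy_yyyy_yyy_yyy_yyy_000_000/yyy_tn_yyy/yyy/ID_1234_000000-Control_z_zzzz_zz_zz_zz_zz_zz_zzz_zz-zzzz-zzz-zzz_zzzz_zzzz_zzz_zzz_zzz_zzz_zzz.txt
EOF

out=$(awk '
/_tn_/ { next }                                 # skip lines containing _tn_
{
    sub(/^.*\//, "", $2)                        # strip everything up to the last "/"
    sub(/^ID_[0-9][0-9][0-9][0-9]_/, "", $2)    # strip the ID_dddd_ prefix
    sub(/_.*$/, "", $2)                         # strip from the first remaining "_" to the end
    print $1, $2                                # both fields, joined by a single blank
}' "$f1")
printf '%s\n' "$out"

rm -f "$f1"
```

The order matters: the path has to go first, otherwise the "_" rule would cut at the first underscore inside a directory name.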

This User Gave Thanks to RudiC For This Post:
# 5  
Corrected version
Code:
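awk -F "[/ -]" '{
                  # $14 is "ID_1234_000000"; strip the "ID_1234_" prefix
                  tmp=$(14)
                  gsub("[A-Z]{2}_[0-9]{4}_", "", tmp)
                  # $15 starts with "Control_"; keep the part before the first "_"
                  printf("%s %s-%s\n", $1, tmp, substr( $(15),1, index($(15),"_") -1 ) )
               }' filename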
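
As far as I can tell, this version prints both sample lines (it does not skip the _tn_ line), and the field numbers 14 and 15 depend on how many hyphens and slashes the directory names happen to contain. A possible variant, purely a sketch with made-up names (f1, out, part, name), takes the last "/"-separated piece of $2 with split() so the position no longer matters:

```shell
# Recreate the sample input from post #1 in a temporary file (illustrative only).
f1=$(mktemp)
cat > "$f1" <<'EOF'
1112233  /xxxx/xxxx/xxxx/xxxx/yyy_yyyy_yy-yyyy-yyy-yyy_yyyy_yyyy_yyyy_yyyy_yyy_yyy_yyy_000_000/yyy/yyy/ID_1234_000000-Control_z_zzzz_zz_zz_zz_zz_zz_zzz_zz-zzzz-zzz-zzz_zzzz_zzzz_zzz_zzz_zzz_zzz_zzz.txt
1112231  /xxxx/xxxx/xxxx/xxxx/yyy_yyyy_yy-yyyy-yyy-yyy_yyyy_yyyy_yyyy_yyyy_yyy_yyy_yyy_000_000/yyy_tn_yyy/yyy/ID_1234_000000-Control_z_zzzz_zz_zz_zz_zz_zz_zzz_zz-zzzz-zzz-zzz_zzzz_zzzz_zzz_zzz_zzz_zzz_zzz.txt
EOF

out=$(awk '
!/_tn_/ {                                        # only lines without _tn_
    n = split($2, part, "/")                     # part[n] is the file name
    name = part[n]
    sub(/^ID_[0-9][0-9][0-9][0-9]_/, "", name)   # drop the ID_dddd_ prefix
    sub(/_.*$/, "", name)                        # drop from the first "_" to the end
    print $1, name
}' "$f1")
printf '%s\n' "$out"

rm -f "$f1"
```

With the default field separator the path stays in $2 as one piece, which is what makes the split() on "/" safe here.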

This User Gave Thanks to jim mcnamara For This Post:
# 6  
Thank you all.