Parsing out data with multiple field separators


 
# 1  
Old 10-17-2017
Parsing out data with multiple field separators

I have a large file that I need to print certain sections out of.

file.txt
Code:
/alpha/beta/delta/gamma/425/590/USC00015420.blah.lt.0.01.str:USC00015420Y2017M10BLALT.01   12   13   14   -9    1   -9   -9   -9   -9   -9    1    2    3    4    5   -9   -9

I need to print the "USC00015420" and field 16 (in this case "5").

The code I was hoping would work was
Code:
 awk -F" " '{ printf substr($1,53,57)};{print ("   "$16)}' file.txt

But it prints all the content between the periods (".") as well.

I need the final output to be
Code:
USC00015420   5

# 2  
Old 10-17-2017
This seems to work:
Code:
awk '{ print substr($1,62,11) "   " $16}' file.txt

The third parameter of substr is the number of characters to return, not the end position.
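
To make the start/length distinction concrete, here is a minimal sketch using the sample line from post #1. The shell variable names (`sample`, `by_length`, `by_range`) and the awk variables `from` and `to` are illustrative assumptions and do not appear in the original posts.

```shell
# Sample record from post #1.
sample='/alpha/beta/delta/gamma/425/590/USC00015420.blah.lt.0.01.str:USC00015420Y2017M10BLALT.01   12   13   14   -9    1   -9   -9   -9   -9   -9    1    2    3    4    5   -9   -9'

# substr(s, start, length): 11 characters beginning at position 62.
by_length=$(printf '%s\n' "$sample" | awk '{ print substr($1, 62, 11) }')

# If you think in "from position 62 through 72", convert the end position to a length.
by_range=$(printf '%s\n' "$sample" |
  awk -v from=62 -v to=72 '{ print substr($1, from, to - from + 1) }')

printf '%s\n%s\n' "$by_length" "$by_range"
```

Both commands should print the same 11-character ID, since positions 62 through 72 span exactly 11 characters.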

Alternatives based on field separators:
Code:
awk '{ split($1,F,".*/|\\."); print F[2] "   " $16}' file.txt

or
Code:
awk '{ split($1,F,/.*\/|\./); print F[2] "   " $16}' file.txt

or
Code:
awk '{ s=$1; gsub(".*/|\\..*",x,s); print s "   " $16}' file.txt
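
To see why F[2] holds the ID in the split variants, it may help to dump every array element. This is only a sketch: the `first_field` and `pieces` variables and the `F[n]=<...>` output format are assumptions made for illustration. The separator regex first matches everything up to the last "/" (leaving an empty F[1]), and after that each "." acts as a separator.

```shell
# First whitespace-delimited field of the sample line from post #1.
first_field='/alpha/beta/delta/gamma/425/590/USC00015420.blah.lt.0.01.str:USC00015420Y2017M10BLALT.01'

# Split on "everything up to the last slash" or "a literal dot", then list the pieces.
pieces=$(printf '%s\n' "$first_field" | awk '{
  n = split($1, F, /.*\/|\./)
  for (i = 1; i <= n; i++)
    printf "F[%d]=<%s>\n", i, F[i]
}')

printf '%s\n' "$pieces"
```

On the sample this lists eight elements, with F[1] empty and F[2] set to USC00015420.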


Last edited by Scrutinizer; 10-17-2017 at 01:05 PM..
# 3  
Old 10-17-2017
Hello ncwxpanther,

Could you please try the following too and let me know if it helps.
Solution 1:
Code:
awk '{split($1,a,"/");sub(/\..*/,"",a[8]);print a[8]"\t"$16}'   Input_file

Solution 2:
Code:
awk -F'[/. ]' '{print $8"\t"$65}'   Input_file

Thanks,
R. Singh
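
A note on the field number 65 in the second solution of post #3: with -F'[/. ]' every single space is its own separator, so each run of padding spaces produces empty fields and the index depends on the exact column widths. The sketch below compares that with a variant that adds "+" to the bracket expression, which treats each run of separators as one. The "+" variant and the variable names (`sample`, `single`, `collapsed`) are assumptions added for illustration, not something posted in the thread.

```shell
# Sample record from post #1, with its original padding.
sample='/alpha/beta/delta/gamma/425/590/USC00015420.blah.lt.0.01.str:USC00015420Y2017M10BLALT.01   12   13   14   -9    1   -9   -9   -9   -9   -9    1    2    3    4    5   -9   -9'

# Each "/", "." or single space is a separator: padding creates empty fields.
single=$(printf '%s\n' "$sample" | awk -F'[/. ]' '{ print $8 "\t" $65 }')

# Runs of separators collapse into one: the index no longer depends on padding width.
collapsed=$(printf '%s\n' "$sample" | awk -F'[/. ]+' '{ print $8 "\t" $29 }')

printf '%s\n%s\n' "$single" "$collapsed"
```

Both forms print the ID and the value 5 for this sample. The collapsed form still assumes the same number of "/" and "." characters on every line.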
# 4  
Old 10-17-2017
Quote:
Originally Posted by ncwxpanther
I have a large file that I need to print certain sections out of.
The basic problem is that you need to define what a "section" is. The line you presented uses several different separators to delimit what I suppose you mean by a "section": spaces (or maybe tabs), dots, slashes and colons.

Will these four (five) characters always be delimiters?

Will the lines always have the same structure? (like 'First 7 parts separated by "/", the seventh part consists of 6 parts separated by ".", then a colon, then ....')

If you could answer these questions, we could perhaps provide more robust solutions. Without this information we can never be sure we have really solved the problem.

I hope this helps.

bakunin
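
One way to answer the structure question from post #4 empirically is to tally how many lines share each "shape". The following is a rough sketch, not a definitive check: the helper name `summarize_shapes` is hypothetical, and describing a shape by the number of "/"-separated path parts plus the number of whitespace-separated fields is an assumption about what matters here.

```shell
# Hypothetical helper: count lines per (path depth, field count) combination.
summarize_shapes() {
  awk '{
    n = split($1, parts, "/")            # the leading "/" yields an empty parts[1]
    shape = (n - 1) " path parts, " NF " fields"
    count[shape]++
  }
  END { for (s in count) print count[s], "line(s):", s }' "$@"
}

# Demonstration on the sample record from post #1.
sample='/alpha/beta/delta/gamma/425/590/USC00015420.blah.lt.0.01.str:USC00015420Y2017M10BLALT.01   12   13   14   -9    1   -9   -9   -9   -9   -9    1    2    3    4    5   -9   -9'
shapes=$(printf '%s\n' "$sample" | summarize_shapes)
printf '%s\n' "$shapes"
```

Run against the real data with `summarize_shapes file.txt`. More than one output line would suggest that the structure varies between records and that fixed positions or field numbers are risky.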
# 5  
Old 10-17-2017
Quote:
Originally Posted by Scrutinizer
This seems to work:
Code:
awk '{ print substr($1,62,11) "   " $16}' file.txt

The third parameter of substr is the number of characters to return, not the end position.

Thanks!
I was able to get this to work as expected with:
Code:
awk '{ print substr($1,53,11) "   " $16}' file.txt


Last edited by RudiC; 10-17-2017 at 03:10 PM.. Reason: Corrected the QUOTE tags
# 6  
Old 10-17-2017
Certainly NOT with the sample you gave in post #1:
Quote:
Originally Posted by ncwxpanther
.
.
.
Code:
/alpha/beta/delta/gamma/425/590/USC00015420.blah.lt.0.01.str:USC00015420Y2017M10BLALT.01   12   13   14   -9    1   -9   -9   -9   -9   -9    1    2    3    4    5   -9   -9

.
.
.
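
For reference, this sketch shows what the command from post #5 returns on the posted sample, next to a variant that does not depend on a fixed offset. The variant assumes the wanted ID is the 11 characters right after the first colon in the first field. That assumption, and the variable names (`sample`, `fixed_offset`, `after_colon`), are illustrative and not confirmed by the original poster.

```shell
# Sample record from post #1.
sample='/alpha/beta/delta/gamma/425/590/USC00015420.blah.lt.0.01.str:USC00015420Y2017M10BLALT.01   12   13   14   -9    1   -9   -9   -9   -9   -9    1    2    3    4    5   -9   -9'

# The offset from post #5: position 53 falls inside ".0.01.str" on this sample.
fixed_offset=$(printf '%s\n' "$sample" | awk '{ print substr($1, 53, 11) "   " $16 }')

# Assumed alternative: locate the first colon instead of hard-coding a position.
after_colon=$(printf '%s\n' "$sample" |
  awk '{ print substr($1, index($1, ":") + 1, 11) "   " $16 }')

printf '%s\n%s\n' "$fixed_offset" "$after_colon"
```

On the sample the fixed offset yields `0.01.str:US   5`, which suggests the real paths differ in length from the posted one, while the colon-based form yields `USC00015420   5`.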