Extracting formatted text and numbers


 
# 1  
Old 03-22-2010

Hello,
I have a file of text and numbers from which I want to extract certain fields and write them to a new file. I would use awk, but unfortunately the input data isn't always laid out in consistent columns. I am using tcsh.

For example, given the following data
Quote:
1964.01.17 121427.6 24.268s 177.074w 158.0km 4.8 Tonga 1
ep :aang= -18.5 90% : l=267.9,az= -18.5,pl=35.9
ellipse: a= 188.6 ellipsoid: l= 59.3,az= 158.6,pl=54.0
(90% ) : b= 25.2 semi-axis: l= 29.4,az=-109.5,pl= 1.4
1964.01.18 41437.6 30.000s 177.900w 48.0km 4.8 Tonga
I want to extract:
Quote:
1964.01.17, 24.268s, 177.074w, 158.0km, 4.8, aang = -18.5, a= 188.6, b= 25.2
and print it in one line of a new file as:
Quote:
1964.01.17 24.268 177.074 158.0 4.8 -18.5 188.6 25.2
and extract
Quote:
1964.01.18, 41437.6, 30.000s, 177.900w , 48.0km, 4.8
and print it as:
Quote:
1964.01.18 41437.6 30.000 177.900 48.0 4.8 0 0 0
Does anyone have any ideas as to what command I should use?
Thanks,
Dan
# 2  
Old 03-22-2010
Question

Do all of the records begin
1964.01
or something similar?
# 3  
Old 03-22-2010
Yes,
Every record that I want to extract to a new line in the output file starts with a date in the format YYYY.MM.DD. However, not every record I want is followed by the lines containing aang, a, and b.
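
A quick way to check that description against a real file is to count the header lines and the detail lines. This is only a sketch: `summarize_records` is a hypothetical helper name, and it assumes a header is any line starting with a YYYY.MM.DD date followed by a space, and that a record with detail lines always includes an `aang=` line.

```shell
# Hypothetical helper: reads records on stdin and prints two counts,
# the number of header lines and the number of aang= detail lines.
summarize_records() {
    awk '
    # Header lines start with a YYYY.MM.DD date followed by a space.
    /^[0-9][0-9][0-9][0-9]\.[0-9][0-9]\.[0-9][0-9] / { headers++ }
    # Assumption: each multi-line record has exactly one aang= line.
    /aang=/ { detailed++ }
    END { print headers + 0, detailed + 0 }
    '
}
```

Run as `summarize_records < data`. If the second count is lower than the first, some records have no detail lines and need the zero padding described in the question.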
# 4  
Old 03-22-2010
Hello, DFr0st:

Your sample data in a file called data:
Code:
$ cat data
1964.01.17 121427.6 24.268s 177.074w 158.0km 4.8 Tonga 1
ep :aang= -18.5 90% : l=267.9,az= -18.5,pl=35.9
ellipse: a= 188.6 ellipsoid: l= 59.3,az= 158.6,pl=54.0
(90% ) : b= 25.2 semi-axis: l= 29.4,az=-109.5,pl= 1.4
1964.01.18 41437.6 30.000s 177.900w 48.0km 4.8 Tonga

Passing it through a tr filter to remove all characters that are not a decimal digit, decimal point, minus sign, space, or newline yields:
Code:
$ tr -cd '[-0-9. \n]' < data
1964.01.17 121427.6 24.268 177.074 158.0 4.8  1
  -18.5 90  267.9 -18.535.9
  188.6   59.3 158.654.0
90    25.2 -  29.4-109.5 1.4
1964.01.18 41437.6 30.000 177.900 48.0 4.8
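
A portability note on the set passed to tr, offered as a sketch rather than a correction. In POSIX tr the square brackets are not part of range syntax, so in `'[-0-9. \n]'` the characters `[` and `]` are themselves members of the set and would be kept if they appeared in the data. Some tr implementations may also try to read the leading `[-0` as a range. Putting the hyphen last avoids both questions. `keep_numeric` is a hypothetical helper name.

```shell
# Hypothetical helper: keep only digits, decimal points, spaces,
# newlines, and minus signs. The hyphen is placed last in the set so
# that it is taken literally and cannot start a range.
keep_numeric() {
    tr -cd '0-9. \n-'
}
```

On the sample file, `keep_numeric < data` should give the same output as the command above, because the sample contains no square brackets.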

Assuming that your sample data is representative of the two forms of records you mentioned, and that you did not neglect to mention any special cases, a line with 6 numbers is a single-line record that only requires appending three zeros. A line with 7 numbers is the start of a multiline record and is followed by three lines of 4, 3, and 5 fields respectively (all field counts are after tr filtering). The following AWK assembles what remains into what's desired, before passing it through another tr filter to squeeze multiple spaces into a single space:
Code:
$ tr -cd '[-0-9. \n]' < data | awk 'NF==6{print $0,0,0,0} NF==7{$2=""; $NF=""; s=$0; getline; d=$1; getline; a=$1; getline; print s,d,a,$2}' | tr -s ' '
1964.01.17 24.268 177.074 158.0 4.8 -18.5 188.6 25.2
1964.01.18 41437.6 30.000 177.900 48.0 4.8 0 0 0
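
The field-count approach depends on the sample being representative. As an alternative sketch, not part of the original reply, the awk program below keys on the labels `aang=`, `a=`, and `b=` instead of counting fields after filtering. The names `extract_records`, `value`, and `emit` are hypothetical. It assumes header lines start with a YYYY.MM.DD date, that fields 3 to 5 carry lowercase unit suffixes, and it mirrors the sample output in the question, which drops the time field when detail lines are present and keeps it otherwise.

```shell
# Hypothetical helper: reads the raw records on stdin, prints one line per record.
extract_records() {
    awk '
    # Return the number that follows "label=" on the current line, or "".
    # The leading group keeps "a=" from matching inside a longer label.
    function value(label,    s) {
        if (match($0, "(^|[^a-z])" label "= *-?[0-9.]+")) {
            s = substr($0, RSTART, RLENGTH)
            sub(/^.*= */, "", s)
            return s
        }
        return ""
    }
    # Print the pending record, if any, then reset the state.
    function emit() {
        if (date == "") return
        if (aang == "" && a == "" && b == "")
            print date, time, rest, 0, 0, 0      # no detail lines: pad with zeros
        else                                     # detail lines: time is dropped, as in the question
            print date, rest, (aang == "" ? 0 : aang), (a == "" ? 0 : a), (b == "" ? 0 : b)
        date = aang = a = b = ""
    }
    # Header line: starts with a YYYY.MM.DD date.
    /^[0-9][0-9][0-9][0-9]\.[0-9][0-9]\.[0-9][0-9] / {
        emit()
        for (i = 3; i <= 5; i++) sub(/[a-z]+$/, "", $i)   # strip s, w, km
        date = $1; time = $2; rest = $3 " " $4 " " $5 " " $6
        next
    }
    # Detail line belonging to the pending record.
    date != "" {
        if ((v = value("aang")) != "") aang = v
        if ((v = value("a")) != "") a = v
        if ((v = value("b")) != "") b = v
    }
    END { emit() }
    '
}
```

Run as `extract_records < data > newfile`. The function syntax is POSIX sh rather than tcsh, so it would go in a script file run with sh. Because each value is looked up by its label, a record with only some of the detail lines still prints, with 0 for the missing values.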

Regards,
Alister

Last edited by alister; 03-22-2010 at 10:56 PM..