To extract certain columns with header
Post 302989349 by RudiC on Wednesday 11th of January 2017 09:04:50 AM
The data seem to come from an (old?) FORTRAN print file, in which the first character of each line is a carriage control character that is interpreted by the printer and suppressed at printout. In principle, these characters should be eliminated in processing as well.
Picking lines by your criterion "single space" would not suppress the dash lines, so I use the "1" in the first column to synchronize to the page breaks and then suppress the following 5 lines on the first page and 7 lines on later pages, where the repeated column header and dash line are skipped as well. You can select any number of columns by adding their headers to the COLUMNS variable, separated by commas. Try
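As a minimal sketch of what "eliminating" the carriage control column could look like, the snippet below just cuts away the first character of every line. The function name strip_cc and the two sample lines are made up for illustration. Note that the script in this post relies on the leading "1" to find the page breaks, so stripping like this would have to happen after that detection, not before.

```shell
# strip_cc: hypothetical helper that removes the first character
# (the carriage control column) from every input line.
strip_cc() {
    cut -c2- "$@"
}

# Made-up sample: "1" marks a new page, a blank means normal single spacing.
printf '%s\n' '1PAGE TITLE' ' first data line' | strip_cc
```

With no arguments the function reads standard input; given file names, it passes them on to cut.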
Code:
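awk '
/^1/            {SKIP = NR + 5 + HDFND          # page break: resume 5 lines later (7 once the header has been seen)
                }
NR < SKIP       {next
                }
!HDFND          {MX = split (COLUMNS, HD, ",")  # first header line: record start and width of each wanted column
                 for (i=1; i<=MX; i++)  {match ($0, HD[i] " *")
                                         P[i] = RSTART
                                         L[i] = RLENGTH
                                        }
                 HDFND = 2                      # on later pages also skip the header and dash lines
                }
                {for (i=1; i<=MX; i++)  printf "%s ", substr ($0, P[i], L[i])
                 printf "\n"
                }
' COLUMNS=" ORD NO, P A R T  N U M B E R,INV NO / SER NO" file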
 ORD NO     P A R T  N U M B E R     INV NO / SER NO  
---------  ------------------------  ---------------  
490117701  PEF0AM1MX2MX40MM          8916             
490118901  3M7447                    SM0883           
490126001  SAFETYC0NE30IN            1412008          
490105304  C0TT0NRAG                 13264            
490121901  ARINUS0940                D01100477        
421751301  8W550C3                   1557             
421755201  BR127                     3027490          
421752701  SIKKENSN0NSLIPGREY        15668            
490127301  WHEELCAST0R8X2SWIVEL      111931           
490127302  SELFTAPINGSCREW112IN      111931           
490126701  BRUSHBAMB002INSX6STICK    111896           
421766901  EP0CAST1619AB             3029548          
421766902  PR1422B2                  3029548          
421750101  DUNL0PLP                  D01341MY1        
421761201  SKDS2                     18036089         
421756401  ASG223KIL0                30854            
421724801  PS870A2                   QA213552014
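
The column selection above hinges on match() setting RSTART and RLENGTH for each header plus its trailing blanks. A stripped-down sketch of just that step, with a made-up two-line sample and a single header (the names sample, extract_name and HDR are assumptions for this illustration, not part of the original script):

```shell
# Hypothetical sample: one header line followed by one data line.
sample() {
    printf '%s\n' \
        'ID    NAME        QTY' \
        '17    WIDGET      3'
}

# Locate the header "NAME" plus its trailing blanks on the first line,
# then cut the same character range out of every line.
extract_name() {
    awk -v HDR="NAME" '
    NR == 1 {match($0, HDR " *")       # RSTART = start, RLENGTH = width of the column
             P = RSTART; L = RLENGTH
            }
            {print substr($0, P, L)}
    '
}

sample | extract_name
```

This only works for fixed-width data: a value that is wider than its header plus the blanks after it gets truncated, and the header text is used as a regular expression, so special characters in it would need escaping.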

 
