awk to remove lines that do not start with digit and combine line or lines Post: 303000559


Post 303000559 by cmccabe on Friday 14th of July 2017 08:51:07 AM

07-14-2017


Thank you both very much. @Scrutinizer, would you mind adding a brief description of how the awk works? Thank you :)


---------- Post updated at 07:51 AM ---------- Previous update was at 07:23 AM ----------

Can the number in $1 be restricted to 5 digits? That is, if a line starts with a random number that is only 3 digits long, that line is removed.

Code:

all whitespace and symbols in $1 are removed
line 3 is removed because the number at its start is fewer than 5 digits long

file

Code:

      81262        direct probe methodology (eg, Southern blot)
      81263    IGH@ (Immunoglobulin heavy chain locus) (eg, leukemia and lymphoma, B-cell), variable region somatic
               mutation analysis
714       l   New Code     s Revised Code       +   Add-On Code        Ꮬ Modifier -51 Exempt                  H    Telemedicine
                                                              CPT codes and descriptions only ©2016 American Medical Association. All rights reserved.
                                                                                                            PATHOLOGY/ LABORATORY

desired output

Code:

81262        direct probe methodology (eg, Southern blot)
81263    IGH@ (Immunoglobulin heavy chain locus) (eg, leukemia and lymphoma, B-cell), variable region somatic
               mutation analysis

awk for length maybe:

Code:

awk 'NR>5{print A[NR%3]} {A[NR%3]=$0}' file | awk '{if(length($1) < 5 ) && $1!~/[^0-9]/ && { gsub(/[^[:alnum:]]/, "", $1);{if(p) print p; p=$0; next} {p=p OFS $0} END{print p}'
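One way the length check could be written in a single awk is sketched below. It is not Scrutinizer's code, and it makes some assumptions: a record starts only when $1, with symbols stripped, is exactly 5 digits; a line starting with any other number is dropped together with the lines that follow it; and continuation lines are joined to their record with a single space. The temporary sample file and the variable names (key, rec, keep) are made up for illustration.

```shell
#!/bin/sh
# Sample input modelled on the post (shortened); the temp file is only for this demo.
sample=$(mktemp)
cat > "$sample" <<'EOF'
      81262        direct probe methodology (eg, Southern blot)
      81263    IGH@ (Immunoglobulin heavy chain locus), variable region somatic
               mutation analysis
714       l   New Code     s Revised Code
                    CPT codes and descriptions only
                                  PATHOLOGY/ LABORATORY
EOF

out=$(awk '
# key is a copy of $1 with everything except letters and digits removed
{ key = $1; gsub(/[^A-Za-z0-9]/, "", key) }

# exactly 5 digits: print the previous record and start a new one
length(key) == 5 && key !~ /[^0-9]/ {
    if (rec != "") print rec
    sub(/^[ \t]*[^ \t]+/, key)   # leading blanks + first field -> cleaned key
    rec = $0
    keep = 1
    next
}

# starts with a digit but is not a 5-digit code: print what we have,
# then ignore this line and the continuation lines after it
key ~ /^[0-9]/ {
    if (rec != "") print rec
    rec = ""
    keep = 0
    next
}

# non-empty continuation line of a kept record: append it
keep && NF {
    sub(/^[ \t]+/, "")
    rec = rec " " $0
}

END { if (rec != "") print rec }
' "$sample")

printf '%s\n' "$out"
rm -f "$sample"
```

With the sample above this should print the 81262 line and the 81263 line (with "mutation analysis" appended), and nothing from the 714 line onward. If the continuation lines should stay on their own lines instead, the append step would need to use "\n" rather than a space.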

Thank you

Last edited by cmccabe; 07-15-2017 at 11:03 AM..



