awk/sed for parsing file Post: 302262011

Post 302262011 by radoulov on Wednesday 26th of November 2008 07:16:27 AM

Quote:

Originally Posted by subin_bala

And can you please explain simply how the code works?

Yes.
It's an AWK script:

Code:

awk '...' var1=value [var2=value ... varn=value] inputfile(s)

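var=value assigns a value to the variable var, which is then accessible inside the AWK code.
Because the assignment is a command-line operand, it takes effect just before the following input file is read, so it is not yet visible in a BEGIN block.
So id=123 passes the desired id to the program.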
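
A minimal sketch of the var=value form, in case it helps. The sample input lines and the printed labels are invented for the example; only the id=123 operand comes from the post.

```shell
# id is set by the operand that follows the program, before the input is read.
out=$(printf 'first\nsecond\n' | awk '{ print id ": " $0 }' id=123)
printf '%s\n' "$out"

# While BEGIN runs the operand has not been processed yet, so id is still empty.
early=$(awk 'BEGIN { print "[" id "]" }' id=123)
printf '%s\n' "$early"
```

The first command prints each line prefixed with 123, the second prints empty brackets.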

Following the code logic we have:

1. Construct the logical record (r) by concatenating all the records seen so far:

If r is not empty (the test r ? in Boolean context), append a record separator (RS, newline by default) and the current record ($0): r RS $0. Otherwise (r is empty or 0, so this is the first line of a new logical record), assign the value of the current record: $0.
This is the meaning of the following expression:

Code:

{ r = r ? r RS $0 : $0 }
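
To see the accumulation on its own, here is a small sketch with three made-up input lines. The gsub call in END is only there to make the joins visible by turning each newline into a | character; it is not part of the original script.

```shell
# The first line is assigned as is; every later line is appended after RS.
out=$(printf 'one\ntwo\nthree\n' |
  awk '{ r = r ? r RS $0 : $0 } END { gsub(RS, "|", r); print r }')
printf '%s\n' "$out"
```

Note that no separator ends up before the first line or after the last one, which is the point of the conditional.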

2. Check if the current record matches the pattern "cm:" followed by the value of the variable id (see above): $0 ~ "cm:" id. The two strings are concatenated first and the result is used as a regular expression. If the test returns true, increment the value of the variable f (f for flag, marker): { f++ }.

Code:

$0 ~ "cm:" id { f++ }
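
A short sketch of this test in isolation, with three invented lines. Because the right-hand side is used as a regular expression and is not anchored, id=123 also matches a line containing cm:1234; whether that matters depends on the real data.

```shell
# Print the lines that set the flag, then the final value of f.
out=$(printf 'a cm:123\nb cm:456\nc cm:1234\n' |
  awk '$0 ~ "cm:" id { f++; print "match: " $0 } END { print "f = " f }' id=123)
printf '%s\n' "$out"
```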

3. If the current record does not match the pattern ^[\t ], that is, the line does not begin with a blank character (tab or space; these are your E, D etc. records), do the following:
- check whether the value of the variable f is true in Boolean context (it is neither an empty string nor numeric 0): if it's true (not 0, see 2. above), this logical record contains our id, so we print it: print r.
- reset both r and f to 0; they will be built up again from the following records if needed.

Code:

!/^[\t ]/ {
  if (f) print r
  r = f = 0
  }
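
The reset sets r to 0 rather than to an empty string, which still works because 0 is false in the r ? test of step 1. A small sketch of just that behaviour; the variable line is a hypothetical stand-in for $0.

```shell
out=$(awk 'BEGIN {
  r = f = 0                    # chained assignment: f gets 0, then r gets 0
  line = "next line"           # stands in for the next input record
  r = r ? r RS line : line     # r is 0 (false), so the old value is dropped
  print r
  if (!f) print "f is false"
}')
printf '%s\n' "$out"
```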

4. After reading the entire input, check if we still have something to print.
This is because of the build (r) -> set (f) -> check after (!/^[\t ]/) logic:
the accumulated logical record is printed only when a line that does not begin with a blank is reached. So without the END block we may miss the last one.

Code:

END {
  if (f) print r
  }
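
Putting the four pieces together in the order described above gives roughly the following. This is a sketch, not necessarily the exact script from the thread. The sample input is invented and assumes that an unindented line closes the group of indented lines before it; the real file may be laid out differently.

```shell
# Invented sample data: three groups, the last one without a closing line.
input=$(printf '%s\n' \
  ' cm:123 first detail' \
  ' more detail' \
  'E closes group one' \
  ' cm:456 other detail' \
  'D closes group two' \
  ' cm:123 last detail' \
  ' trailing detail')

out=$(printf '%s\n' "$input" | awk '
  { r = r ? r RS $0 : $0 }      # 1. build the logical record
  $0 ~ "cm:" id { f++ }         # 2. set the flag when the id is seen
  !/^[\t ]/ {                   # 3. unindented line: check, then reset
    if (f) print r
    r = f = 0
  }
  END { if (f) print r }        # 4. print what is left at end of input
' id=123)
printf '%s\n' "$out"
```

With this input the first and third groups are printed and the cm:456 group is skipped. The third group is printed by the END block, since no unindented line follows it.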

Hope this helps.

radoulov
