Search and Replace by record position Post: 302569846


binary(3erl)						     Erlang Module Definition						      binary(3erl)

NAME

       binary - Library for handling binary data

DESCRIPTION

       This  module contains functions for manipulating byte-oriented binaries. Although the majority of functions could be implemented using bit-
       syntax, the functions in this library are highly optimized and are expected to either execute faster or consume less memory (or both)  than
       a counterpart written in pure Erlang.

       The module is implemented according to the EEP (Erlang Enhancement Proposal) 31.

   Note:
       The library handles byte-oriented data. Bitstrings that are not binaries (that is, do not contain a whole number of octets) will result in
       a badarg exception being thrown from any of the functions in this module.

DATA TYPES

	   cp()
	    - Opaque data-type representing a compiled search-pattern. Guaranteed to be a tuple()
	      to allow programs to distinguish it from non precompiled search patterns.

	   part() = {Start,Length}
	   Start = int()
	   Length = int()
	     - A representation of a part (or range) in a binary. Start is a
	       zero-based offset into a binary() and Length is the length of
	       that part. As input to functions in this module, a reverse
	       part specification is allowed, constructed with a negative
	       Length, so that the part of the binary begins at Start +
	       Length and is -Length long. This is useful for referencing the
	       last N bytes of a binary as {size(Binary), -N}. The functions
	       in this module always return part()'s with positive Length.

EXPORTS

       at(Subject, Pos) -> int()

	      Types  Subject = binary()
		     Pos = int() >= 0

	      Returns the byte at position Pos (zero-based) in the binary Subject as an integer. If Pos >= byte_size(Subject) , a badarg exception
	      is raised.

       bin_to_list(Subject) -> list()

	      Types  Subject = binary()

	      The same as bin_to_list(Subject,{0,byte_size(Subject)}) .

       bin_to_list(Subject, PosLen) -> list()

	      Types  Subject = binary()
		     PosLen = part()

	      Converts	Subject  to  a	list of int() s, each representing the value of one byte. The part() denotes which part of the binary() to
	      convert. Example:

	      1> binary:bin_to_list(<<"erlang">>,{1,3}).
	      "rla"
	      %% or [114,108,97] in list notation.

	      If PosLen in any way references outside the binary, a badarg exception is raised.

       bin_to_list(Subject, Pos, Len) -> list()

	      Types  Subject = binary()
		     Pos = int()
		     Len = int()

	      The same as bin_to_list(Subject,{Pos,Len}) .

       compile_pattern(Pattern) -> cp()

	      Types  Pattern = binary() | [ binary() ]

	      Builds an internal structure representing a compilation of a search pattern, later to be used in the match/3, matches/3, split/3
	      or replace/4 functions. The cp() returned is guaranteed to be a tuple() to allow programs to distinguish it from non-precompiled
	      search patterns.

	      When a list of binaries is given, it denotes a set of alternative binaries to search for. That is, if
	      [<<"functional">>,<<"programming">>] is given as Pattern, this means "either <<"functional">> or <<"programming">>". The pattern
	      is a set of alternatives; when only a single binary is given, the set has only one element. The order of alternatives in a pattern
	      is not significant.

	      The list of binaries used for search alternatives shall be flat and proper.

	      If Pattern is not a binary or a flat proper list of binaries with length > 0, a badarg exception will be raised.
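
	      As a minimal sketch of reusing a compiled pattern, the module below compiles the two alternatives once and applies the result
	      to several subjects with matches/2. The module name pattern_demo and the function count_keywords/1 are hypothetical and are
	      not part of stdlib.

```erlang
-module(pattern_demo).
-export([count_keywords/1]).

%% Hypothetical helper: for each binary in the list, count how many times
%% either keyword occurs. The set of alternatives is compiled once and the
%% resulting cp() is reused for every call to binary:matches/2.
count_keywords(Binaries) ->
    CP = binary:compile_pattern([<<"functional">>, <<"programming">>]),
    [length(binary:matches(Bin, CP)) || Bin <- Binaries].
```

	      Precompiling is mainly of interest when the same pattern is applied to many subjects; for a single search, passing the
	      binaries directly is equivalent.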

       copy(Subject) -> binary()

	      Types  Subject = binary()

	      The same as copy(Subject, 1) .

       copy(Subject,N) -> binary()

	      Types  Subject = binary()
		     N = int() >= 0

	      Creates a binary with the content of Subject duplicated N times.

	      This function will always create a new binary, even if N = 1 . By using copy/1 on a binary referencing a larger  binary,	one  might
	      free up the larger binary for garbage collection.

   Note:
       By  deliberately copying a single binary to avoid referencing a larger binary, one might, instead of freeing up the larger binary for later
       garbage collection, create much more binary data than needed. Sharing binary data is usually good. Only in special cases, when small  parts
       reference large binaries and the large binaries are no longer used in any process, deliberate copying might be a good idea.

       If N < 0 , a badarg exception is raised.

       decode_unsigned(Subject) -> Unsigned

	      Types  Subject = binary()
		     Unsigned = int() >= 0

	      The same as decode_unsigned(Subject,big) .

       decode_unsigned(Subject, Endianess) -> Unsigned

	      Types  Subject = binary()
		     Endianess = big | little
		     Unsigned = int() >= 0

	      Converts the binary digit representation, in big or little endian, of a positive integer in Subject to an Erlang int() .

	      Example:

	      1> binary:decode_unsigned(<<169,138,199>>,big).
	      11111111

       encode_unsigned(Unsigned) -> binary()

	      Types  Unsigned = int() >= 0

	      The same as encode_unsigned(Unsigned,big) .

       encode_unsigned(Unsigned,Endianess) -> binary()

	      Types  Unsigned = int() >= 0
		     Endianess = big | little

	      Converts a positive integer to the smallest possible representation in a binary digit representation, either big or little endian.

	      Example:

	      1> binary:encode_unsigned(11111111,big).
	      <<169,138,199>>
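
	      The two functions can be combined to change byte order. The sketch below decodes a big-endian value and re-encodes it as
	      little-endian; the module name endian_demo and the function swap_endianness/1 are hypothetical.

```erlang
-module(endian_demo).
-export([swap_endianness/1]).

%% Hypothetical helper: interpret BigEndianBin as a big-endian unsigned
%% integer and re-encode that integer in little-endian byte order.
swap_endianness(BigEndianBin) ->
    N = binary:decode_unsigned(BigEndianBin, big),
    binary:encode_unsigned(N, little).
```

	      Because encode_unsigned/2 produces the smallest possible representation, leading zero bytes in the input are not preserved,
	      so the result can be shorter than the input.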

       first(Subject) -> int()

	      Types  Subject = binary()

	      Returns the first byte of the binary Subject as an integer. If the size of Subject is zero, a badarg exception is raised.

       last(Subject) -> int()

	      Types  Subject = binary()

	      Returns the last byte of the binary Subject as an integer. If the size of Subject is zero, a badarg exception is raised.

       list_to_bin(ByteList) -> binary()

	      Types  ByteList = iodata() (see module erlang)

	      Works exactly as erlang:list_to_binary/1 , added for completeness.

       longest_common_prefix(Binaries) -> int()

	      Types  Binaries = [ binary() ]

	      Returns the length of the longest common prefix of the binaries in the list Binaries . Example:

	      1> binary:longest_common_prefix([<<"erlang">>,<<"ergonomy">>]).
	      2
	      2> binary:longest_common_prefix([<<"erlang">>,<<"perl">>]).
	      0

	      If Binaries is not a flat list of binaries, a badarg exception is raised.

       longest_common_suffix(Binaries) -> int()

	      Types  Binaries = [ binary() ]

	      Returns the length of the longest common suffix of the binaries in the list Binaries . Example:

	      1> binary:longest_common_suffix([<<"erlang">>,<<"fang">>]).
	      3
	      2> binary:longest_common_suffix([<<"erlang">>,<<"perl">>]).
	      0

	      If Binaries is not a flat list of binaries, a badarg exception is raised.

       match(Subject, Pattern) -> Found | nomatch

	      Types  Subject = binary()
		     Pattern = binary() | [ binary() ] | cp()
		     Found = part()

	      The same as match(Subject, Pattern, []) .

       match(Subject,Pattern,Options) -> Found | nomatch

	      Types  Subject = binary()
		     Pattern = binary() | [ binary() ] | cp()
		     Found = part()
		     Options = [ Option ]
		     Option = {scope, part()}

	      Searches for the first occurrence of Pattern in Subject and returns the position and length.

	      The function will return {Pos,Length} for the binary in Pattern starting at the lowest position in Subject. Example:

	      1> binary:match(<<"abcde">>, [<<"bcde">>,<<"cd">>],[]).
	      {1,4}

	      Even  though  <<"cd">> ends before <<"bcde">> , <<"bcde">> begins first and is therefore the first match. If two overlapping matches
	      begin at the same position, the longest is returned.

	      Summary of the options:

		{scope, {Start, Length}} :
		  Only the given part is searched. Return values still have offsets from the beginning of Subject. A negative Length is allowed
		  as described in the DATA TYPES section of this manual.

	      If none of the strings in Pattern is found, the atom nomatch is returned.

	      For a description of Pattern , see compile_pattern/1 .

	      If  {scope, {Start,Length}} is given in the options such that Start is larger than the size of Subject , Start + Length is less than
	      zero or Start + Length is larger than the size of Subject , a badarg exception is raised.
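
	      One possible use of the scope option is to search only from a given offset onwards. The sketch below assumes a hypothetical
	      module scope_demo with a function match_from/3; the returned position is still relative to the beginning of Subject.

```erlang
-module(scope_demo).
-export([match_from/3]).

%% Hypothetical helper: find the first occurrence of Pattern that lies
%% entirely at or after the zero-based byte offset From.
%% The guard keeps the scope inside Subject so that no badarg is raised
%% because of the scope itself.
match_from(Subject, Pattern, From)
  when is_integer(From), From >= 0, From =< byte_size(Subject) ->
    Scope = {From, byte_size(Subject) - From},
    binary:match(Subject, Pattern, [{scope, Scope}]).
```

	      For example, match_from(<<"abcabc">>, <<"abc">>, 1) skips the occurrence at offset 0 and reports the one at offset 3.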

       matches(Subject, Pattern) -> Found

	      Types  Subject = binary()
		     Pattern = binary() | [ binary() ] | cp()
		     Found = [ part() ] | []

	      The same as matches(Subject, Pattern, []) .

       matches(Subject,Pattern,Options) -> Found

	      Types  Subject = binary()
		     Pattern = binary() | [ binary() ] | cp()
		     Found = [ part() ] | []
		     Options = [ Option ]
		     Option = {scope, part()}

	      Works like match, but the Subject is searched until exhausted and a list of all non-overlapping parts matching Pattern  is  returned
	      (in order).

	      The first and longest match is preferred to a shorter one, as the following example illustrates:

	      1> binary:matches(<<"abcde">>,
				[<<"bcde">>,<<"bc">>,<<"de">>],[]).
	      [{1,4}]

	      The result shows that <<"bcde">> is selected instead of the shorter match <<"bc">> (which would have given rise to one more
	      match, <<"de">>). This corresponds to the behavior of POSIX regular expressions (and programs like awk), but is not consistent with
	      alternative matches in re (and Perl), where instead lexical ordering in the search pattern selects which string matches.

	      If none of the strings in pattern is found, an empty list is returned.

	      For a description of Pattern , see compile_pattern/1 and for a description of available options, see match/3 .

	      If  {scope, {Start,Length}} is given in the options such that Start is larger than the size of Subject , Start + Length is less than
	      zero or Start + Length is larger than the size of Subject , a badarg exception is raised.
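
	      Since every element of the result is a part(), the start offsets can be picked out with a list comprehension. A minimal
	      sketch, using a hypothetical module matches_demo and function offsets/2:

```erlang
-module(matches_demo).
-export([offsets/2]).

%% Hypothetical helper: return the zero-based start offset of every
%% non-overlapping occurrence of Pattern in Subject, in order.
offsets(Subject, Pattern) ->
    [Pos || {Pos, _Len} <- binary:matches(Subject, Pattern)].
```

	      An empty list is returned when nothing matches, mirroring the behavior of matches/2 itself.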

       part(Subject, PosLen) -> binary()

	      Types  Subject = binary()
		     PosLen = part()

	      Extracts the part of the binary Subject described by PosLen .

	      Negative length can be used to extract bytes at the end of a binary:

	      1> Bin = <<1,2,3,4,5,6,7,8,9,10>>.
	      2> binary:part(Bin,{byte_size(Bin), -5}).
	      <<6,7,8,9,10>>

   Note:
       part/2 and part/3 are also available in the erlang module under the names binary_part/2 and binary_part/3 . Those BIFs are allowed in guard
       tests.

       If PosLen in any way references outside the binary, a badarg exception is raised.

       part(Subject, Pos, Len) -> binary()

	      Types  Subject = binary()
		     Pos = int()
		     Len = int()

	      The same as part(Subject, {Pos, Len}) .

       referenced_byte_size(binary()) -> int()

	      If a binary references a larger binary (often described as being a sub-binary), it can be useful to get the size of the actual
	      referenced binary. This function can be used in a program to trigger the use of copy/1. By copying a binary, one might drop the
	      reference to the original, possibly large, binary that the smaller binary refers to.

	      Example:

	      store(Binary, GBSet) ->
		NewBin =
		    case binary:referenced_byte_size(Binary) of
			Large when Large > 2 * byte_size(Binary) ->
			   binary:copy(Binary);
			_ ->
			   Binary
		    end,
		gb_sets:insert(NewBin,GBSet).

	      In this example, we chose to copy the binary content before inserting it in the gb_set() if it references a binary more than twice
	      the size of the data we're going to keep. Of course, different programs will need different rules for when to copy.

	      Binary sharing occurs whenever binaries are taken apart. This is the fundamental reason why binaries are fast: decomposition can
	      always be done with O(1) complexity. In rare circumstances, however, this data sharing is undesirable, which is why this function
	      together with copy/1 might be useful when optimizing for memory use.

	      Example of binary sharing:

	      1> A = binary:copy(<<1>>,100).
	      <<1,1,1,1,1 ...
	      2> byte_size(A).
	      100
	      3> binary:referenced_byte_size(A).
	      100
	      4> <<_:10/binary,B:10/binary,_/binary>> = A.
	      <<1,1,1,1,1 ...
	      5> byte_size(B).
	      10
	      6> binary:referenced_byte_size(B).
	      100

   Note:
       Binary data is shared among processes. If another process still references the larger binary, copying the part this process uses only
       consumes more memory and will not free up the larger binary for garbage collection. Use intrusive functions of this kind with extreme
       care, and only if a real problem is detected.

       replace(Subject,Pattern,Replacement) -> Result

	      Types  Subject = binary()
		     Pattern = binary() | [ binary() ] | cp()
		     Replacement = binary()
		     Result = binary()

	      The same as replace(Subject,Pattern,Replacement,[]) .

       replace(Subject,Pattern,Replacement,Options) -> Result

	      Types  Subject = binary()
		     Pattern = binary() | [ binary() ] | cp()
		     Replacement = binary()
		     Result = binary()
		     Options = [ Option ]
		     Option = global | {scope, part()} | {insert_replaced, InsPos}
		     InsPos = OnePos | [ OnePos ]
		     OnePos = int() =< byte_size(Replacement)

	      Constructs a new binary by replacing the parts in Subject matching Pattern with the content of Replacement .

	      If the matching sub-part of Subject giving rise to the replacement is to be inserted in the result, the option {insert_replaced,
	      InsPos} will insert the matching part into Replacement at the given position (or positions) before actually inserting Replacement
	      into the Subject. Example:

	      1> binary:replace(<<"abcde">>,<<"b">>,<<"[]">>,[{insert_replaced,1}]).
	      <<"a[b]cde">>
	      2> binary:replace(<<"abcde">>,[<<"b">>,<<"d">>],<<"[]">>,
			       [global,{insert_replaced,1}]).
	      <<"a[b]c[d]e">>
	      3> binary:replace(<<"abcde">>,[<<"b">>,<<"d">>],<<"[]">>,
			       [global,{insert_replaced,[1,1]}]).
	      <<"a[bb]c[dd]e">>
	      4> binary:replace(<<"abcde">>,[<<"b">>,<<"d">>],<<"[-]">>,
			       [global,{insert_replaced,[1,2]}]).
	      <<"a[b-b]c[d-d]e">>

	      If any position given in InsPos is greater than the size of the replacement binary, a badarg exception is raised.

	      The options global and {scope, part()} work as for split/3 . The return type is always a binary() .

	      For a description of Pattern , see compile_pattern/1 .
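
	      The scope option can restrict replacement to a fixed-position field while leaving the rest of the record untouched. The
	      following is a sketch only; the module replace_demo and the function replace_in_field/5 are hypothetical names.

```erlang
-module(replace_demo).
-export([replace_in_field/5]).

%% Hypothetical helper: replace every occurrence of Pattern with
%% Replacement, but only where the match lies inside the field that
%% starts at zero-based offset Start and is Len bytes long.
%% Bytes outside the scope are carried over to the result unchanged.
replace_in_field(Subject, Pattern, Replacement, Start, Len) ->
    binary:replace(Subject, Pattern, Replacement,
                   [global, {scope, {Start, Len}}]).
```

	      With the subject <<"banana">> and the scope {2,3}, only the <<"a">> at offset 3 falls inside the field, so the occurrences at
	      offsets 1 and 5 are left as they are.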

       split(Subject,Pattern) -> Parts

	      Types  Subject = binary()
		     Pattern = binary() | [ binary() ] | cp()
		     Parts = [ binary() ]

	      The same as split(Subject, Pattern, []) .

       split(Subject,Pattern,Options) -> Parts

	      Types  Subject = binary()
		     Pattern = binary() | [ binary() ] | cp()
		     Parts = [ binary() ]
		     Options = [ Option ]
		     Option = {scope, part()} | trim | global

	      Splits Subject into a list of binaries based on Pattern. If the option global is not given, only the first occurrence of Pattern in
	      Subject will give rise to a split.

	      The parts of Pattern actually found in Subject are not included in the result.

	      Example:

	      1> binary:split(<<1,255,4,0,0,0,2,3>>, [<<0,0,0>>,<<2>>],[]).
	      [<<1,255,4>>, <<2,3>>]
	      2> binary:split(<<0,1,0,0,4,255,255,9>>, [<<0,0>>, <<255,255>>],[global]).
	      [<<0,1>>,<<4>>,<<9>>]

	      Summary of options:

		{scope, part()} :
		  Works  as  in  match/3 and matches/3 . Note that this only defines the scope of the search for matching strings, it does not cut
		  the binary before splitting. The bytes before and after the scope will be kept in the result. See example below.

		trim :
		  Removes trailing empty parts of the result (as does trim in re:split/3 )

		global :
		  Repeats the split until the Subject is exhausted. Conceptually the global option makes split work on the positions  returned	by
		  matches/3 , while it normally works on the position returned by match/3 .

	      Example of the difference between a scope and taking the binary apart before splitting:

	      1> binary:split(<<"banana">>,[<<"a">>],[{scope,{2,3}}]).
	      [<<"ban">>,<<"na">>]
	      2> binary:split(binary:part(<<"banana">>,{2,3}),[<<"a">>],[]).
	      [<<"n">>,<<"n">>]

	      The  return type is always a list of binaries that are all referencing Subject . This means that the data in Subject is not actually
	      copied to new binaries and that Subject cannot be garbage collected until the results of the split are no longer referenced.

	      For a description of Pattern , see compile_pattern/1 .
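
	      The global and trim options are often combined when breaking a delimited record into fields. A minimal sketch, assuming a
	      comma as the delimiter and using a hypothetical module split_demo with a function fields/1:

```erlang
-module(split_demo).
-export([fields/1]).

%% Hypothetical helper: split a comma-separated record into its fields.
%% global splits at every comma; trim drops empty parts at the end of
%% the result, while empty parts elsewhere are kept.
fields(Line) ->
    binary:split(Line, <<",">>, [global, trim]).
```

	      As described above, the returned binaries all reference Line, so binary:copy/1 may be worth considering if only a small
	      field is kept from a very large record.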

Ericsson AB							   stdlib 1.17.3						      binary(3erl)