using sed to get rid of duplicated columns... Post: 302184142

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

remove duplicated columns

hi all, i have a file contain multicolumns, this file is sorted by col2 and col3. i want to remove the duplicated columns if the col2 and col3 are the same in another line. example fileA AA BB CC DD CC XX CC DD BB CC ZZ FF DD FF HH HH the output is AA BB CC DD BB CC ZZ FF...

2. Shell Programming and Scripting

Help removing lines with duplicated columns

Hi Guys... Please could you help me with the following? aaaa bbbb cccc sdsd aaaa bbbb cccc qwer As you can see, the 2 lines match in three fields... how can I delete this duplicate? I mean to delete the second one if 3 fields were duplicated. Thanks

3. Shell Programming and Scripting

How to get rid of double quote in sed.

Hi, i am using sed command to grep just a valuable data for my report generating. Thanks to the person who assists me on before thread. the problem that i encounter now is when i executed below command The output will give me like below output in between the data, there is a double quote. How...

4. Shell Programming and Scripting

get rid of xml comment by grep or sed

Hi, I would like to get rid of all comment in an xml file by grep or sed command: The content seem like this:  Anyone can help? Thanks and Regards

5. UNIX for Dummies Questions & Answers

Getting rid of selected columns

Hi All, I've got a file like this: a 1 0 0 0 1 0 0 1 1 3 3 1 4 4 4 b 1 0 0 0 1 4 4 1 3 1 1 4 4 2 2 c 1 0 0 0 2 0 0 3 3 1 3 1 1 2 4 d 1 0 0 0 2 0 0 1 1 0 0 4 4 2 4 The file has ~4200 entries. I need to exclude those columns that are zeros for all those rows that have 2 in column 6. For...

6. Shell Programming and Scripting

Manipulate columns using sed

Hello, I would like to remove the first column of lines beginning by a character (in my case is an open square bracket) and finishing by a space (or any other delimiter). For example: string1 string2 string3 to string2 string3 I found this previous topic: ...

7. UNIX for Dummies Questions & Answers

Find duplicated values in two columns out of three

hi! could you help with the following? I have data (a long list!) that looks like this (three columns, white space separated): rs3094315 0.0665173 742429 rs12562034 0.0738998 758311 rs3934834 0.396449 995669 rs9442372 0.402693 1008567 rs3737728 0.406271 1011278 rs6687776 0.435429 1020428 rs9651273...

8. Shell Programming and Scripting

sed to get rid of unwanted characters

so i have strings such as this: 'postfix/local#2,5#|CRON.*12062.*root.*CMD#2,5#|roice.*NQN1#1,2#|toysprc#1,4#' i need to get rid of the "#" and the numbers between them for each of the strings above. so the desired output should be: ...

9. UNIX for Dummies Questions & Answers

sed for all columns

Hi, I would like to know how can I use sed in all columns of a file tab separated. Example of input file: 0/0:0:1,0,0 0/2:0:0,2,0 Desired output file: 1,0 0,2

10. Shell Programming and Scripting

Deleting duplicated chunks in a file using awk/sed

Hi all, I'd always appreciate all helps from this site. I would like to delete duplicated chunks of strings on the same row(?). One chunk is comprised of four lines such as: path name starting point ending point voltage number I would like to delete duplicated chunks on the same...

LEARN ABOUT OSF1

sed

sed(1)							      General Commands Manual							    sed(1)

NAME

       sed - Stream editor

SYNOPSIS

       sed [-n] script [file...]

       sed [-n] [-e script]... [-f script_file...]... [file...]

       The  sed  utility is a stream editor that reads one or more text files, makes editing changes according to a script of editing commands and
       writes the results to standard output.

STANDARDS

       Interfaces documented on this reference page conform to industry standards as follows:

       sed:  XCU5.0

       Refer to the standards(5) reference page for more information about industry standards and associated tags.

OPTIONS

       -e script
              Adds the editing commands specified by the string script to the end of the script of editing commands.  If you are using
              just one -e option and no -f option, you can omit the -e option and include the single script on the command line as an
              argument to sed.

       -f script_file
              Uses script_file as the source of the edit script.  The script_file is a set of editing commands to be applied to file.

       -n     Suppresses all information normally written to standard output.

       The order of presentation of the -e and -f options is significant.
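
       As a minimal sketch of why option order matters, the two helpers below (hypothetical names, not part of sed) apply the same pair of
       substitutions in opposite orders.  Each -e script is appended to the end of the edit script, so the commands run in the order given.

```shell
# Hypothetical helpers: the same two substitutions, given in opposite order.
a_then_b() { sed -e 's/a/b/' -e 's/b/c/'; }
b_then_a() { sed -e 's/b/c/' -e 's/a/b/'; }

echo a | a_then_b    # a becomes b, then that b becomes c: prints "c"
echo a | b_then_a    # the first command finds no b, then a becomes b: prints "b"
```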

OPERANDS

       script Use the string script as an editing script.  See the description of the -e option.

       file   The path name of a file to be edited.  If multiple file operands are specified, all files are read and concatenated before
              editing begins.

              If no file operand is specified, standard input is read.

DESCRIPTION

       The sed command includes many features for selecting lines to be modified and making changes only to the selected lines.

       The sed command uses two work spaces for holding the line being modified:  the pattern space, where the selected line is held, and the hold
       space, where a line can be stored temporarily.

       An  edit  script  consists  of  individual  subcommands,  each  one on a separate line.	The general form of sed subcommands is as follows:
       [address_range] function [modifier ...]

       The sed command processes each input file by reading an input line into a pattern space, applying in sequence all sed subcommands whose
       addresses select that line, and writing the pattern space to standard output.  It then clears the pattern space and repeats this process
       for each line in the input file.  Some of the subcommands use a hold space to save all or part of the pattern space for subsequent
       retrieval.
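
       The interplay of pattern space and hold space can be sketched with a small line-reversing filter.  The function name reverse_lines is
       hypothetical, and the example assumes a sed whose h subcommand behaves as described under Subcommands (the RESTRICTIONS section notes
       that this is not the case on Tru64 UNIX).

```shell
# Hypothetical helper: print the input lines in reverse order.
#   1!G  on every line but the first, append the hold space to the pattern space
#   h    copy the pattern space (the lines read so far, newest first) to the hold space
#   $p   on the last line, print the accumulated pattern space
reverse_lines() { sed -n '1!G;h;$p'; }

printf '1\n2\n3\n' | reverse_lines    # prints 3, 2, 1 on separate lines
```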

       [Tru64 UNIX]  If you do not specify an argument to the sed command, the sed usage string is displayed.

       When a command includes an address, either a line number or a search pattern, only the addressed line or lines are affected by the command.
       Otherwise, the command is applied to all lines.

       An address is either a decimal line number, a $, which addresses the last line of input, or a context address.	A  context  address  is  a
       basic regular expression (BRE) as described for grep, except that you can select the character delimiter for patterns.  The general form of
       the expression is as follows: \?pattern?

       The ?  represents a character delimiter you select.  The backslash (\) is required when you select a delimiter other than the default slash
       (/) character.  This delimiter cannot be a 2-byte international character support extended character.

       The default form for the pattern is as follows: /pattern/

       In a context address, the construction \cexpressionc, where c is any character other than a backslash (\) or the newline character, is
       identical to /expression/.  If the character designated by c appears following a \ (backslash), then it is considered to be that literal
       character, which does not terminate the RE.  For example, in the context address \xabc\xdefx, the second x stands for itself, so that the
       regular expression is abcxdef.  The sequence \n matches a newline character in the pattern space, except the terminating newline.  A dot
       (.) matches any character except a terminating newline character.  That is, unlike grep, which cannot match a newline character in the
       middle of a line, sed can match a newline character in the pattern space.
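
       A short sketch of a context address with a non-default delimiter.  The helper name usr_bin_only is hypothetical.  Using a comma as the
       delimiter avoids having to escape every slash in a path.

```shell
# Hypothetical helper: print only lines containing /usr/bin.
# The leading backslash introduces "," as the pattern delimiter.
usr_bin_only() { sed -n '\,/usr/bin,p'; }

printf '/usr/bin/sed\n/bin/sh\n' | usr_bin_only    # prints /usr/bin/sed
```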

       Certain commands allow you to specify one line or a range of lines to which the command applies.  These commands are called addressed
       commands.  The following rules apply to addressed commands:

              A command line with no address selects every line.

              A command line with one address, expressed in context form, selects each line that matches the address.

              A command line with two addresses separated by a comma (,) or semicolon (;) selects the entire range from the first line that
              matches the first address through the next line that matches the second.  (If the second address is a number less than or
              equal to the line number first selected, only one line is selected.)  Thereafter, the process is repeated, looking again for
              the first address.
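
       The address forms described above can be tried on a small sample.  The function sample and its five input lines are made up for
       illustration.  Only the comma form of a two-address range is shown.

```shell
# Hypothetical sample input: five lines.
sample() { printf 'one\nBEGIN\ntwo\nEND\nthree\n'; }

sample | sed -n '2p'              # line number address: prints BEGIN
sample | sed -n '$p'              # $ addresses the last line: prints three
sample | sed -n '/t/p'            # context address: prints two and three
sample | sed -n '/BEGIN/,/END/p'  # two addresses: prints BEGIN, two, END
sample | sed '2,4d'               # numeric range: deletes lines 2-4, leaving one and three
```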

   Subcommands
       Backslashes  in	text  are treated like backslashes in the replacement string of an s command and can be used to protect initial spaces and
       tabs against the stripping that is done on every script line.

       The text argument accompanying the a, c, and i commands can continue onto more than one line, provided all lines but the last end with a
       \ (backslash) to quote the newline character.

       The  read_file  and  write_file	arguments must end the command line and must be preceded by exactly one space.	Each write_file is created
       before processing begins.

       [Tru64 UNIX]  The sed command can process up to 999 commands in a file.

       In the following list of subcommands, the maximum number of permissible addresses for each subcommand is indicated in parentheses.  The sed
       script subcommands are as follows:

       (2) { subcommand ... }
              Groups subcommands enclosed in { } (braces).  The { (left brace) can be preceded by spaces and can be followed by spaces or
              tabs.  The list of subcommands must be separated by newline characters.  The subcommands can also be preceded by spaces or
              tabs.  The terminating } (right brace) must be preceded by a newline character and then zero or more spaces.

       (1) a\
           text
              Places text on the output before reading the next input line.

              The total number of a and/or r subcommands should not exceed 20.

       (2) b [label]
              Branches to the : command bearing the label.  If label is empty, it branches to the end of the script.  The labels should not
              exceed 8 characters in length.  Label names should not be duplicated.  The maximum number of labels allowed in the sed script
              is 50.

       (2) c\
           text
              Deletes the pattern space.  With a 0 or 1 address or at the end of a 2-address range, places text on the output.  Then it
              starts the next cycle.

       (2) d  Deletes the pattern space, then starts the next cycle.

       (2) D  Deletes the initial segment of the pattern space through the first newline character.  Then it starts the next cycle.

       (2) g  Replaces the contents of the pattern space with the contents of the hold space.

       (2) G  Appends the contents of the hold space to the pattern space.

       (2) h  Replaces the contents of the hold space with the contents of the pattern space.

       (2) H  Appends the contents of the pattern space to the hold space.

       (1) i\
           text
              Writes text to standard output before reading the next line into the pattern space.

       (2) l  Writes the pattern space to standard output, showing nonprinting characters as 3-digit octal values.  Long lines are folded,
              with the point of folding indicated by <Backslash><Return>.  The end of each line is marked with a $.

              Certain characters are shown as escape sequences as follows:

              \\     Backslash
              \a     Alert
              \b     Backspace
              \f     Formfeed
              \n     Newline
              \r     Carriage-return
              \t     Tab
              \v     Vertical tab

       (2) n  Writes the pattern space to standard output.  It replaces the pattern space with the next line of input.

       (2) N  Appends the next line of input to the pattern space with an embedded newline character.  (The current line number changes.)
              You can use this to search for patterns that are split onto two lines.

       (2) p  Writes the pattern space to standard output.

       (2) P  Writes the initial segment of the pattern space through the first newline character to standard output.

       (1) q  Branches to the end of the script.  It does not start a new cycle.

       (1) r read_file
              Reads the contents of read_file.  It places contents on the output before reading the next input line.

              The total number of a and/or r subcommands should not exceed 20.

       (2) s/pattern/replacement/flags
              Substitutes the replacement string for the first occurrence of the pattern in the pattern space.  Any character that is
              entered after the s command can substitute for the / (slash) separator, except \ (backslash) and the newline character.
              Within the regular expression and replacement string, the delimiter can appear as a literal if it is preceded by a \
              (backslash).

              An & (ampersand) appearing in the replacement string is replaced by the string matching the RE.  The special meaning of & in
              this context can be suppressed by preceding it with a \ (backslash).  The characters \n, where n is a digit, are replaced by
              the text matched by the corresponding backreference expression.

              A line can be split by substituting a newline character into it.  You must escape the newline character in the replacement
              string by preceding it with a \ (backslash).  A substitution is considered to have been performed even if the replacement
              string is identical to the string that it replaces.

              You can add zero or more of the following flags:

              n      Where n is 1-512, substitutes replacement for the nth occurrence of pattern on each addressed line, rather than for
                     the first occurrence.

              g      Substitutes replacement for all nonoverlapping instances of pattern on each addressed line, rather than for just the
                     first one (or for the one specified by n).

              p      Writes the pattern space to standard output if a replacement was made.

                     [SVR4]  If the environment variable CMD_ENV is set either to SVR4 or svr4, writes the substituted pattern space to
                     standard output only once at the end of the script, unless the -n option is specified.

              w write_file
                     Writes the pattern space to write_file if a replacement was made.  Appends the pattern space to write_file.  If
                     write_file was not already created by a previous write by this sed script, sed creates it.  Each write_file is
                     created before processing begins.

                     A maximum number of 10 files can be created by sed.

       (2) t [label]
              Branches to :label in the script file if any substitutions were made since the most recent reading of an input line or
              execution of a t subcommand.  If you do not specify label, control transfers to the end of the script.

       (2) w write_file
              Appends the pattern space to write_file.

       (2) x  Exchanges the contents of the pattern space and the hold space.

       (2) y/pattern1/pattern2/
              Replaces all occurrences of characters in pattern1 with the corresponding characters from pattern2.  The byte lengths of
              pattern1 and pattern2 must be equal.

       (2) !subcommand
              Applies the specified sed subcommand only to lines not selected by the address or addresses.

       (0) :label
              This script entry simply marks a branch point to be referenced by the b and t commands.  This label can be any sequence of
              eight or fewer bytes.

       (1) =  Writes the current line number to standard output as a line.

       (2) { subcommand ... }
              Groups subcommands enclosed in { } (braces).

       (0)    Ignores an empty command.

       (0) #  If a # (number sign) appears as the first character on the first line of a script file, that entire line is treated as a
              comment, with one exception.  If the character after the # is an n, the default output is suppressed.  The rest of the line
              after the #n is ignored.  A script must contain at least one noncomment line.
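
       The s subcommand's flags and replacement-string features can be sketched as follows.  All helper names and sample strings are
       hypothetical, chosen only to show the first-occurrence default, the n and g flags, the & reference, \1 and \2 backreferences, and an
       alternative delimiter.

```shell
# Hypothetical helpers, one per feature of the s subcommand.
first_only()  { sed 's/happy/glad/'; }            # default: first occurrence only
second_only() { sed 's/happy/glad/2'; }           # n flag: the 2nd occurrence
all_matches() { sed 's/happy/glad/g'; }           # g flag: every occurrence
wrap_match()  { sed 's/Page [0-9]*/(&)/'; }       # & stands for the matched text
swap_pair()   { sed 's/\(.*\)=\(.*\)/\2=\1/'; }   # \1 and \2 are backreferences
slash_to_dash() { sed 's,/,-,'; }                 # "," replaces "/" as the separator

echo 'happy happy happy' | first_only     # glad happy happy
echo 'happy happy happy' | second_only    # happy glad happy
echo 'happy happy happy' | all_matches    # glad glad glad
echo 'Page 5' | wrap_match                # (Page 5)
echo 'key=value' | swap_pair              # value=key
echo 'a/b' | slash_to_dash                # a-b
```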

RESTRICTIONS

       [Tru64 UNIX]  The h subcommand for sed does not work properly.  When you use the h subcommand to place text into the hold  area,  only  the
       last  line  of the specified text is saved.  You can use the H subcommand to append text to the hold area.  The H subcommand and all others
       dealing with the hold area work correctly.

EXIT STATUS

       The following exit values are returned:

       0      Successful completion.

       >0     An error occurred.

EXAMPLES

       1.  To perform a global change, enter:

                sed "s/happy/enchanted/g" chap1 >chap1.new

           This replaces each occurrence of happy found in the file chap1 with enchanted, and puts the edited version in a separate file
           named chap1.new.  The g at the end of the s subcommand tells sed to make as many substitutions as possible on each line.
           Without the g, sed replaces only the first happy on a line.

           The sed stream editor operates as a filter.  It reads text from standard input or from the files named on the command line
           (chap1 in this example), modifies this text, and writes it to standard output.  Unlike most editors, it does not replace the
           original file.  This makes sed a powerful command when used in pipelines.

       2.  To use sed as a filter in a pipeline (sh only), enter:

                pr chap2 | sed "s/Page *[0-9]*$/(&)/" | print

           This encloses the page numbers in parentheses before printing chap2.  The pr command puts a heading and page number at the top
           of each page, then sed puts the page numbers in parentheses, and the print command prints the edited listing.

           The sed pattern /Page *[0-9]*$/ matches page numbers that appear at the end of a line.  The s subcommand changes this to (&),
           where the & stands for the pattern that was matched (for example, Page  5).

       3.  To display selected lines of a file, enter:

                sed -n "/food/p" chap3

           This displays each line in chap3 that contains the word food.  Normally, sed copies every line to standard output after it is
           edited.  The -n option stops sed from doing this.  You then use subcommands like p to write specific parts of the text.
           Without the -n, this example displays all the lines in chap3, and it shows each line containing food twice.

       4.  To perform complex editing, enter:

                sed -f script.sed chap4 >chap4.new

           It is always a good idea to create a sed script file when you want to do anything complex.  You can then test and modify your
           script before using it.  You can also reuse your script to edit other files.  Create the script file with an interactive text
           editor.  A sample sed script follows:

	      :join
	      /\\$/{N
	      s/\\\n//
	      b join
	      }

	      This sed script joins each line that ends with a \ (backslash) to the line that follows it.  First, the pattern /\\$/ selects a
	      line that ends with a \ for the group of commands enclosed in { }.  The N subcommand then appends the next line, embedding a
	      newline character.  The s/\\\n// deletes the \ (backslash) and embedded newline character.  Finally, b join branches back to the
	      label :join to check for a \ (backslash) at the end of the newly joined line.  Without the branch, sed writes the joined line and
	      reads the next one before checking for a second \ character.

	      The N subcommand causes sed to stop immediately if there are no more lines of input (that is, if N reads the End-of-File character).
	      It does not copy the pattern space to standard output before stopping.  This means that if the last line of the input ends with a
	      \ (backslash) character, then it is not copied to the output.
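
	      One way to avoid losing such a final line is to run N only when the current line is not the last one.  The sketch below is a
	      variant of the join script, not the script shown above: the $! address and the function name join_continued are additions made
	      for illustration.  Some implementations, GNU sed among them, print the pattern space when N reaches the end of input, so the
	      guard matters mainly on implementations that behave as described here.

```shell
# Hypothetical variant of the join script: N runs only when another line exists.
join_continued() {
    sed '
:join
/\\$/{
    $!{
        N
        s/\\\n//
        b join
    }
}'
}

printf 'a \\\nb\nc\n' | join_continued    # prints "a b" then "c"
printf 'x\\\n' | join_continued           # a trailing backslash on the last line is kept: prints x\
```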

ENVIRONMENT VARIABLES

       The following environment variables affect the execution of sed:

       LANG   Provides a default value for the internationalization variables that are unset or null.  If LANG is unset or null, the
              corresponding value from the default locale is used.  If any of the internationalization variables contain an invalid
              setting, the utility behaves as if none of the variables had been defined.

       LC_ALL If set to a non-empty string value, overrides the values of all the other internationalization variables.

       LC_CTYPE
              Determines the locale for the interpretation of sequences of bytes of text data as characters (for example, single-byte as
              opposed to multibyte characters in arguments) and the behavior of character classes within regular expressions.

       LC_MESSAGES
              Determines the locale for the format and contents of diagnostic messages written to standard error.

       NLSPATH
              Determines the location of message catalogues for the processing of LC_MESSAGES.

SEE ALSO

       Commands:  awk(1), ed(1), grep(1), vi(1)

       Standards:  standards(5)

       Programming Support Tools

																	    sed(1)
