manipulate data by columns and repeated - Post 302978049 by Don Cragun, 07-25-2016 04:05 PM
Here is another way to do what rdrtx1 was doing, just using awk to create two output files and cp to copy the updated version of the input file back over the original when it is done. Of course, both of these suggestions depend on the entries in your input file always being in increasing time order (as in your sample data):
Code:
#!/bin/ksh
# We can't use awk to overwrite the input file directly, so we create a
# temporary output file with the lines from the input file that are to be kept
# and a duplicate output file with the lines for names that appear two or more
# times in the input file.
#
# When awk completes, if it was successful, we'll copy the temporary output file
# back to the input file.  Otherwise, the input file will not be changed.

InFile="Dato01.txt"		# Name the input file.
DupFile="Dato02.txt"		# Name the output file for duplicates.
TempFile="$InFile.$$"		# Name the temporary output file.

trap 'rm -f "$TempFile"' EXIT	# When the script completes, remove the temp file.

awk -v new="$TempFile" -v dup="$DupFile" '
NR == 1 {
	# Copy the header line from the input file to both output files.
	print > new
	print > dup
	next
}
{	if($1 in seen) {
		# We have seen this person before.  Copy this line to the
		# duplicates file.
		print > dup
	} else {
		# We have not seen this person before.  Copy this line to the
		# temporary file (which will replace the input file when we are
		# done).
		print > new

		# Note that we have seen this person.
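		# (In awk, just referencing an array element is enough to
		# create it, so the "$1 in seen" test above matches on
		# later lines with the same name.)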
		seen[$1]
	}
}' "$InFile" > "$TempFile" && cp "$TempFile" "$InFile"

This was written and tested using a Korn shell, but it should work with any shell that uses basic Bourne shell syntax (including ash, bash, dash, ksh, zsh, and several others; but not csh and its derivatives).

If you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk or nawk.
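
For example, only the line that starts the awk script needs to change:
Code:
/usr/xpg4/bin/awk -v new="$TempFile" -v dup="$DupFile" '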
 
