Sponsored Content
Top Forums UNIX for Dummies Questions & Answers Remove duplicates based on a column in fixed width file Post 302437438 by Qwerty123 on Thursday 15th of July 2010 05:45:57 AM
Old 07-15-2010
Remove duplicates based on a column in fixed width file

Hi,

How to output the duplicate record to another file. We say the record is duplicate based on a column whose position is from 2 and its length is 11 characters.
The file is a fixed width file.

ex of Record:

DTYU12333567opert tjhi kkklTRG9012

The data in bold is the key on which the duplicates are identified.

Thanks.
 

9 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Changing one column of delimited file column to fixed width column

Hi, Iam new to unix. I have one input file . Input file : ID1~Name1~Place1 ID2~Name2~Place2 ID3~Name3~Place3 I need output such that only first column should change to fixed width column of 15 characters of length. Output File: ID1<<12 spaces>>Name1~Place1 ID2<<12... (5 Replies)
Discussion started by: manneni prakash
5 Replies

2. Shell Programming and Scripting

How to split a fixed width text file into several ones based on a column value?

Hi, I have a fixed width text file without any header row. One of the columns contains a date in YYYYMMDD format. If the original file contains 3 dates, I want my shell script to split the file into 3 small files with data for each date. I am a newbie and need help doing this. (14 Replies)
Discussion started by: bhanja_trinanja
14 Replies

3. Shell Programming and Scripting

need to remove duplicates based on key in first column and pattern in last column

Given a file such as this I need to remove the duplicates. 00060011 PAUL BOWSTEIN ad_waq3_921_20100826_010517.txt 00060011 PAUL BOWSTEIN ad_waq3_921_20100827_010528.txt 0624-01 RUT CORPORATION ad_sade3_10_20100827_010528.txt 0624-01 RUT CORPORATION ... (13 Replies)
Discussion started by: script_op2a
13 Replies

4. Shell Programming and Scripting

remove duplicates based on single column

Hello, I am new to shell scripting. I have a huge file with multiple columns for example: I have 5 columns below. HWUSI-EAS000_29:1:105 + chr5 76654650 AATTGGAA HHHHG HWUSI-EAS000_29:1:106 + chr5 76654650 AATTGGAA B@HYL HWUSI-EAS000_29:1:108 + ... (4 Replies)
Discussion started by: Diya123
4 Replies

5. Shell Programming and Scripting

Removing duplicates in fixed width file which has multiple key columns

Hi All , I have a requirement where I need to remove duplicates from a fixed width file which has multiple key columns .Also , need to capture the duplicate records into another file . File has 8 columns. Key columns are col1 and col2. Col1 has the length of 8 col 2 has the length of 3. ... (5 Replies)
Discussion started by: saj
5 Replies

6. Shell Programming and Scripting

Fixed width file search based on position value

Hi, I am unable to find the right option to extract the data in the fixed width file. sample data abcd1234xgyhsyshijfkfk hujk9876 io xgla loki8787eljuwoejroiweo dkfj9098 dja Search based on position 8-9="xg" and print the entire row output ... (4 Replies)
Discussion started by: onesuri
4 Replies

7. Shell Programming and Scripting

To replace the value of the column in a fixed width file

I have a fixed with file with header & trailer length having the same length of the detail record file. The details record length of this file is 24, for Header and Trailer the records will be padded with spaces to match the record length of the file Currently I am adding 3 spaces in header... (14 Replies)
Discussion started by: ginrkf
14 Replies

8. Shell Programming and Scripting

UNIX command -Filter rows in fixed width file based on column values

Hi All, I am trying to select the rows in a fixed width file based on values in the columns. I want to select only the rows if column position 3-4 has the value AB I am using cut command to get the column values. Is it possible to check if cut -c3-4 = AB is true then select only that... (2 Replies)
Discussion started by: ashok.k
2 Replies

9. Shell Programming and Scripting

Search and replace value based on certain conditions in a fixed width file

Hi Forum. I tried searching for a solution using the internet search but I haven't been able to find any solution for what I'm trying to accomplish. I have a fixed width column file where I need to search for any occurrences of "D0" in col pos.#1-2, 10-11, 20-21 and replaced it with "XD". ... (2 Replies)
Discussion started by: pchang
2 Replies
fold(1) 							   User Commands							   fold(1)

NAME
fold - filter for folding lines SYNOPSIS
fold [-bs] [-w width | -width] [file...] DESCRIPTION
The fold utility is a filter that will fold lines from its input files, breaking the lines to have a maximum of width column positions (or bytes, if the -b option is specified). Lines will be broken by the insertion of a NEWLINE character such that each output line (referred to later in this section as a segment) is the maximum width possible that does not exceed the specified number of column positions (or bytes). A line will not be broken in the middle of a character. The behavior is undefined if width is less than the number of columns any single character in the input would occupy. If the CARRIAGE-RETURN, BACKSPACE, or TAB characters are encountered in the input, and the -b option is not specified, they will be treated specially: BACKSPACE The current count of line width will be decremented by one, although the count never will become negative. fold will not insert a NEWLINE character immediately before or after any BACKSPACE character. CARRIAGE-RETURN The current count of line width will be set to 0. fold will not insert a NEWLINE character immediately before or after any CARRIAGE-RETURN character. TAB Each TAB character encountered will advance the column position pointer to the next tab stop. Tab stops will be at each column position n such that n modulo 8 equals 1. OPTIONS
The following options are supported: -b Counts width in bytes rather than column positions. -s If a segment of a line contains a blank character within the first width column positions (or bytes), breaks the line after the last such blank character meeting the width constraints. If there is no blank character meeting the requirements, the -s option will have no effect for that output segment of the input line. -w width|-width Specifies the maximum line length, in column positions (or bytes if -b is specified). If width is not a positive decimal number, an error is returned. The default value is 80. OPERANDS
The following operand is supported: file A path name of a text file to be folded. If no file operands are specified, the standard input will be used. EXAMPLES
Example 1: Submitting a file of possibly long lines to the line printer An example invocation that submits a file of possibly long lines to the line printer (under the assumption that the user knows the line width of the printer to be assigned by lp(1)): example% fold -w 132 bigfile | lp ENVIRONMENT VARIABLES
See environ(5) for descriptions of the following environment variables that affect the execution of fold: LANG, LC_ALL, LC_CTYPE, LC_MES- SAGES, and NLSPATH. EXIT STATUS
The following exit values are returned: 0 All input files were processed successfully. >0 An error occurred. ATTRIBUTES
See attributes(5) for descriptions of the following attributes: +-----------------------------+-----------------------------+ | ATTRIBUTE TYPE | ATTRIBUTE VALUE | +-----------------------------+-----------------------------+ |Availability |SUNWcsu | +-----------------------------+-----------------------------+ |CSI |enabled | +-----------------------------+-----------------------------+ |Interface Stability |Standard | +-----------------------------+-----------------------------+ SEE ALSO
cut(1), pr(1), attributes(5), environ(5), standards(5) NOTES
fold and cut(1) can be used to create text files out of files with arbitrary line lengths. fold should be used when the contents of long lines need to be kept contiguous. cut should be used when the number of lines (or records) needs to remain constant. fold is frequently used to send text files to line printers that truncate, rather than fold, lines wider than the printer is able to print (usually 80 or 132 column positions). fold may not work correctly if underlining is present. SunOS 5.10 1 Feb 1995 fold(1)
All times are GMT -4. The time now is 02:06 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy