07-15-2010
Remove duplicates based on a column in fixed width file
Hi,
How can I write the duplicate records to another file? A record counts as a duplicate based on a key column that starts at position 2 and is 11 characters long.
The file is a fixed-width file.
Example record:
DTYU12333567opert tjhi kkklTRG9012
The key on which duplicates are identified (bold in the original post) is the 11 characters starting at position 2.
Thanks.
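One possible approach, as a minimal sketch rather than a definitive answer: read the file once, slice the key out of each record, keep the first record seen for each key, and send later records with the same key to a second list. It assumes "position 2" is 1-based (so the key of the example record is TYU12333567) and that the first occurrence is the one to keep; neither is stated in the question. The names split_duplicates and write_split and the file paths passed to them are hypothetical.

```python
from typing import Iterable, List, Tuple

KEY_START = 2   # start position of the key (assumed to be 1-based)
KEY_LEN = 11    # key length in characters, as given in the question


def split_duplicates(lines: Iterable[str],
                     key_start: int = KEY_START,
                     key_len: int = KEY_LEN) -> Tuple[List[str], List[str]]:
    """Return (unique, duplicates); the first record per key counts as unique."""
    seen = set()
    unique: List[str] = []
    duplicates: List[str] = []
    for line in lines:
        record = line.rstrip("\n")
        # Convert the 1-based start position to a 0-based slice.
        key = record[key_start - 1:key_start - 1 + key_len]
        if key in seen:
            duplicates.append(record)
        else:
            seen.add(key)
            unique.append(record)
    return unique, duplicates


def write_split(in_path: str, unique_path: str, dup_path: str) -> None:
    """Read in_path and write unique and duplicate records to separate files."""
    with open(in_path) as src:
        unique, duplicates = split_duplicates(src)
    with open(unique_path, "w") as out:
        out.writelines(rec + "\n" for rec in unique)
    with open(dup_path, "w") as out:
        out.writelines(rec + "\n" for rec in duplicates)
```

Calling write_split("input.txt", "unique.txt", "duplicates.txt") would leave the de-duplicated records in one file and the duplicates in the other. Only the keys are held in memory for the lookup, although this sketch also collects the records in lists before writing them.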
9 More Discussions You Might Find Interesting
1. Shell Programming and Scripting
Hi, I am new to Unix. I have one input file.
Input file :
ID1~Name1~Place1
ID2~Name2~Place2
ID3~Name3~Place3
I need output in which only the first column is changed to a fixed-width column 15 characters long.
Output File:
ID1<<12 spaces>>Name1~Place1
ID2<<12... (5 Replies)
Discussion started by: manneni prakash
2. Shell Programming and Scripting
Hi,
I have a fixed width text file without any header row. One of the columns contains a date in YYYYMMDD format.
If the original file contains 3 dates, I want my shell script to split the file into 3 small files with data for each date.
I am a newbie and need help doing this. (14 Replies)
Discussion started by: bhanja_trinanja
3. Shell Programming and Scripting
Given a file such as this I need to remove the duplicates.
00060011 PAUL BOWSTEIN ad_waq3_921_20100826_010517.txt
00060011 PAUL BOWSTEIN ad_waq3_921_20100827_010528.txt
0624-01 RUT CORPORATION ad_sade3_10_20100827_010528.txt
0624-01 RUT CORPORATION ... (13 Replies)
Discussion started by: script_op2a
4. Shell Programming and Scripting
Hello,
I am new to shell scripting. I have a huge file with multiple columns for example:
I have 5 columns below.
HWUSI-EAS000_29:1:105 + chr5 76654650 AATTGGAA HHHHG
HWUSI-EAS000_29:1:106 + chr5 76654650 AATTGGAA B@HYL
HWUSI-EAS000_29:1:108 + ... (4 Replies)
Discussion started by: Diya123
5. Shell Programming and Scripting
Hi All,
I have a requirement where I need to remove duplicates from a fixed-width file that has multiple key columns. I also need to capture the duplicate records in another file.
The file has 8 columns.
The key columns are col1 and col2.
Col1 has a length of 8 and col2 has a length of 3.
... (5 Replies)
Discussion started by: saj
6. Shell Programming and Scripting
Hi,
I am unable to find the right option to extract the data in the fixed width file.
sample data
abcd1234xgyhsyshijfkfk
hujk9876 io xgla
loki8787eljuwoejroiweo
dkfj9098 dja
Search based on position 8-9="xg" and print the entire row
output
... (4 Replies)
Discussion started by: onesuri
7. Shell Programming and Scripting
I have a fixed-width file whose header and trailer records have the same length as the detail records.
The detail record length of this file is 24; the header and trailer records are padded with spaces to match the record length of the file.
Currently I am adding 3 spaces in header... (14 Replies)
Discussion started by: ginrkf
8. Shell Programming and Scripting
Hi All,
I am trying to select the rows in a fixed width file based on values in the columns.
I want to select only the rows where column positions 3-4 have the value AB.
I am using the cut command to get the column values. Is it possible to check if cut -c3-4 = AB is true then select only that... (2 Replies)
Discussion started by: ashok.k
9. Shell Programming and Scripting
Hi Forum.
I searched the internet but haven't been able to find a solution for what I'm trying to accomplish.
I have a fixed-width column file in which I need to search for any occurrences of "D0" in column positions 1-2, 10-11, and 20-21 and replace them with "XD".
... (2 Replies)
Discussion started by: pchang
fold(1) General Commands Manual fold(1)
NAME
fold - fold long lines for finite width output device
SYNOPSIS
fold [-b] [-s] [-w width] [file ...]
Obsolete form:
fold [-s] [-width] [file ...]
DESCRIPTION
The fold command is a filter that folds the contents of the specified files, breaking the lines to have a maximum of width column positions (or
bytes, if the -b option is specified). The fold command breaks lines by inserting a newline character so that each output line is the maximum
width possible that does not exceed the specified number of column positions (or bytes). A line cannot be broken in the middle of a character.
If no files are specified or if a file name of - is specified, the standard input is used.
The fold command is often used to send text files to line printers that truncate, rather than fold, lines wider than the printer is able to
print.
If the backspace, tab, or carriage-return characters are encountered in the input, and the -b option is not specified, they are treated
specially as follows:
Backspace The current count of line width is decremented by one, although the count never becomes negative. Thus, the
character sequence character-backspace-character counts as using one column position, assuming both characters each
occupy a single column position. fold does not insert a newline character immediately before or after any backspace
character.
Tab Each tab character encountered advances the column position pointer to the next tab stop. Tab stops are set 8
columns apart at column positions 1, 9, 17, 25, 33, etc.
Carriage-return The current count of line width is set to zero. fold does not insert a newline character immediately before or after
any carriage-return character.
Note that fold may affect any underlining that is present.
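The column-counting rules above can be illustrated with a small sketch. This is not the fold implementation, only a simplified model of the behavior described: it assumes every ordinary character occupies exactly one column position, ignores the byte-counting and blank-breaking options, and handles a single line without its trailing newline. The function name fold_line is hypothetical.

```python
from typing import List


def fold_line(line: str, width: int = 80) -> List[str]:
    """Split one line into pieces that each fit within `width` column positions."""
    pieces: List[str] = []
    current: List[str] = []
    col = 0  # current count of line width
    for ch in line:
        if ch == "\b":
            new_col = max(col - 1, 0)      # backspace: decrement, never negative
        elif ch == "\r":
            new_col = 0                    # carriage return: reset the count
        elif ch == "\t":
            new_col = (col // 8 + 1) * 8   # tab: advance to the next tab stop
        else:
            new_col = col + 1              # assumed: one column per character
        # Never break immediately before a backspace or carriage return.
        if new_col > width and current and ch not in "\b\r":
            pieces.append("".join(current))
            current = []
            new_col = 8 if ch == "\t" else 1  # recount from the start of a new line
        current.append(ch)
        col = new_col
    pieces.append("".join(current))
    return pieces
```

For instance, fold_line("abcdefghij", 4) yields the pieces "abcd", "efgh", and "ij", while fold_line("a\bbcd", 3) stays in one piece because the character-backspace-character sequence counts as a single column position.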
Options
The fold command recognizes the following options and command-line arguments:
-b Count width in bytes rather than in column positions.
-s Break the line on the last blank character found
before the specified number of column positions (or bytes). If none are found, break the line at the specified
line length.
-w width Specify the maximum line length, in column positions (or bytes if
-b is specified). The default value is 80. width should be a multiple of 8 if tabs are present, or the tabs should
be expanded using expand before processing by fold (see expand(1)). The -width option is obsolescent and may be removed in a future
release.
EXTERNAL INFLUENCES
Environment Variables
LC_CTYPE determines the interpretation of text as single- and/or multi-byte characters.
LC_MESSAGES determines the language in which messages are displayed.
If LC_CTYPE or LC_MESSAGES is not specified in the environment or is set to the empty string, the value of LANG is used as a default for each unspecified or empty
variable. If LANG is not specified or is set to the empty string, a default of "C" (see lang(5)) is used instead of LANG.
If any internationalization variable contains an invalid setting, fold behaves as if all internationalization variables are set to "C". See
environ(5).
International Code Set Support
Single- and multi-byte character code sets are supported.
SEE ALSO
expand(1).
STANDARDS CONFORMANCE
fold(1)