Remove the partial duplicates by checking the length of a field
Hi Folks -
I'm quite new to awk and haven't come across this kind of problem before. I have a file in which some records share the same values in the 3rd and 4th fields. A sample is below:
Here, the combination of field3 and field4 is the same for several records, e.g. 4556 for the first two rows and the last row, and so on.
The output file is expected to look like this:
That is, for the rows where field3 and field4 match, compare the length of the first field and return the row whose first field is longest. So one row is picked from each set of duplicates, based on the length of the first field.
Could you please help with a one-line awk command to achieve this?
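The sample rows did not survive in the post, so the following is only a minimal sketch under stated assumptions: fields are whitespace-separated, fields 3 and 4 together form the duplicate key, and when two rows tie on first-field length the earlier row is kept. The function name `dedupe_longest_f1` is made up for illustration.

```shell
# Sketch only: assumes whitespace-separated fields (the real delimiter is not shown in the post).
dedupe_longest_f1() {
  awk '
    {
      key = $3 SUBSEP $4                      # duplicate key: field 3 + field 4
      if (!(key in len)) {                    # first time this key is seen
        order[++n] = key                      # remember first-seen order of keys
        len[key] = -1
      }
      if (length($1) > len[key]) {            # strictly longer, so ties keep the earlier row
        len[key] = length($1)
        best[key] = $0
      }
    }
    END { for (i = 1; i <= n; i++) print best[order[i]] }
  ' "$@"
}
```

It can be called as `dedupe_longest_f1 yourfile` (the file name is a placeholder) or fed from a pipe. The awk program is spread over several lines for readability; it can be joined onto one line if a one-liner is required. For a different delimiter, something like `-F';'` would need to be added to the awk call.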
Hi all,
I am due to start receiving a weekly csv containing around 6 million rows. I need to do some processing on this file and then send it on elsewhere.
My problem is that after week 1 the files that I will receive are likely to contain data already received in previous files and I need... (8 Replies)
Hello, this is probably a simple request but I've been toying with it for a while.
I have a large list of devices and commands that were run with a script, now I have lines such as:
a-router-hostname-C#show ver
I want to print everything up to (and excluding) the # and everything after it... (3 Replies)
Hi
Description of input file I have:
-------------------------
1) CSV with double quotes for string fields.
2) Some string fields have Comma as part of field value.
3) Have Duplicate lines
4) Have 200 columns/fields
5) File size is more than 10GB
Description of output file I need:... (4 Replies)
Hi,
I have a file with fields like below:
A;XYZ;102345;222
B;XYZ;123243;333
C;ABC;234234;444
D;MNO;103345;222
E;DEF;124243;333
desired output:
C;ABC;234234;444
D;MNO;103345;222
E;DEF;124243;333
i.e., if the 4th field is a duplicate, I need only those records where... (5 Replies)
Hello Everyone,
I am stuck on an issue while working with an abstract flat file that I have to use as input to load data into a table.
Input Data-
------ ------------------------ ---- -----------------
WFI001 Xxxxxx Control Work Item A Number of Records
------ ------------------------... (5 Replies)
Hi All,
I have a text file with three columns. I would like a simple script that removes lines in which column 1 has duplicate entries, but use the largest value in column 3 to decide which one to keep. For example:
Input file:
12345a rerere.rerere len=23
11111c fsdfdf.dfsdfdsf len=33 ... (3 Replies)
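For the "keep the largest column 3 per column 1" request above, one possible approach is sketched here. It assumes whitespace-separated columns and that column 3 always has the form `len=N`, as in the two sample rows shown; the rest of the input is truncated in the post, so the function name `keep_largest_len` and any further rows are hypothetical.

```shell
# Sketch only: assumes column 3 looks like "len=<number>" on every line.
keep_largest_len() {
  awk '
    {
      split($3, kv, "=")                      # kv[2] holds the text after "len="
      val = kv[2] + 0                         # force numeric comparison
      if (!($1 in max)) order[++n] = $1       # remember first-seen order of column 1
      if (!($1 in max) || val > max[$1]) {    # first row for this key, or a larger value
        max[$1] = val
        row[$1] = $0
      }
    }
    END { for (i = 1; i <= n; i++) print row[order[i]] }
  ' "$@"
}
```

Rows with equal values keep the first one seen; changing `>` to `>=` would keep the last one instead.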
Hi all,
I have a requirement to replace a field with a character repeated to match the length of the field.
Suppose I have a file where the second field is 20 characters long. I want to replace the second field with 20 stars (*), like ********************
As the field is not a fixed one, I want to do the... (2 Replies)
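A minimal sketch of the masking idea above, assuming comma-separated fields (the post does not say what the delimiter is) and using a made-up function name `mask_second_field`:

```shell
# Sketch only: the comma delimiter is an assumption, not stated in the post.
mask_second_field() {
  awk '
    BEGIN { FS = OFS = "," }
    NF >= 2 { gsub(/./, "*", $2) }            # replace every character of field 2 with "*"
    { print }                                 # lines without a second field pass through unchanged
  ' "$@"
}
```

Because every character is replaced one for one, the number of stars always matches the original field length, whatever that length is.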
I am trying to see if I can use awk to remove duplicates from a file. This is the file:
-==> Listvol <==
deleting /vol/eng_rmd_0941
deleting /vol/eng_rmd_0943
deleting /vol/eng_rmd_0943
deleting /vol/eng_rmd_1006
deleting /vol/eng_rmd_1012
rearrange /vol/eng_rmd_0943
... (6 Replies)
Background: I use a TV tuner card to capture OTA video files (.mpeg) and then my Plex Media Server automatically optimizes the files (transcodes for better playback) and places them in a new directory. I have another Plex Library pointing to the new location for the optimized .mp4 files. This... (2 Replies)
Hello,
How can I remove partial duplicates and manipulate text in bash using either awk, grep or sed? Thanks.
Input:
ted,"foo,bar,zoo"
john-son,"foot,ben,zoo"
bob,"bar,foot"
Expected Output:
foo,ted
bar,ted
zoo,ted
foot,john-son
ben,john-son (4 Replies)
Discussion started by: tara123
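Going by the sample input and expected output in the question above, it looks as if each item inside the quotes should be printed once, paired with the first name it appears under. That reading is an inference from the sample, not something the poster states. A sketch under that assumption (the function name `split_first_owner` is hypothetical):

```shell
# Sketch only: assumes each line is  name,"item1,item2,..."  with no commas in the name.
split_first_owner() {
  awk '
    {
      name = $0
      sub(/,.*/, "", name)                    # keep the text before the first comma
      list = $0
      sub(/^[^,]*,/, "", list)                # drop the name and the first comma
      gsub(/"/, "", list)                     # strip the double quotes
      n = split(list, item, ",")
      for (i = 1; i <= n; i++)
        if (!seen[item[i]]++)                 # print an item only the first time it is seen
          print item[i] "," name
    }
  ' "$@"
}
```

Run against the three input lines from the question, this prints the five lines listed under Expected Output.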
checkbashisms
CHECKBASHISMS(1) General Commands Manual CHECKBASHISMS(1)
NAME
checkbashisms - check for bashisms in /bin/sh scripts
SYNOPSIS
checkbashisms script ...
checkbashisms --help|--version
DESCRIPTION
checkbashisms, based on one of the checks from the lintian system, performs basic checks on /bin/sh shell scripts for the possible presence
of bashisms. It takes the names of the shell scripts on the command line, and outputs warnings if possible bashisms are detected.
Note that the definition of a bashism in this context roughly equates to "a shell feature that is not required to be supported by POSIX";
this means that some issues flagged may be permitted under optional sections of POSIX, such as XSI or User Portability.
In cases where POSIX and Debian Policy disagree, checkbashisms by default allows extensions permitted by Policy but may also provide
options for stricter checking.
OPTIONS
--help, -h
Show a summary of options.
--newline, -n
Check for "echo -n" usage (non POSIX but required by Debian Policy 10.4.)
--posix, -p
Check for issues which are non POSIX but required to be supported by Debian Policy 10.4 (implies -n).
--force, -f
Force each script to be checked, even if it would normally not be (for instance, it has a bash or non POSIX shell shebang or appears
to be a shell wrapper).
--extra, -x
Highlight lines which, whilst they do not contain bashisms, may be useful in determining whether a particular issue is a false positive
which may be ignored. For example, the use of "$BASH_ENV" may be preceded by checking whether "$BASH" is set.
--version, -v
Show version and copyright information.
EXIT VALUES
The exit value will be 0 if no possible bashisms or other problems were detected. Otherwise it will be the sum of the following error values:
1 A possible bashism was detected.
2 A file was skipped for some reason, for example, because it was unreadable or not found. The warning message will give details.
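Because the exit value is a sum of 1 and 2, a wrapper script can test each bit separately. The helper below is a small sketch of that decoding; the function name `explain_checkbashisms_status` and the message wording are made up for illustration and are not part of checkbashisms itself.

```shell
# Sketch only: decodes the documented exit values (0, or a sum of 1 and 2).
explain_checkbashisms_status() {
  status=$1
  if [ "$status" -eq 0 ]; then
    echo "no possible bashisms or other problems detected"
    return 0
  fi
  if [ $((status & 1)) -ne 0 ]; then          # value 1: possible bashism
    echo "possible bashism detected"
  fi
  if [ $((status & 2)) -ne 0 ]; then          # value 2: a file was skipped
    echo "file skipped"
  fi
  return 0
}
```

A typical use would be to run `checkbashisms script.sh` (the script name is a placeholder) and then call `explain_checkbashisms_status $?` on the next line.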
SEE ALSO
lintian(1).
AUTHOR
checkbashisms was originally written as a shell script by Yann Dirson <dirson@debian.org> and rewritten in Perl with many more features by
Julian Gilbey <jdg@debian.org>.
DEBIAN Debian Utilities CHECKBASHISMS(1)