UNIX scripting for finding duplicates and null records in pk columns
Hi,
I have a requirement. For example, I have a text file that uses the pipe symbol (|) as its delimiter and has 4 columns: a, b, c, d. Columns a and b are the primary key columns.
I want to process that file to find duplicates and null values in the primary key columns (a, b). The unique records whose PK columns are not null should go into one file; the duplicate records and the records with null PK columns should go into another file.
Sample input: abc.txt
Output file 1: unique records
Output file 2: duplicate records and records with null PK columns
Please help me achieve this using a UNIX script.
Thanks
Last edited by Don Cragun; 05-10-2014 at 06:33 PM..
Reason: Add CODE tags.
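One possible approach is a two-pass awk script. This is only a sketch under some assumptions: the sample data below is made up (the contents of abc.txt were not shown), the file has no header line, "null" means an empty field, and every record whose (a,b) key occurs more than once is treated as a duplicate, including the first occurrence. The output names unique.txt and rejects.txt are placeholders.

```shell
#!/bin/sh
# Hypothetical sample data; the original post did not include the file contents.
cat > abc.txt <<'EOF'
1|x|foo|bar
2|y|foo|bar
1|x|baz|qux
|z|foo|bar
3||foo|bar
4|w|foo|bar
EOF

# The file is read twice.
# Pass 1 (NR == FNR): count how often each (a,b) key pair occurs.
# Pass 2: send records with an empty key field or a repeated key to "bad",
#         and everything else to "good".
awk -F'|' -v good=unique.txt -v bad=rejects.txt '
    NR == FNR { count[$1 FS $2]++; next }
    $1 == "" || $2 == "" || count[$1 FS $2] > 1 { print > bad; next }
    { print > good }
' abc.txt abc.txt
```

If the first occurrence of a repeated key should count as unique instead, a single pass with a `seen[key]++` test would be the usual variation.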
I have a huge file (over 30 MB) that I am processing with Perl. I am pulling out a list of filenames and placing it in an array called @reports.
I am fine up till here. What I then want to do is go through the array and find any duplicates. If there is a duplicate, output it to the screen.... (3 Replies)
I am trying to figure out how to scan a file like so:
1 ralphs office","555-555-5555","ralph@mail.com","www.ralph.com
2 margies office","555-555-5555","ralph@mail.com","www.ralph.com
3 kims office","555-555-5555","kims@mail.com","www.ralph.com
4 tims... (17 Replies)
I have millions of records, each containing exactly 50 characters, and I have to check the uniqueness of a 4-character substring of each record (its position is known in advance) and report any duplicates that are found.
Eg. data...
AAAA00000000000000XXXX0000 0000000000... upto50 chars... (2 Replies)
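A check like this could be sketched with awk's substr(). The start column (9) and the shortened sample records below are assumptions for illustration only; the real position and the 50-character layout are not given in the excerpt.

```shell
#!/bin/sh
# Hypothetical, shortened records; the key is assumed to occupy columns 9-12.
cat > records.txt <<'EOF'
AAAA0000XXXX0000
BBBB0000YYYY0000
CCCC0000XXXX0000
EOF

# Extract the key from each line and report every repeat after the first.
awk -v start=9 -v len=4 '
    { key = substr($0, start, len) }
    seen[key]++ { print "duplicate key " key " on line " NR }
' records.txt > dup_report.txt
```

The array holds one entry per distinct key, so memory use grows with the number of distinct keys rather than with the number of records.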
Hi,
can I do something like this to add a condition of checking if the 4th field is number or space or blank also:
awk -F, '$4 /^*||*/' MYFILE >> OTHERFILE
I also want the other part i.e. I need to exclude all lines whose 4th field is space or blank or number:
MYFILE
a,b,c,d,e
a,b,c,2,r... (2 Replies)
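A regex match on the field may be what is wanted here. The sketch below assumes "number" means an unsigned integer and "blank" means empty or only spaces/tabs; the last two sample lines and the name KEPTFILE are made up for the example.

```shell
#!/bin/sh
# First two lines are from the post; the last two are hypothetical.
cat > MYFILE <<'EOF'
a,b,c,d,e
a,b,c,2,r
a,b,c, ,r
a,b,c,,r
EOF

# Lines whose 4th field is empty, whitespace only, or all digits.
awk -F, '$4 ~ /^[ \t]*$/ || $4 ~ /^[0-9]+$/' MYFILE > OTHERFILE

# The complement: lines whose 4th field is anything else.
awk -F, '!($4 ~ /^[ \t]*$/ || $4 ~ /^[0-9]+$/)' MYFILE > KEPTFILE
```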
Hi,
I have a pipe-separated file.
I want to write code to display the count of lines whose 20th field is not null.
nawk -F"|" '{if ($20!="") print NR,$20}' xyz..txt
This displays records with 20th field also null.
I would like output as: (4 Replies)
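One guess at why "null" lines still show up is that the 20th field holds whitespace or a trailing carriage return rather than being truly empty; the excerpt does not confirm this. Under that assumption, a count could be sketched as follows (the sample rows are made up):

```shell
#!/bin/sh
# Hypothetical 20-field sample: field 20 is "x", a single space, empty, or "y".
{
  printf '1|2|3|4|5|6|7|8|9|10|11|12|13|14|15|16|17|18|19|x\n'
  printf '1|2|3|4|5|6|7|8|9|10|11|12|13|14|15|16|17|18|19| \n'
  printf '1|2|3|4|5|6|7|8|9|10|11|12|13|14|15|16|17|18|19|\n'
  printf '1|2|3|4|5|6|7|8|9|10|11|12|13|14|15|16|17|18|19|y\n'
} > xyz.txt

# Count lines whose 20th field contains something other than spaces, tabs, or CR.
# "n + 0" prints 0 instead of an empty string when nothing matches.
awk -F'|' '$20 !~ /^[ \t\r]*$/ { n++ } END { print n + 0 }' xyz.txt > count.txt
```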
I was trying to use the AIX 6.1 sort command to sort fixed-length data records, sorting by specific columns only. It took some time to figure out how to get it to work, so I wanted to share the solution. The sort man page wasn't much help, because it talks about field delimiters (default space... (1 Reply)
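The poster's actual solution is cut off above, so the following is not it, just one common way to sort on character columns with a POSIX sort: pick a separator that is assumed never to occur in the data, so the whole line is field 1, then address character positions inside that field. The sample records and the column range are made up.

```shell
#!/bin/sh
# Hypothetical fixed-length records: columns 1-4 are an id, columns 5-9 a name.
cat > fixed.txt <<'EOF'
0003SMITH
0001JONES
0002ADAMS
EOF

# -t '|' : a separator assumed absent from the data, so each line is one field.
# -k1.5,1.9 : sort on characters 5 through 9 of field 1.
LC_ALL=C sort -t '|' -k1.5,1.9 fixed.txt > sorted.txt
```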
I am currently creating a script to find filenames that are listed once in an input file (find non duplicates). I then want to report those single files in another file. Here is the function that I have so far:
function dups_filenames
{
file2=""
file1=""
file=""
dn=""
ch=""
pn=""
... (6 Replies)
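If the input is simply one filename per line, sort piped into uniq -u may be enough to list the names that occur exactly once. This is a sketch with made-up names (filelist.txt, singles.txt); the actual input layout is not shown in the excerpt.

```shell
#!/bin/sh
# Hypothetical input: one filename per line.
cat > filelist.txt <<'EOF'
report_a.txt
report_b.txt
report_a.txt
report_c.txt
EOF

# uniq only compares adjacent lines, so sort first; -u keeps lines seen once.
LC_ALL=C sort filelist.txt | uniq -u > singles.txt
```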
Hi team,
I have CSV files with 20 columns. I want to find the duplicates in such a file based on column1, column10, column4, column6, column8 and column2: if those columns have the same values, the record should be treated as a duplicate.
Can anyone help me find the duplicates?
Thanks in advance.
... (2 Replies)
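A composite key in awk is one way this could be done. The sketch assumes a simple CSV with no quoted fields containing commas and no header line, and it reports the second and later occurrences of each key; data.csv, dups.csv and the sample rows are made up.

```shell
#!/bin/sh
# Hypothetical 20-column rows; row1 and row2 agree on columns 1, 10, 4, 6, 8, 2.
{
  printf 'k1,k2,c3,k4,c5,k6,c7,k8,c9,k10,c11,c12,c13,c14,c15,c16,c17,c18,c19,row1\n'
  printf 'k1,k2,zz,k4,zz,k6,zz,k8,zz,k10,c11,c12,c13,c14,c15,c16,c17,c18,c19,row2\n'
  printf 'k1,k2,c3,k4,c5,k6,c7,k8,c9,other,c11,c12,c13,c14,c15,c16,c17,c18,c19,row3\n'
} > data.csv

# Build the key from the six columns; SUBSEP keeps adjacent values from merging.
awk -F, '
    { key = $1 SUBSEP $10 SUBSEP $4 SUBSEP $6 SUBSEP $8 SUBSEP $2 }
    seen[key]++ { print }
' data.csv > dups.csv
```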
Hi everyone. I'm trying to help my wife with a project. She has exported 200 images from many different folders. Unfortunately, there was a problem with the export, and I need to find the master versions so that she doesn't have to go through and select them again.
I need to:
For each image in... (2 Replies)
Discussion started by: Rhinoskin
LEARN ABOUT LINUX
lsns
LSNS(8) System Administration LSNS(8)
NAME
lsns - list namespaces
SYNOPSIS
lsns [options] [namespace]
DESCRIPTION
lsns lists information about all the currently accessible namespaces or about the given namespace. The namespace identifier is an inode
number.
The default output is subject to change. So whenever possible, you should avoid using default outputs in your scripts. Always explicitly
define expected columns by using the --output option together with a columns list in environments where a stable output is required.
Note that lsns reads information directly from the /proc filesystem and for non-root users it may return incomplete information. The
current /proc filesystem may be unshared and affected by a PID namespace (see unshare --mount-proc for more details). lsns is not able to see
persistent namespaces without processes where the namespace instance is held by a bind mount to /proc/pid/ns/type.
OPTIONS
-J, --json
Use JSON output format.
-l, --list
Use list output format.
-n, --noheadings
Do not print a header line.
-o, --output list
Specify which output columns to print. Use --help to get a list of all supported columns.
The default list of columns may be extended if list is specified in the format +list (e.g. lsns -o +PATH).
-p, --task pid
Display only the namespaces held by the process with this pid.
-r, --raw
Use the raw output format.
-t, --type type
Display the specified type of namespaces only. The supported types are mnt, net, ipc, user, pid, uts and cgroup. This option may
be given more than once.
-u, --notruncate
Do not truncate text in columns.
-V, --version
Display version information and exit.
-h, --help
Display help text and exit.
AUTHORS
Karel Zak <kzak@redhat.com>
SEE ALSO
nsenter(1), unshare(1), clone(2), namespaces(7)
AVAILABILITY
The lsns command is part of the util-linux package and is available from https://www.kernel.org/pub/linux/utils/util-linux/.
util-linux December 2015 LSNS(8)