Print number of lines for files in directory, also print number of unique lines
Post 303036810 by spacegoose on Thursday 11th of July 2019 04:19:39 PM
Quote:
Originally Posted by vgersh99
how about this:
Code:
#!/bin/ksh

wc -l * | sed '$d' | sort | while read lines file junk
do
   echo "$lines" $(sort < "$file" | uniq -u | wc -l) "$file"
done

FYI - I just tried this: it printed correct line counts, but the unique counts were off. I will check the others and update.

Quote:
Originally Posted by nezabudka
Code:
awk '{u[$0]; l++} ENDFILE {print length(u), l, FILENAME; delete u; l=0}' * | sort -k1,1n

Thanks nezabudka!! This seems to work with gawk. Thanks also to vgersh99 for pointing out gawk. I tried your other gawk version, but the unique counts are still off, as in your original solution. Maybe uniq is not being done in the correct order?



Quote:
Originally Posted by Don Cragun
Please always tell us what shell and operating system you're using when you start a new thread. Don't assume that everyone who wants to help you has read all of your previous threads.
Code:
#!/bin/bash
tmpf="/tmp/$$.result"

trap 'rm -f "$tmpf"' EXIT

awk '
function dump() {
	print linecount, distinct, lastfile
	linecount = distinct = 0
	split("", lines)
}

FILENAME != lastfile {
	if(lastfile)
		dump()
	lastfile = FILENAME
}

{	linecount++
	if(lines[$0]++ == 0)
		distinct++
}

END {	dump()
}' * > "$tmpf"

echo 'Sorted by increasing number of lines in files:'
sort -n "$tmpf"

echo 'Sorted by increasing number of distinct lines in files:'
sort -k2,2n "$tmpf"

Note that this should work with any version of awk (but on Solaris systems, you'll need to use nawk or /usr/xpg4/bin/awk).

Thanks Don Cragun -- this also works!


Quote:
Originally Posted by MadeInGermany
The following variant correctly handles filenames with special characters:
Code:
for f in *; do printf "%s/%s lines are unique in file %s\n" $(sort "$f" | uniq -u | wc -l) $(wc -l < "$f") "$f"; done

Post #3 has another perception of "unique":
Code:
for f in *; do printf "%s/%s unique lines in file %s\n" $(sort  -u "$f" | wc -l) $(wc -l < "$f") "$f"; done

Didn't see the "sort" requirement. Left as an exercise.

Thanks MadeInGermany, the first gives the same unique count as vgersh99's solution, and the second works for me. Maybe my perception of unique is incorrect.
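
For the "sort" requirement that was left as an exercise, something along these lines might do. This is only a rough sketch: the function name report_sorted is made up for illustration, it counts distinct lines (the second variant above), and it assumes file names contain no newlines. The name is printed last so that names with spaces do not disturb the sort key.

Code:
```shell
#!/bin/sh
# Hypothetical helper (name not from the thread): for each regular file in
# the directory given as $1, print "distinct-lines total-lines name",
# ordered by increasing number of distinct lines.
report_sorted() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue          # skip directories and other non-files
        # tr strips the padding some wc implementations put before the number
        printf '%s %s %s\n' \
            "$(sort -u "$f" | wc -l | tr -d ' ')" \
            "$(wc -l < "$f" | tr -d ' ')" \
            "${f##*/}"
    done | sort -k1,1n                   # numeric sort on the first column
}

report_sorted .
```

Changing the sort key to -k2,2n should order the output by total line count instead.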

I'm getting my unique count by:

Code:
sort filename | uniq | wc -l

The contents of my files are URLs if that makes a difference.
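
To illustrate why the counts differ, here is a minimal sketch. The helper names and the sample URLs are made up for the example, and it assumes mktemp is available. As far as I can tell, uniq -u prints only the lines that are not repeated at all, while sort | uniq (or sort -u) keeps one copy of every different line.

Code:
```shell
#!/bin/sh
# Hypothetical helpers for illustration; the names are not from the thread.

# Count distinct lines: every different line is counted once.
count_distinct() {
    sort -u "$1" | wc -l | tr -d ' '
}

# Count lines that occur exactly once: repeated lines are dropped entirely.
count_unrepeated() {
    sort "$1" | uniq -u | wc -l | tr -d ' '
}

# Made-up sample data: one URL listed twice, two URLs listed once.
sample=$(mktemp)
printf '%s\n' 'http://a.example/' 'http://b.example/' \
              'http://a.example/' 'http://c.example/' > "$sample"

echo "distinct:   $(count_distinct "$sample")"     # like sort | uniq | wc -l
echo "unrepeated: $(count_unrepeated "$sample")"   # like sort | uniq -u | wc -l
rm -f "$sample"
```

With this sample the distinct count is 3 and the unrepeated count is 2, because the repeated URL is counted once by the first helper and not at all by the second.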

Last edited by spacegoose; 07-11-2019 at 06:13 PM.