Need to sort text keeping first line always first


 
# 8  
Old 03-01-2014
Quote:
Originally Posted by Don Cragun
Try just having your first line in your input file have a <space> as the 1st character on the line and sort with the command:
Code:
sort -t '|'

Specifying a non-space field separator should make the leading space significant.

If you don't want a visible space at the start of the first line, make the first two characters in the file be a <space> character followed by a <backspace> character.
Thanks for the reminder. I actually had " \bHost = xxx, topdir=ddd etc"

but without -t, sort seems to interpret the line and treat it as if it began with Host=xxx ....

In 20 minutes, I will run the test as you suggested. The only change I have to make is to the sort parameters.

---------- Post updated at 07:23 PM ---------- Previous update was at 07:08 PM ----------

This is the top part of my file. Notice the !!^HHost= (^H=Backspace)

Code:
1to8.txt                         |20140223 0056|25d55ad283aa400af464c76d713c07ad|/home/leslie/Development/scandirFeb23/md5-c
alpha.txt                        |20140223 0056|83e065ac9ed97eca51391c20e9671373|/home/leslie/Development/scandirFeb23/md5-c
a.txt                            |20140223 0056|933222b19ff3e7ea5f65517ea1f7d57e|/home/leslie/Development/scandirFeb23/md5-c
crc32.c                          |20140223 0056|4d7a5dbb246898ff9d3ba19c0ded7f5b|/home/leslie/Development/scandirFeb23
crc32.h                          |20140223 0056|c15674694592358889120712db73be69|/home/leslie/Development/scandirFeb23
crc32.o                          |20140301 1912|10a49aede5f82d00205c1f89a8931731|/home/leslie/Development/scandirFeb23
DATE1                            |20140223 0056|e0167034133516d3ad5d61a09bae8156|/home/leslie/Development/scandirFeb23
DATE2                            |20140223 0056|e606fe0237c786174d2087090f81644a|/home/leslie/Development/scandirFeb23
daycalc.c                        |20140223 0056|1dd882b48e5c156748aba7fb38dbba51|/home/leslie/Development/scandirFeb23
dirdepth                         |20140301 1912|9f2ff1bd8b133ca0de8d124ad7d761d2|/home/leslie/Development/scandirFeb23
dirdepth.c                       |20140223 0056|a7c3f1c02245aec9a1b651e11018ff82|/home/leslie/Development/scandirFeb23
dirent.h                         |20140223 0056|1906fd554bf036fdf6ffd0b054ca321d|/home/leslie/Development/scandirFeb23
empty.txt                        |20140223 0056|d41d8cd98f00b204e9800998ecf8427e|/home/leslie/Development/scandirFeb23/md5-c
gcc.txt                          |20140223 0056|b8917c1a087abbf74f0294dad9cbf698|/home/leslie/Development/scandirFeb23
!! Host=Fedora20.Bachelor           |          |                       scan from| ^H/home/leslie/Development/scandirFeb23
inih_r27.tar                     |20140223 0056|a8da6db331c8fe638cbb8c6940ce303e|/home/leslie/Development/scandirFeb23
inih_r28Dec16.00.tar             |20140223 0056|6fe6356f0ba2e501c2713958f119d493|/home/leslie/Development/scandirFeb23
itcrftn.c                        |20140223 0056|b1f1444cfdc35b6427ad3b002a176e9f|/home/leslie/Development/scandirFeb23
log.txt                          |20140223 0056|187c3fdbde0febf71257f0c0da9e21e7|/home/leslie/Development/scandirFeb23/md5-c
makefile                         |20140223 0056|26955e927da56d1343af738a247b87e1|/home/leslie/Development/scandirFeb23/md5-c
makefile                         |20140301 1912|02c8266eb8c3d3b52eabb30378ef9895|/home/leslie/Development/scandirFeb23
md5

Is the sort program too smart? I used '!' because it collates low, so that my title line would stay first. I also tried ' ' as the first character, piping to sort as ... | sort -t '|' > x

Last edited by Scrutinizer; 03-01-2014 at 09:23 PM.. Reason: quote tags to code tags
# 9  
Old 03-01-2014
With the data shown in message #4 in this thread in a file named input.txt, I get the following data saved in the file named x from the command:
Code:
sort -t '|' -o x input.txt

Code:
!! Host=Fedora20.Bachelor           |          |                       scan from| ^H/home/leslie/Development/scandirFeb23
1to8.txt                         |20140223 0056|25d55ad283aa400af464c76d713c07ad|/home/leslie/Development/scandirFeb23/md5-c
DATE1                            |20140223 0056|e0167034133516d3ad5d61a09bae8156|/home/leslie/Development/scandirFeb23
DATE2                            |20140223 0056|e606fe0237c786174d2087090f81644a|/home/leslie/Development/scandirFeb23
a.txt                            |20140223 0056|933222b19ff3e7ea5f65517ea1f7d57e|/home/leslie/Development/scandirFeb23/md5-c
alpha.txt                        |20140223 0056|83e065ac9ed97eca51391c20e9671373|/home/leslie/Development/scandirFeb23/md5-c
crc32.c                          |20140223 0056|4d7a5dbb246898ff9d3ba19c0ded7f5b|/home/leslie/Development/scandirFeb23
crc32.h                          |20140223 0056|c15674694592358889120712db73be69|/home/leslie/Development/scandirFeb23
crc32.o                          |20140301 1912|10a49aede5f82d00205c1f89a8931731|/home/leslie/Development/scandirFeb23
daycalc.c                        |20140223 0056|1dd882b48e5c156748aba7fb38dbba51|/home/leslie/Development/scandirFeb23
dirdepth                         |20140301 1912|9f2ff1bd8b133ca0de8d124ad7d761d2|/home/leslie/Development/scandirFeb23
dirdepth.c                       |20140223 0056|a7c3f1c02245aec9a1b651e11018ff82|/home/leslie/Development/scandirFeb23
dirent.h                         |20140223 0056|1906fd554bf036fdf6ffd0b054ca321d|/home/leslie/Development/scandirFeb23
empty.txt                        |20140223 0056|d41d8cd98f00b204e9800998ecf8427e|/home/leslie/Development/scandirFeb23/md5-c
gcc.txt                          |20140223 0056|b8917c1a087abbf74f0294dad9cbf698|/home/leslie/Development/scandirFeb23
inih_r27.tar                     |20140223 0056|a8da6db331c8fe638cbb8c6940ce303e|/home/leslie/Development/scandirFeb23
inih_r28Dec16.00.tar             |20140223 0056|6fe6356f0ba2e501c2713958f119d493|/home/leslie/Development/scandirFeb23
itcrftn.c                        |20140223 0056|b1f1444cfdc35b6427ad3b002a176e9f|/home/leslie/Development/scandirFeb23
log.txt                          |20140223 0056|187c3fdbde0febf71257f0c0da9e21e7|/home/leslie/Development/scandirFeb23/md5-c
makefile                         |20140223 0056|26955e927da56d1343af738a247b87e1|/home/leslie/Development/scandirFeb23/md5-c
makefile                         |20140301 1912|02c8266eb8c3d3b52eabb30378ef9895|/home/leslie/Development/scandirFeb23
md5

The order of lines in the output is the same when the line containing Host= is:
Code:
!! Host=Fedora20.Bachelor | | scan from| ^H/home/leslie/Development/scandirFeb23
 ^HHost=Fedora20.Bachelor | | scan from| ^H/home/leslie/Development/scandirFeb23
        or
 Host=Fedora20.Bachelor | | scan from| ^H/home/leslie/Development/scandirFeb23

The system I'm using for this test is Mac OS X 10.7.5 running on a MacBook Pro laptop. This is the output I would expect for any sort utility conforming to the POSIX standards.

Note that the <space><backspace> before: /home/leslie/Development/scandirFeb23 in that line doesn't matter unless you're sorting on that field with something like:
Code:
sort -t '|' -k4 -o x input.txt
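The contrast between this output and the GNU sort output in message #8 is consistent with a difference in collation locale, although nothing in the thread confirms which locale either system was using. As a minimal sketch, assuming a POSIX shell and a hypothetical three-line sample modelled on the data above, the following shows the byte-value ordering that the C locale produces:

```shell
#!/bin/sh
# Hypothetical three-line sample modelled on the data in this thread.
sample=$(mktemp)
printf '%s\n' \
  'a.txt     |20140223 0056' \
  'DATE1     |20140223 0056' \
  '!! Host=Fedora20.Bachelor |          ' > "$sample"

# Report the locale settings the current shell would pass on to sort.
echo "LANG=${LANG:-unset} LC_ALL=${LC_ALL:-unset} LC_COLLATE=${LC_COLLATE:-unset}"

# Byte-value (C locale) ordering: '!' (0x21) < 'D' (0x44) < 'a' (0x61).
c_order=$(LC_ALL=C sort -t '|' "$sample")
printf '%s\n' "$c_order"

rm -f "$sample"
```

If the same file sorts differently without the LC_ALL=C prefix, the locale is the likely cause rather than the -t option.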

# 10  
Old 03-02-2014
GNU sort appears to behave differently

I ran sort as you indicated and it worked.

I have tried to understand why you used -k4

I was trying to sort by the first column. And for all that I tried,
the Host=Fedora .... was placed somewhere in the middle of the output file.

If I sorted on fields 2,3,4, the sort yields what I require.

If I do not specify "-b" with the sort, it should treat the leading blanks as part of the field and should not skip over them.

Thanks for your patience and help.

Last edited by lsatenstein; 03-02-2014 at 01:16 AM..
# 11  
Old 03-02-2014
Quote:
Originally Posted by lsatenstein
I ran sort as you indicated and it worked.
I'm glad that it worked for you.
Quote:
Originally Posted by lsatenstein
I have tried to understand why you used -k4
In the input sample you provided the header line was:
Code:
!! Host=Fedora20.Bachelor | | scan from| ^H/home/leslie/Development/scandirFeb23

What I was saying was that the <space><backspace> at the start of the 4th field (shown as " ^H" above) doesn't make any difference unless you're sorting on the 4th field instead of the 1st field. I didn't understand why those characters were present in your sample input.
Quote:
Originally Posted by lsatenstein
I was trying to sort by the first column. And for all that I tried,
the Host=Fedora .... was placed somewhere in the middle of the output file.

If I sorted on fields 2,3,4, the sort yields what I require.

If I do not specify "-b" with the sort, it should treat the leading blanks as part of the field and should not skip over them.

Thanks for your patience and help.
Yes. You are correct. If sort on your system behaved as specified by the standards, the -t option should not be needed in this case. I suggested using the -t option because of disedorgue's comment in message #4 in this thread:
Quote:
There was some version of "gnu sort" with a bug where the '-b' option was enabled by default.
Since there is an interaction between the -b and -t options (although it isn't as clearly specified in the Linux sort(1) man page as it is in the POSIX sort utility man page), I thought that if your version of sort did have this bug, using the -t option might provide a workaround.
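To make the role of leading blanks concrete, here is a small sketch of the documented effect of the b key modifier when -t is in use. It does not reproduce the alleged bug; the two records are hypothetical, and LC_ALL=C is set so that the comparison is by byte value:

```shell
#!/bin/sh
# Two hypothetical records; field 2 of the first one has a leading space.
data='x| b
y|a'

# Without b, the leading space is part of the key: " b" sorts before "a".
plain=$(printf '%s\n' "$data" | LC_ALL=C sort -t '|' -k2,2)

# With the b modifier, leading blanks are skipped: "b" sorts after "a".
skipped=$(printf '%s\n' "$data" | LC_ALL=C sort -t '|' -k2b,2)

printf 'without b:\n%s\nwith b:\n%s\n' "$plain" "$skipped"
```

A sort that behaved as if b were always on would give the second ordering for both commands.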
# 12  
Old 03-02-2014
I guess I was bleary-eyed last night when I indicated that everything was OK, but ....
Sorting for columns 2 through 4 works fine.

How do I make the sort work for column 1?
Do I need to add a leading | symbol before column 1?

Please refer to the list I posted yesterday, a few messages back.
Is the problem me or the sort's limitations?
# 13  
Old 03-02-2014
Quote:
Originally Posted by lsatenstein
I guess I was bleary-eyed last night when I indicated that everything was OK, but ....
Sorting for columns 2 through 4 works fine.

How do I make the sort work for column 1?
Do I need to add a leading | symbol before column 1?

Please refer to the list I posted yesterday, a few messages back.
Is the problem me or the sort's limitations?
I assume that you're referring to the following which is from message #8 in this thread:
Quote:
This is the top part of my file. Notice the !!^HHost= (^H=Backspace)


Code:
1to8.txt                         |20140223 0056|25d55ad283aa400af464c76d713c07ad|/home/leslie/Development/scandirFeb23/md5-c
alpha.txt                        |20140223 0056|83e065ac9ed97eca51391c20e9671373|/home/leslie/Development/scandirFeb23/md5-c
a.txt                            |20140223 0056|933222b19ff3e7ea5f65517ea1f7d57e|/home/leslie/Development/scandirFeb23/md5-c
crc32.c                          |20140223 0056|4d7a5dbb246898ff9d3ba19c0ded7f5b|/home/leslie/Development/scandirFeb23
crc32.h                          |20140223 0056|c15674694592358889120712db73be69|/home/leslie/Development/scandirFeb23
crc32.o                          |20140301 1912|10a49aede5f82d00205c1f89a8931731|/home/leslie/Development/scandirFeb23
DATE1                            |20140223 0056|e0167034133516d3ad5d61a09bae8156|/home/leslie/Development/scandirFeb23
DATE2                            |20140223 0056|e606fe0237c786174d2087090f81644a|/home/leslie/Development/scandirFeb23
daycalc.c                        |20140223 0056|1dd882b48e5c156748aba7fb38dbba51|/home/leslie/Development/scandirFeb23
dirdepth                         |20140301 1912|9f2ff1bd8b133ca0de8d124ad7d761d2|/home/leslie/Development/scandirFeb23
dirdepth.c                       |20140223 0056|a7c3f1c02245aec9a1b651e11018ff82|/home/leslie/Development/scandirFeb23
dirent.h                         |20140223 0056|1906fd554bf036fdf6ffd0b054ca321d|/home/leslie/Development/scandirFeb23
empty.txt                        |20140223 0056|d41d8cd98f00b204e9800998ecf8427e|/home/leslie/Development/scandirFeb23/md5-c
gcc.txt                          |20140223 0056|b8917c1a087abbf74f0294dad9cbf698|/home/leslie/Development/scandirFeb23
!! Host=Fedora20.Bachelor           |          |                       scan from| ^H/home/leslie/Development/scandirFeb23
inih_r27.tar                     |20140223 0056|a8da6db331c8fe638cbb8c6940ce303e|/home/leslie/Development/scandirFeb23
inih_r28Dec16.00.tar             |20140223 0056|6fe6356f0ba2e501c2713958f119d493|/home/leslie/Development/scandirFeb23
itcrftn.c                        |20140223 0056|b1f1444cfdc35b6427ad3b002a176e9f|/home/leslie/Development/scandirFeb23
log.txt                          |20140223 0056|187c3fdbde0febf71257f0c0da9e21e7|/home/leslie/Development/scandirFeb23/md5-c
makefile                         |20140223 0056|26955e927da56d1343af738a247b87e1|/home/leslie/Development/scandirFeb23/md5-c
makefile                         |20140301 1912|02c8266eb8c3d3b52eabb30378ef9895|/home/leslie/Development/scandirFeb23
md5

Compare the description above the code with the header line in the data. Your description of the sample input shows !!\bHost= (where \b is the backspace character) but the data in the file shows !! Host= (with a space instead of a backspace). Note that if that file did contain a backspace before Host= instead of a space, then sorting that file using the command:
Code:
sort -fd input_file

would produce output exactly matching the input you showed us.
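As a rough illustration of why -fd would give that order: -d compares only blanks and alphanumeric characters, and -f folds lowercase to uppercase, so the header is compared as if it began with HOST. The sketch below assumes a POSIX shell, runs in the C locale, and uses a hypothetical subset of the file names with a shortened header:

```shell
#!/bin/sh
# Hypothetical subset of the names from the thread. \b in the printf format
# emits a backspace, giving a "!!<backspace>Host=" header line.
input=$(mktemp)
printf 'a.txt\nalpha.txt\ncrc32.c\nDATE1\ngcc.txt\n!!\bHost=x\ninih.tar\n' > "$input"

# -d ignores '!', '.', '=' and the backspace; -f folds case.
# The header is compared as "HOSTX", between "GCCTXT" and "INIHTAR".
folded=$(LC_ALL=C sort -fd "$input")

# od -c makes the backspace visible as \b.
printf '%s\n' "$folded" | od -c

rm -f "$input"
```

Note that a.txt lands after alpha.txt here as well, because the dot is ignored and "ALPHATXT" compares lower than "ATXT".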

Please upload the exact sample input_file you're using, show us the exact command line you're using to sort that file, the exact output you're getting from that command line, and the exact output you're trying to get.

Unless you want exclamation points in the header in your output, please use a space followed by a backspace as the 1st two characters on the header line instead of two exclamation points followed by a space.
# 14  
Old 03-03-2014
Sort anomaly

Hi Don

Thank you for your posting. I too discovered the -f option this evening, and then I read your response when I came to post my finding.

The data you see above went through many iterations as I tried to get it to meet my requirements. My original output was produced by a printf statement beginning printf( " \bHOST=%s... ..); (one space and one backspace before the H)
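For anyone wanting to check what such a header line really contains, a shell equivalent of that printf call can be sketched as follows. The field layout is simplified and the variable names are only placeholders:

```shell
#!/bin/sh
# Placeholder values standing in for the real host name and top directory.
host='Fedora20.Bachelor'
topdir='/home/leslie/Development/scandirFeb23'

# \b in the printf format is a backspace, so the line starts with bytes 0x20 0x08.
header=$(printf ' \bHost=%s |          | scan from|%s' "$host" "$topdir")

# Dump the first two bytes to confirm what sort will actually see.
printf '%s' "$header" | od -An -tx1 -N2
```

On a terminal the line displays as if it began with Host=, but sort still compares the space and the backspace unless an option or the locale tells it to ignore them.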

From what I understood, the sort command, if issued against x.raw, the file to be sorted, with the following command line

Code:
   sort -k1 -t '|' -o x.sorted  x.raw

should keep the first line invariant, but it does not. It appears to require the -f to almost meet my needs.
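A different approach, not the one discussed so far in this thread, is to keep the header out of the sort altogether, so that its position no longer depends on collation. A minimal sketch, assuming a POSIX shell and a hypothetical four-line file whose first line is the header:

```shell
#!/bin/sh
# Hypothetical raw file: header on line 1, data lines after it.
raw=$(mktemp)
printf '%s\n' \
  'Host=Fedora20.Bachelor |          | scan from' \
  'crc32.c   |20140223 0056' \
  'DATE1     |20140223 0056' \
  'a.txt     |20140223 0056' > "$raw"

result=$(
  {
    IFS= read -r header          # consume the first line only
    printf '%s\n' "$header"      # pass it through untouched
    LC_ALL=C sort -t '|' -k1,1   # sort reads the remaining lines from the same stdin
  } < "$raw"
)
printf '%s\n' "$result"

rm -f "$raw"
```

With this arrangement the header needs no special leading characters at all.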


The manual states that -f folds upper- and lowercase letters together for comparison.

I really was after the ASCII collating sequence. Ergo, with the -f option, DATE1 and DATE2 are in the wrong place, but the first line is maintained as was desired.

Is it possible that the sort is missing an option to "just sort a column", purely respecting the ascii contents of the field?
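One setting that may give this behaviour without a new option is the locale: with LC_ALL=C in the environment, sort compares raw byte values, which for ASCII text is the ASCII collating sequence. The sketch below assumes a POSIX shell and a hypothetical subset of the data, with the <space><backspace> header deliberately placed in the middle of the input:

```shell
#!/bin/sh
raw=$(mktemp)
{
  printf '%s\n' 'gcc.txt   |20140223 0056' 'DATE1     |20140223 0056'
  # Header line starting with <space><backspace>.
  printf ' \bHost=Fedora20.Bachelor |          | scan from\n'
  printf '%s\n' 'a.txt     |20140223 0056' '1to8.txt  |20140223 0056'
} > "$raw"

# LC_ALL=C selects byte-value comparison; -k1,1 limits the key to field 1.
ascii_sorted=$(LC_ALL=C sort -t '|' -k1,1 "$raw")

# sed -n l prints the backspace as \b and marks each line end with $.
printf '%s\n' "$ascii_sorted" | sed -n l

rm -f "$raw"
```

The space (0x20) sorts before the digits, the digits before uppercase letters, and uppercase before lowercase, so the header comes first and DATE1 precedes a.txt.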

If the answer to the above is no, then if it were up to me, I would request a -e parameter (for use with -t). It would stop sort from interpreting leading blanks and non-alpha characters.

In closing, thanks for your help and for the others in the forum who responded.

Last edited by Scrutinizer; 03-03-2014 at 01:17 AM.. Reason: code tags