Random lines selection from a file. Post 302203884 by Perderabo on Tuesday 10th of June 2008 06:00:14 AM
Prepend a random number to each line, sort the file, take the first few lines, and remove the leading random number.

Code:
 awk 'BEGIN {srand()} {printf "%05.0f %s\n", rand()*99999, $0}' datafile | sort -n | head -n 100 | sed 's/^[0-9]* //'
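
The same decorate, sort, select, undecorate steps can be wrapped in a small shell function so the number of lines becomes a parameter. This is only a sketch: the function name random_lines and the 9-digit key width are made up for illustration and are not from the original post.

```shell
#!/bin/sh
# random_lines N [FILE...]: print N lines picked at random from FILE (or stdin).
# The name and the 9-digit key width are illustrative assumptions.
random_lines() {
    n=$1
    shift
    # Decorate: prefix every line with a zero-padded random integer key.
    awk 'BEGIN { srand() } { printf "%09d %s\n", int(rand() * 1000000000), $0 }' "$@" |
        sort -n |            # Sort: order the lines by their random key.
        head -n "$n" |       # Select: keep only the first N lines.
        sed 's/^[0-9]* //'   # Undecorate: strip the key and the space after it.
}

# Example: pick 3 of the numbers 1..20 (the output differs from run to run).
seq 1 20 | random_lines 3
```

A wider key makes ties between lines less likely than with five digits. Note that srand() with no argument seeds from the time of day, so two runs started within the same second can return the same selection.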
