parsing data from a big file using keys from another smaller file Post: 302511411

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Parsing the data in a file

Hi, I have file (FILE.tmp) having contents, FILE.tmp ======== filename=menudata records=0000000000037 ldbname=pinsys timestamp=2005/05/14-18:32:33 I want to parse it bring a new file which will look like, filename records ldbname timestamp...

2. Shell Programming and Scripting

Big data file - sed/grep/awk?

Morning guys. Another day another question. I am knocking up a script to pull some data from a file. The problem is the file is very big (up to 1 gig in size), so this solution: for results in `grep "^\ ... works, but takes ages (we're talking minutes) to run. The data is held...

3. Shell Programming and Scripting

perl help to split big verilog file into smaller ones for each module

Hi, I have a big Verilog file with multiple modules. Each module begins with the code word 'module <module-name>(ports,...)' and ends with the 'endmodule' keyword. Could you please suggest the best way to split each of these modules into multiple files? Thank you for the help. Example of...

4. Shell Programming and Scripting

How to cut some data from big file

How can I cut data from a big file? My file is around 30 GB. I tried "head -50022172 filename > newfile.txt" and then "tail -5454283 newfile.txt", but it is slow. After that I tried sed -n '46467831,50022172p' filename > newfile.txt, which is also slow. Please recommend a faster command to cut some data from...

5. Shell Programming and Scripting

Helping in parsing subset of text from a big results file

Hi All, I need some help to effectively parse out a subset of results from a big results file. Below is an example of the text file. Each block that I need to parse starts with "reading sequence file 10.codon" (next block starts with another number) and ends with **p-Value(s)**. I have given...

6. Shell Programming and Scripting

Sort a big data file

Hello, I have a big data file (160 MB) full of records with pipe (|) delimited fields. I'm sorting the file on the first field. I'm trying to sort with the "sort" command and it takes 6 minutes. I have tried some transformation methods in Perl, but they result in "Out of memory". I was...

7. Shell Programming and Scripting

Segment a big file into smaller ones

Greetings to all. I have a big text file that I would like to segment into many smaller files. Each file should be at most 10 000 lines. The file is called time.txt. After the execution I would like to have time_01.txt, time_02.txt, ..., time_n.txt. Can anybody help? Br.

8. Shell Programming and Scripting

parsing characters and number from a big file with brackets

I have a big file with many brackets () in it from which I need to parse number characters and numbers. Below is an example of my file 14 (((A__0:0.02,B__1:0.3)0:0.04,C__0:0.025)2:0.01),(D__0:0.00978,E__2:0.01031)1:0.00362; 15...

9. Shell Programming and Scripting

Parsing data using keys from one file

I have 2 text files where I need to parse data from file 2 using the data from file 1. Below are my sample files File 1 (tab delimited) 257 350 670 845 725 1025 767 820 ... .... .... file 2 (tab delimited) 220..450 TA AB650 ABCED 520..850 GA AB720 ABCDE 700..1100 TC AB820 ABCDE...

10. Shell Programming and Scripting

Extract data according to keys from filename mentioned in file

Hello experts, I want to join a file with files whose names are mentioned in one of the columns of the same file. File 1 t1,a,b,file number 1 t1,a,c,file number 1 t2,c,d,file number 2 t2,c,e,file number 2 t2,c,f,file number 2 t2,c,g,file number 2 t3,e,f,file number 3 file number 1...

LEARN ABOUT DEBIAN

plan9-sort

SORT(1) 						      General Commands Manual							   SORT(1)

NAME

       sort - sort and/or merge files

SYNOPSIS

       sort [ -cmuMbdfinrwtx ] [ +pos1 [ -pos2 ] ...  ] ...  [ -k pos1 [ ,pos2 ] ] ...
	    [ -o output ] [ -T dir ...  ] [ option ...  ] [ file ...	]

DESCRIPTION

       Sort  sorts  lines of all the files together and writes the result on the standard output.  If no input files are named, the standard input
       is sorted.

       The default sort key is an entire line.	Default ordering is lexicographic by runes.  The ordering is affected globally	by  the  following
       options, one or more of which may appear.

       -M     Compare  as  months.  The first three non-white space characters of the field are folded to upper case and compared so that JAN
	      precedes FEB, etc.  Invalid fields compare low to JAN.

       -b     Ignore leading white space (spaces and tabs) in field comparisons.

       -d     `Phone directory' order: only letters, accented letters, digits and white space are significant in comparisons.

       -f     Fold lower case letters onto upper case.	Accented characters are folded to their non-accented upper case form.

       -i     Ignore characters outside the ASCII range 040-0176 in non-numeric comparisons.

       -w     Like -i, but ignore only tabs and spaces.

       -n     An initial numeric string, consisting of optional white space, optional plus or minus sign, and zero or more  digits  with  optional
	      decimal point, is sorted by arithmetic value.

       -g     Numbers, like -n but with optional e-style exponents, are sorted by value.

       -r     Reverse the sense of comparisons.

       -tx    `Tab character' separating fields is x.
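       The effect of the global ordering options can be seen on a small input.  The sketch below is not taken from this manual: it assumes a
       POSIX-style sort (for example GNU sort) on the PATH, uses made-up sample data, and sets LC_ALL=C so that the default ordering is by bytes.

```shell
#!/bin/sh
# Assumption: a POSIX-style sort(1) such as GNU sort; LC_ALL=C pins byte ordering.
LC_ALL=C; export LC_ALL

# Hypothetical sample data, one value per line.
data='10
9
-3
2.5'

# Default ordering: lines are compared as text, character by character.
lexical=$(printf '%s\n' "$data" | sort)

# -n: the initial numeric string is compared by arithmetic value.
numeric=$(printf '%s\n' "$data" | sort -n)

# -r: reverse the sense of the comparison (here combined with -n).
reversed=$(printf '%s\n' "$data" | sort -nr)

printf 'lexical:\n%s\nnumeric:\n%s\nreversed:\n%s\n' "$lexical" "$numeric" "$reversed"
```

       Under text ordering "10" sorts before "9" because only the first characters are compared; -n is what makes 9 precede 10.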

       The  notation  +pos1 -pos2 restricts a sort key to a field beginning at pos1 and ending just before pos2.  Pos1 and pos2 each have the form
       m.n, optionally followed by one or more of the flags Mbdfginr, where m tells a number of fields to skip from the beginning of the line  and
       n  tells  a  number of characters to skip further.  If any flags are present they override all the global ordering options for this key.  A
       missing .n means .0; a missing -pos2 means the end of the line.	Under the -tx option, fields are strings separated by x; otherwise  fields
       are  non-empty strings separated by white space.  White space before a field is part of the field, except under option -b.  A b flag may be
       attached independently to pos1 and pos2.

       The notation -k pos1[,pos2] is how POSIX sort defines fields: pos1 and pos2 have the same format but different meanings.  The value of m is
       origin 1 instead of origin 0 and a missing .n in pos2 is the end of the field.
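       As a rough illustration of the two notations, the sketch below restricts the key to the second colon-separated field.  It uses only the
       -k form, because some implementations (GNU sort among them) may reject the older +pos form by default; the sample records and variable
       names are hypothetical.

```shell
#!/bin/sh
# Assumption: a POSIX-style sort(1) that understands -t and -k.
LC_ALL=C; export LC_ALL

# Hypothetical colon-separated records: id:name:group
users='3:carol:staff
1:alice:adm
2:bob:staff'

# -k2,2 : the key starts at field 2 and ends at the end of field 2 (origin 1).
# In the +pos notation the same key would be written "+1 -2" (origin 0):
# skip one field, stop just before skipping two.
by_name=$(printf '%s\n' "$users" | sort -t: -k2,2)

# Flags attached to a key apply to that key only: field 1, numeric, reversed.
by_id_desc=$(printf '%s\n' "$users" | sort -t: -k1,1nr)

printf 'by name:\n%s\nby id, descending:\n%s\n' "$by_name" "$by_id_desc"
```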

       When  there  are multiple sort keys, later keys are compared only after all earlier keys compare equal.	Lines that otherwise compare equal
       are ordered with all bytes significant.
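       A minimal sketch of multiple keys, assuming a POSIX-style sort and invented sample data: the first key orders by group name, and the
       second key is consulted only for lines whose group names compare equal.

```shell
#!/bin/sh
# Assumption: a POSIX-style sort(1) that understands -t and -k.
LC_ALL=C; export LC_ALL

# Hypothetical records: group:count
records='staff:7
adm:10
staff:30
adm:9'

# Key 1: field 1 as text.  Key 2: field 2 by numeric value, used only on ties in key 1.
two_keys=$(printf '%s\n' "$records" | sort -t: -k1,1 -k2,2n)

# For comparison: no keys, so whole lines are compared as text.
whole_line=$(printf '%s\n' "$records" | sort)

printf 'two keys:\n%s\nwhole line:\n%s\n' "$two_keys" "$whole_line"
```

       With two keys, adm:9 precedes adm:10; with the whole line as the key, the text comparison puts adm:10 first.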

       These option arguments are also understood:

       -c	  Check that the single input file is sorted according to the ordering rules; give no output unless the file is out of sort.

       -m	  Merge; assume the input files are already sorted.

       -u	  Suppress all but one in each set of equal lines.  Ignored bytes and bytes outside keys do not participate in this comparison.

       -o	  The next argument is the name of an output file to use instead of the standard output.  This file may be the same as one of  the
		  inputs.

       -Tdir	  Put temporary files in dir rather than in /var/tmp.
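       The following sketch exercises -m, -u and -o together.  It assumes a POSIX-style sort and the widely available (but not POSIX) mktemp
       command; the file names and contents are made up for the example.

```shell
#!/bin/sh
# Assumptions: a POSIX-style sort(1); mktemp(1) for a scratch directory.
LC_ALL=C; export LC_ALL
tmp=$(mktemp -d) || exit 1

# Two inputs that are each already sorted.
printf '%s\n' apple cherry > "$tmp/a"
printf '%s\n' banana cherry date > "$tmp/b"

# -m merges without re-sorting; -u keeps one line from each set of equal lines.
merged=$(sort -m -u "$tmp/a" "$tmp/b")

# -o may name one of the inputs, so a file can be sorted onto itself.
printf '%s\n' pear fig > "$tmp/c"
sort -o "$tmp/c" "$tmp/c"
in_place=$(cat "$tmp/c")

rm -rf "$tmp"
printf 'merged:\n%s\nin place:\n%s\n' "$merged" "$in_place"
```

       Redirecting with "sort file > file" would truncate the input before it is read; -o exists to avoid that.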

EXAMPLES

       sort -u +0f +0 list
	      Print in alphabetical order all the unique spellings in a list of words where capitalized words differ from uncapitalized.

       sort -t: +1 /adm/users
	      Print the users file sorted by user name (the second colon-separated field).

       sort -umM dates
	      Print the first instance of each month in an already sorted file.  Options -um with just one input file make the choice of a
	      unique representative from a set of equal lines predictable.

       grep -n '^' input | sort -t: +1f +0n | sed 's/[0-9]*://'
	      A stable sort: input lines that compare equal will come out in their original order.
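       The stable-sort pipeline can be tried with the -k notation, which may be more portable than +pos on current systems.  This is a sketch
       under that assumption, with invented input: grep -n numbers each line, the folded text is the primary key, and the line number breaks
       ties so that equal lines keep their input order.

```shell
#!/bin/sh
# Assumption: POSIX-style grep, sort and sed; -k2,2f -k1,1n stands in for "+1f +0n".
LC_ALL=C; export LC_ALL

# Hypothetical input: words that differ only in case, in a deliberate order.
input='banana
Apple
apple
BANANA
APPLE'

# Decorate with line numbers, sort on the folded word and then on the number, undecorate.
stable=$(printf '%s\n' "$input" |
	grep -n '^' |
	sort -t: -k2,2f -k1,1n |
	sed 's/^[0-9]*://')

printf '%s\n' "$stable"
```

       Lines that fold to the same word (Apple, apple, APPLE) come out in the order they went in.  The input must not rely on colons inside
       the sorted text, since the colon is also the field separator here.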

FILES

       /var/tmp/sort.<pid>.<ordinal>

SOURCE

       /src/cmd/sort.c

SEE ALSO

       uniq(1), look(1)

DIAGNOSTICS

       Sort comments and exits with non-null status for various trouble conditions and for disorder discovered under option -c.
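       A script can act on the exit status of sort -c.  The sketch below assumes a POSIX-style sort; the variable names are illustrative only.

```shell
#!/bin/sh
# Assumption: a POSIX-style sort(1); -c exits with non-zero status on disorder.
LC_ALL=C; export LC_ALL

# Input already in order: -c gives no output and the status stays 0.
ordered_status=0
printf '%s\n' a b c | sort -c 2>/dev/null || ordered_status=$?

# Input out of order: -c comments on standard error and exits with non-zero status.
disordered_status=0
printf '%s\n' b a | sort -c 2>/dev/null || disordered_status=$?

printf 'ordered: %s\ndisordered: %s\n' "$ordered_status" "$disordered_status"
```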

BUGS

       An  external  null character can be confused with an internally generated end-of-field character.  The result can make a sub-field not sort
       less than a longer field.

       Some of the options, e.g.  -i and -M, are hopelessly provincial.

																	   SORT(1)

