Combining chunks of data Post: 302685545

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Compare two csv files by two colums and create third file combining data from them.

I've got two large csv text table files with different number of columns each. I have to compare them based on first two columns and create resulting file that would in case of matched first two columns include all values from first one and all values (except first two colums) from second one. I...

2. Shell Programming and Scripting

remove chunks of text from file

All, So, I have an ldif file that contains about 6500 users worth of data. Some users have a block of text I'd like to remove, while some don't. Example (block of text in question is the block starting with "authAuthority: ;Kerberosv5"): User with text block: # username, users,...

3. Shell Programming and Scripting

Combining header and data and send email without usage of temp file

Dear All- My requirement is as below- Header file $ cat HEADER.txt RequestId: RequestDate: Data file $ cat DATAVAL.txt 1001|2009-03-01 I need to send the combined data below as email body via mailx command ------------------ RequestId:1001 RequestDate:2009-03-01 I would like...

4. Shell Programming and Scripting

Parsing chunks of text and finding data

Hi, I need a script that parses and greps data out of a textfile. I have a text file that has this structure: File1 host1.localdomain text random text Found errors this text is random (41123) --- random random at.5165 ---- random random at.5165 ---- random random at.5165 ----...

5. Shell Programming and Scripting

Combining data from file

Hi All, I have file f1 contains : f1; Server Name1 te-1212hdsfhf-12kll-56565 Server Name2 jd-1212hdsfhf-12kll-5677 Server Name3 ty-1212hdsfhf-12kll-444 .... I have to produce f2 with the output: f2: Server1 te-1212hdsfhf-12kll-56565 Server2 jd-1212hdsfhf-12kll-5677 Server3...

6. Shell Programming and Scripting

onstat -d chunks perl monitor

I have a file I need to monitor with a perl script with the following format. I need to send off a 0 if it is above 95 in the 5th colum and a 1 if it is below. Any help on a simple perl script would be great. 75424958 999975 983170 /dev/rmetrochunk00 98.32 760c2dd8 ...

7. UNIX for Dummies Questions & Answers

Need help combining txt files w/ multiple lines into csv single cell - also need data merge

:confused:Hello -- i just joined the forums. I am a complete noob -- only about 1 week into learning how to program anything... and starting with linux. I am working in Linux terminal. I have a folder with a bunch of txt files. Each file has several lines of html code. I want to combine...

8. Shell Programming and Scripting

Splitting a file into chunks of 1TB

Hi I have a file with different filesystems with there sizes. I need to split them in chucks of 1TB. The file looks like vf_MTLHQNASF07_Wkgp2 187428400 10601AW1 vf_MTLHQNASF07_Wkgp2 479504596 10604AW1 vf_MTLHQNASF07_Wkgp2 19940 10605AID vf_MTLHQNASF07_Wkgp2 1242622044...

9. Red Hat

Free() corrupted unsorted chunks

We are migrating Pro*C code from SOLARIS to LINUX-Redhat. While migrating we face memory de-allocation issue intermittently when accessing large volume of data. Below is the part of the code(since code is big I am putting the part of the code where the issue comes): ...

10. Shell Programming and Scripting

Move files from one directory to another in chunks

All, I have an application that is not working properly and the company is 'in the process' of fixing it. In the meantime, I want to write a bash script work-around. However, what I thought was going to be simple is seemingly not. Need: - Move files from one directory to another in...

LEARN ABOUT CENTOS

funjoin

funjoin(1)							SAORD Documentation							funjoin(1)

NAME

       funjoin - join two or more FITS binary tables on specified columns

SYNOPSIS

       funjoin [switches] <ifile1> <ifile2> ... <ifilen> <ofile>

OPTIONS

	 -a  cols	      # columns to activate in all files
	 -a1 cols ... an cols # columns to activate in each file
	 -b  'c1:bvl,c2:bv2'  # blank values for common columns in all files
	 -bn 'c1:bv1,c2:bv2'  # blank values for columns in specific files
	 -j  col	      # column to join in all files
	 -j1 col ... jn col   # column to join in each file
	 -m min 	      # min matches to output a row
	 -M max 	      # max matches to output a row
	 -s		      # add 'jfiles' status column
	 -S col 	      # add col as status column
	 -t tol 	      # tolerance for joining numeric cols [2 files only]

DESCRIPTION

       funjoin joins rows from two or more (up to 32) FITS Binary Table files, based on the values of specified join columns in each file. NB: the
       join columns must have an index file associated with it. These files are generated using the funindex program.

       The first argument to the program specifies the first input FITS table or raw event file. If "stdin" is specified, data are read from the
       standard input.	Subsequent arguments specify additional event files and tables to join.  The last argument is the output FITS file.

       NB: Do not use Funtools Bracket Notation to specify FITS extensions and row filters when running funjoin or you will get wrong results.
       Rows are accessed and joined using the index files directly, and this bypasses all filtering.

       The join columns are specified using the -j col switch (which specifies a column name to use for all files) or with -j1 col1, -j2 col2,
       ... -jn coln switches (which specify a column name to use for each file). A join column must be specified for each file.  If both -j col
       and -jn coln are specified for a given file, then the latter is used. Join columns must either be of type string or type numeric; it is
       illegal to mix numeric and string columns in a given join.  For example, to join three files using the same key column for each file, use:

	 funjoin -j key in1.fits in2.fits in3.fits out.fits

       A different key can be specified for the third file in this way:

	 funjoin -j key -j3 otherkey in1.fits in2.fits in3.fits out.fits

       The -a "cols" switch (and -a1 "col1", -a2 "cols2" counterparts) can be used to specify columns to activate (i.e. write to the output
       file) for each input file. By default, all columns are output.

       If two or more columns from separate files have the same name, the second (and subsequent) columns are renamed to have an underscore and a
       numeric value appended.

       The -m min and -M max switches specify the minimum and maximum number of joins required to write out a row. The default minimum is 0
       joins (i.e. all rows are written out) and the default maximum is 63 (the maximum number of possible joins with a limit of 32 input files).
       For example, to write out only those rows in which exactly two files have columns that match (i.e. one join):

	 funjoin -j key -m 1 -M 1 in1.fits in2.fits in3.fits ... out.fits

       A given row can have the requisite number of joins without all of the files being joined (e.g. three files are being joined but only two
       have a given join key value). In this case, all of the columns of the non-joined file are written out, by default, using blanks (zeros or
       NULLs).	The -b c1:bv1,c2:bv2 and -b1 'c1:bv1,c2:bv2' -b2 'c1:bv1,c2 - bv2' ...  switches can be used to set the blank value for columns
       common to all files and/or columns in a specified file, respectively. Each blank value string contains a comma-separated list of col-
       umn:blank_val specifiers.  For floating point values (single or double), a case-insensitive string value of "nan" means that the IEEE NaN
       (not-a-number) should be used. Thus, for example:

	 funjoin -b "AKEY:???" -b1 "A:-1" -b3 "G:NaN,E:-1,F:-100" ...

       means that a non-joined AKEY column in any file will contain the string "???", the non-joined A column of file 1 will contain a value of
       -1, the non-joined G column of file 3 will contain IEEE NaNs, while the non-joined E and F columns of the same file will contain values
       -1 and -100, respectively. Of course, where common and specific blank values are specified for the same column, the specific blank value
       is used.

       To distinguish which files are non-blank components of a given row, the -s (status) switch can be used to add a bitmask column named
       "JFILES" to the output file. In this column, a bit is set for each non-blank file composing the given row, with bit 0 corresponds to the
       first file, bit 1 to the second file, and so on. The file names themselves are stored in the FITS header as parameters named JFILE1,
       JFILE2, etc.  The -S col switch allows you to change the name of the status column from the default "JFILES".

       A join between rows is the Cartesian product of all rows in one file having a given join column value with all rows in a second file having
       the same value for its join column and so on. Thus, if file1 has 2 rows with join column value 100, file2 has 3 rows with the same value,
       and file3 has 4 rows, then the join results in 2*3*4=24 rows being output.

       The join algorithm directly processes the index file associated with the join column of each file. The smallest value of all the current
       columns is selected as a base, and this value is used to join equal-valued columns in the other files. In this way, the index files are
       traversed exactly once.

       The -t tol switch specifies a tolerance value for numeric columns.  At present, a tolerance value can join only two files at a time.  (A
       completely different algorithm is required to join more than two files using a tolerance, somethng we might consider implementing in the
       future.)

       The following example shows many of the features of funjoin. The input files t1.fits, t2.fits, and t3.fits contain the following columns:

	 [sh] fundisp t1.fits
	       AKEY    KEY	A      B
	----------- ------ ------ ------
		aaa	 0	0      1
		bbb	 1	3      4
		ccc	 2	6      7
		ddd	 3	9     10
		eee	 4     12     13
		fff	 5     15     16
		ggg	 6     18     19
		hhh	 7     21     22

       fundisp t2.fits
	       AKEY    KEY	C      D
	----------- ------ ------ ------
		iii	 8     24     25
		ggg	 6     18     19
		eee	 4     12     13
		ccc	 2	6      7
		aaa	 0	0      1

       fundisp t3.fits
	       AKEY    KEY	  E	   F	       G ------------ ------ -------- -------- -----------
		ggg	 6	 18	  19	  100.10
		jjj	 9	 27	  28	  200.20
		aaa	 0	  0	   1	  300.30
		ddd	 3	  9	  10	  400.40

       Given these input files, the following funjoin command:

	 funjoin -s -a1 "-B" -a2 "-D" -a3 "-E" -b 
	 "AKEY:???" -b1 "AKEY:XXX,A:255" -b3 "G:NaN,E:-1,F:-100" 
	 -j key t1.fits t2.fits t3.fits foo.fits

       will join the files on the KEY column, outputting all columns except B (in t1.fits), D (in t2.fits) and E (in t3.fits), and setting blank
       values for AKEY (globally, but overridden for t1.fits) and A (in file 1) and G, E, and F (in file 3).  A JFILES column will be output to
       flag which files were used in each row:

	       AKEY    KEY	A	AKEY_2	KEY_2	   C	   AKEY_3  KEY_3	F	    G	JFILES
	 ------------ ------ ------ ------------ ------ ------ ------------ ------ -------- ----------- --------
		aaa	 0	0	   aaa	    0	   0	      aaa      0	1      300.30	     7
		bbb	 1	3	   ???	    0	   0	      ???      0     -100	  nan	     1
		ccc	 2	6	   ccc	    2	   6	      ???      0     -100	  nan	     3
		ddd	 3	9	   ???	    0	   0	      ddd      3       10      400.40	     5
		eee	 4     12	   eee	    4	  12	      ???      0     -100	  nan	     3
		fff	 5     15	   ???	    0	   0	      ???      0     -100	  nan	     1
		ggg	 6     18	   ggg	    6	  18	      ggg      6       19      100.10	     7
		hhh	 7     21	   ???	    0	   0	      ???      0     -100	  nan	     1
		XXX	 0    255	   iii	    8	  24	      ???      0     -100	  nan	     2
		XXX	 0    255	   ???	    0	   0	      jjj      9       28      200.20	     4

SEE ALSO

       See funtools(7) for a list of Funtools help pages

version 1.4.2							  January 2, 2008							funjoin(1)

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Compare two csv files by two colums and create third file combining data from them.

Discussion started by: agb2008

2. Shell Programming and Scripting

remove chunks of text from file

Discussion started by: staze

3. Shell Programming and Scripting

Combining header and data and send email without usage of temp file

Discussion started by: sureshg_sampat

4. Shell Programming and Scripting

Parsing chunks of text and finding data

Discussion started by: erick_tuk