Top Forums Shell Programming and Scripting missing comma delimeter in columns Post 302602427 by awais290 on Monday 27th of February 2012 10:35:32 AM
Thanks a lot for all your replies. I don't know how many columns each file will have; the 4 columns I mentioned were just an example. I understand the logic now and will try it.

Here is the code I was trying on the data. Now I need to check only the columns. How can I modify the same script to check just the columns? At the moment it checks every line, but I only need to check the line that contains the file's columns (the header line).
Code:
#!/bin/ksh
BASE_DIR=/data/SrcFiles
cd "$BASE_DIR" || exit 1
## find the files in the work directory whose status changed 3 days ago
## (use -ctime -3 instead to match anything changed within the last 3 days)
find . -type f -name "*.csv" -ctime 3 > /home/mydir/flist.txt
## loop through all the file names
while read -r line
do
    ## get only the base name of the file, used for reporting
    FN=`basename "$line"`
    ## DC is the number of distinct per-line field counts in the file
    ## (awk prints NF for every line; sort and uniq collapse the duplicates);
    ## a good file with no missing delimiter has exactly one distinct count.
    ## The file is read via the path find returned, so subdirectories work too.
    DC=`awk -F "," '{print NF}' "$line" | sort | uniq -c | wc -l`
    if [ $DC -ne 1 ]; then
        echo $DC
        echo "$FN" >> /home/mydir/badfile.txt
        ## to also remove the corrupted file, uncomment the next line
        ## rm "$line"
    else
        echo $DC
        echo "$FN" >> /home/mydir/gfile.txt
    fi
done < /home/mydir/flist.txt
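
One possible way to check only the header line, rather than every line, is sketched below. It is not from the thread: `check_header` is a hypothetical helper that compares the field count of the first line with the most common field count among the data rows, since the expected number of columns is not known in advance. It assumes a plain CSV with no quoted fields that contain commas, and it reports a file with no data rows as bad.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not part of the original script):
# prints "ok" when the header line has the same number of comma-separated
# fields as most of the data rows, and "bad" otherwise.
check_header() {
    file=$1
    # Field count of the header (first) line only.
    header_nf=$(awk -F',' 'NR == 1 { print NF; exit }' "$file")
    # Most frequent field count among the data rows (line 2 onward).
    data_nf=$(awk -F',' 'NR > 1 { print NF }' "$file" |
        sort -n | uniq -c | sort -rn | awk 'NR == 1 { print $2 }')
    if [ -n "$data_nf" ] && [ "$header_nf" = "$data_nf" ]; then
        echo ok
    else
        echo bad
    fi
}

# Small demonstration with throwaway sample files.
demo_dir=$(mktemp -d)
printf 'id,name,city\n1,ann,rome\n2,bob,oslo\n' > "$demo_dir/good.csv"
printf 'id,name city\n1,ann,rome\n2,bob,oslo\n' > "$demo_dir/bad.csv"
check_header "$demo_dir/good.csv"   # prints: ok
check_header "$demo_dir/bad.csv"    # prints: bad (header is missing a comma)
rm -rf "$demo_dir"
```

Inside the loop above, the `DC=...` test could be swapped for a call such as `check_header "$line"`, with the result deciding whether the name goes to badfile.txt or gfile.txt.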

