How to paste multiple files in parallel?


 
# 1  
Old 03-28-2019

Hi all,

I am trying to paste thousands of files together into a matrix. Each file has only one column, and all the files have the same number of rows (~27k). I tried
paste * > output as well as several for loops,
but the output contains only the columns from the first and last files. The format of the files is as follows; each file starts with a header line that is identical to the file name:
Code:
12345
0.0
0.0
0.0

...

Please help!


Moderator's Comments:
Mod Comment Please use CODE tags as required by forum rules!

Last edited by RudiC; 03-28-2019 at 05:48 PM.. Reason: Added CODE tags.
# 2  
Old 03-28-2019
Welcome to the forum.


I can't reproduce the behaviour you describe. Please show what happens / where it fails.
# 3  
Old 03-28-2019
Thank you.
Please see below for my attempt to merge the three .tsv files, which failed.

Code:
-bash-4.2$ ll
total 2692
-rwxrwx--- 1 chenx302 1000000 136602 Mar 25 16:24 235423.tsv
-rwxrwx--- 1 chenx302 1000000 136587 Mar 25 16:24 263428.tsv
-rwxrwx--- 1 chenx302 1000000 136597 Mar 25 16:25 291417.tsv
-rwxrwx--- 1 kelkay01 1000000     12 Mar 28 10:50 f1.txt
-rwxrwx--- 1 kelkay01 1000000     12 Mar 28 10:50 f2.txt
-rwxrwx--- 1 kelkay01 1000000     12 Mar 28 10:50 f3.txt
-rwxrwx--- 1 chenx302 1000000 393561 Mar 21 11:16 geneID
-rwxrwx--- 1 chenx302 1000000 409786 Mar 27 18:35 new
-bash-4.2$
-bash-4.2$ paste *tsv | head
235423  291417
0.0     0.0
0.0     0.0
0.0     0.0
0.0     0.0
0.0     0.0
0.0     0.0
0.0     0.0
0.0     0.0
0.0     0.0
 -bash-4.2$




Moderator's Comments:
Mod Comment Please use CODE tags as required by forum rules!

Last edited by RudiC; 03-29-2019 at 04:57 AM.. Reason: Added CODE tags.
# 4  
Old 03-29-2019
Run the command again with the shell's -x (xtrace) option set (e.g. set -x), to see how the shell expands / interprets your command.
Run the command paste 235423.tsv 263428.tsv 291417.tsv
Run file *.tsv and post the output.
Run od -tx1c 235423.tsv (and other files) and post (a reasonable part of) the outputs.
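
For anyone following along, the four checks above could be run roughly as sketched below. The files sample1.tsv and sample2.tsv are made-up stand-ins created in a scratch directory, not the poster's data, and the file step is skipped if that utility is not installed.

```shell
# Scratch directory with two tiny hypothetical files that use DOS line endings
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '12345\r\n0.0\r\n' > sample1.tsv
printf '67890\r\n0.0\r\n' > sample2.tsv

# 1. Trace how the shell expands the glob (the trace goes to stderr)
set -x
paste *.tsv | head
set +x

# 2. The same command with the file names spelled out
paste sample1.tsv sample2.tsv | head

# 3. Let file(1) guess the file type, if it is available
if command -v file >/dev/null 2>&1; then
    file *.tsv
fi

# 4. Hex and character dump of the first two lines of one file
od_dump=$(head -n 2 sample1.tsv | od -tx1c)
printf '%s\n' "$od_dump"
```

If the dump shows 0d / \r in front of every 0a / \n, the file has DOS line endings.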
# 5  
Old 03-29-2019
I agree with RudiC that we need to see the first few lines of your input files. From the output you have shown us, it would seem that the most crucial would be:
Code:
for file in *.tsv
do	echo "File: $file:"
	head "$file" | od -tx1c
done

Since you haven't bothered to tell us what operating system you're using, if od complains about unknown options, try od -bc instead of od -tx1c.

Before we see the output from the above commands, would anyone care to guess which of these files have DOS <CR><LF> line separators instead of UNIX line terminators? Unfortunately, even if this is the problem, I'm not seeing the output I would have expected.
This User Gave Thanks to Don Cragun For This Post:
# 6  
Old 03-29-2019
Hi Don,

Thanks for taking a look at this. Below is the output from the code:
Code:
File: 235423.tsv:
0000000  32  33  35  34  32  33  0d  0a  30  2e  30  0d  0a  30  2e  30
          2   3   5   4   2   3  \r  \n   0   .   0  \r  \n   0   .   0
0000020  0d  0a  30  2e  30  0d  0a  30  2e  30  0d  0a  30  2e  30  0d
         \r  \n   0   .   0  \r  \n   0   .   0  \r  \n   0   .   0  \r
0000040  0a  30  2e  30  0d  0a  30  2e  30  0d  0a  30  2e  30  0d  0a
         \n   0   .   0  \r  \n   0   .   0  \r  \n   0   .   0  \r  \n
0000060  30  2e  30  0d  0a
          0   .   0  \r  \n
0000065
File: 263428.tsv:
0000000  32  36  33  34  32  38  0d  0a  30  2e  30  0d  0a  30  2e  30
          2   6   3   4   2   8  \r  \n   0   .   0  \r  \n   0   .   0
0000020  0d  0a  30  2e  30  0d  0a  30  2e  30  0d  0a  30  2e  30  0d
         \r  \n   0   .   0  \r  \n   0   .   0  \r  \n   0   .   0  \r
0000040  0a  30  2e  30  0d  0a  30  2e  30  0d  0a  30  2e  30  0d  0a
         \n   0   .   0  \r  \n   0   .   0  \r  \n   0   .   0  \r  \n
0000060  30  2e  30  0d  0a
          0   .   0  \r  \n
0000065
File: 291417.tsv:
0000000  32  39  31  34  31  37  0d  0a  30  2e  30  0d  0a  30  2e  30
          2   9   1   4   1   7  \r  \n   0   .   0  \r  \n   0   .   0
0000020  0d  0a  30  2e  30  0d  0a  30  2e  30  0d  0a  30  2e  30  0d
         \r  \n   0   .   0  \r  \n   0   .   0  \r  \n   0   .   0  \r
0000040  0a  30  2e  30  0d  0a  30  2e  30  0d  0a  30  2e  30  0d  0a
         \n   0   .   0  \r  \n   0   .   0  \r  \n   0   .   0  \r  \n
0000060  30  2e  30  0d  0a
          0   .   0  \r  \n
0000065
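
The dumps above show \r \n at the end of every line. On most terminals a carriage return moves the cursor back to column 0, so when paste joins such lines, each later column is drawn on top of the previous one, which would explain why only the first and last columns appear on screen. A minimal sketch of this, with made-up files a.tsv, b.tsv and c.tsv rather than the real data:

```shell
# Scratch directory with three tiny hypothetical CRLF files
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '111\r\n0.0\r\n' > a.tsv
printf '222\r\n0.0\r\n' > b.tsv
printf '333\r\n0.0\r\n' > c.tsv

# paste keeps all three columns; nothing is actually dropped
first_line=$(paste a.tsv b.tsv c.tsv | head -n 1)

# od makes the embedded \r bytes visible, which a terminal would not
printf '%s\n' "$first_line" | od -c
```

The middle column is present in the byte stream; it is only the on-screen rendering that hides it.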


Last edited by RudiC; 03-29-2019 at 06:59 PM..
# 7  
Old 03-29-2019
What happens if you try:
Code:
for i in *.tsv
do	tr -d '\r' < "$i" > "$i.nocr"
done
paste *.nocr | head
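
If the intermediate .nocr files are not wanted, one possible variant is to strip the carriage returns from the combined stream instead. This is only a sketch: the file names and the output name matrix.txt are hypothetical, and it assumes no \r characters need to be preserved anywhere in the data.

```shell
# Scratch directory with three tiny hypothetical CRLF files
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '111\r\n0.0\r\n' > a.tsv
printf '222\r\n0.0\r\n' > b.tsv
printf '333\r\n0.0\r\n' > c.tsv

# Paste first, then delete every carriage return from the joined output
paste *.tsv | tr -d '\r' > matrix.txt

# Each row now holds one tab-separated value per input file
cat matrix.txt
```

The output name deliberately does not end in .tsv, so rerunning the command does not pick up the result as an input.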

This User Gave Thanks to Don Cragun For This Post: