Extracting values based on line-column numbers from multiple text files


 
# 1  
Old 11-21-2015

Dear All,
I need to solve the following problems with multiple tab-separated text files, but I don't know how. Any help would be greatly appreciated. I have access to Linux Mint (but I am not an experienced user).
I have multiple tab-delimited files with the following structure:
file1:
Code:
1    44
2    56  46
3    90  56  43
4    32  32  23  49
5    43  56  45  34  48
6    35  32  56  41  42  43
7    38  32  23  12  56  32  67
8    78  32  45  45  12  45  45  45
9    23  45  45  34  43  34  12  23  234
10   45  62  56  44  77  54  33  78  890  432
11   12  43  21  34  54  43  56  54  543  546  65
.
.

I have to do the following two tasks with these files:
Task1:
I want to pick a specific value (identified by line and column number) from each file and write it to another file (say Task1.txt). Task1.txt lists, for each file, the line and column of the value I want; the value itself should be added in the next column of that file.
Task1.txt
Code:
FileName    lineNumber    ColumnNumber    value
File1       5             4               45
File2       7             6               ?
File3       10            9               ?
File4       67            55              ?

Task2:
For the same files, I want to pick all values of the second column for a range of lines and put them into the Task2.txt file. Here is the Task2.txt file:
Task2.txt
Code:
FileName    Line-start    line-end
File1       4             8           32    43    35    38    78
File2       67            80
File3       345           380
File4       48            54

I would appreciate your help in solving these issues.
Regards
# 2  
Old 11-21-2015
Any ideas/thoughts/attempts from your side?

---------- Post updated at 13:12 ---------- Previous update was at 13:10 ----------

Is there always just one single line for one single file in task1 / task2?
# 3  
Old 11-21-2015
Thank you for your reply. Actually, I am something of a newbie to Linux.
Yes, in the Task1 and Task2 files there is always a single line for each file.
Regards
# 4  
Old 11-21-2015
First approximation:
Code:
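awk '
# First line of each input file: count the file and remember its name.
FNR == 1        {FC++
                 FN = FILENAME
                 # Files 1 and 2 are the task files: copy the header to <name>.new.
                 if (FC < 3)    {OUT[FC] =  FN ".new"
                                 print > OUT[FC]
                                 next
                                }
                 # Data file: start its output rows with the stored task lines.
                 printf "%s", LIN[FN,1] > OUT[1]
                 printf "%s", LIN[FN,2] > OUT[2]
                }
# Remaining task-file lines: store the line and its two numbers, keyed by file name.
FC < 3          {LIN[$1,FC] = $0
                 BEG[$1,FC] = $2
                 STP[$1,FC] = $3
                 next
                }

# Task 1: on the requested line, print the requested column.
FNR == BEG[FN,1]        {printf "\t%s\n", $STP[FN,1] > OUT[1]
                        }
# Task 2: print column 2 for every line in the requested range.
FNR == BEG[FN,2], \
FNR == STP[FN,2]        {printf "\t%s", $2 > OUT[2]
                        }
FNR == STP[FN,2]        {printf "\n"  > OUT[2]
                        }
'  task[12] file1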
task1.new:
FileName    lineNumber    ColumnNumber    value    
file1    5    4    45
task2.new:
FileName    Line-start    line-end
file1    4    8    32    43    35    38    78
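
For a single file, the two lookups can also be tried in isolation. The following is only a minimal sketch, not RudiC's script: the helper names pick_value and pick_range are made up for illustration, and it assumes the data file is strictly tab-separated with the line, column and range passed as arguments instead of being read from a task file.

```shell
# pick_value FILE LINE COLUMN: print one field of a tab-separated file.
# pick_range FILE FIRST LAST: print column 2 of lines FIRST..LAST, tab-joined.
# Both helper names are hypothetical, chosen for this sketch.
pick_value() {
    awk -F'\t' -v line="$2" -v col="$3" 'FNR == line { print $col; exit }' "$1"
}

pick_range() {
    awk -F'\t' -v first="$2" -v last="$3" '
        FNR >= first && FNR <= last { printf "%s%s", sep, $2; sep = "\t" }
        FNR == last                 { exit }
        END                         { print "" }
    ' "$1"
}

# Demo on the first five rows of file1 from post #1.
sample=$(mktemp)
printf '1\t44\n2\t56\t46\n3\t90\t56\t43\n4\t32\t32\t23\t49\n5\t43\t56\t45\t34\t48\n' > "$sample"
pick_value "$sample" 5 4    # prints 45
pick_range "$sample" 2 4    # prints 56, 90 and 32 separated by tabs
rm -f "$sample"
```

Both helpers stop reading as soon as the requested line has been seen, which may matter for large files.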

# 5  
Old 11-22-2015
You are great, RudiC! It works nicely.
Many thanks for your valuable help.
# 6  
Old 11-22-2015
This version doesn't need the data files supplied on the command line; instead it tries to find the files named in task1 or task2:
Code:
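awk '
# First line of each input file: count the file and remember its name.
FNR == 1        {FC++
                 FN = FILENAME
                 # Task files (1 and 2): copy the header, without newline, to <name>.new.
                 if (FC < 3)    {OUT[FC] =  FN ".new"
                                 printf "%s", $0 > OUT[FC]
                                 next
                                }
                 # Data file: end the previous row and start a new one per task.
                 printf "\n%s", LIN[FN,1] > OUT[1]
                 printf "\n%s", LIN[FN,2] > OUT[2]
                }
FC < 3          {LIN[$1,FC]   = $0
                 BEG[$1,FC]   = $2
                 STP[$1,FC]   = $3
                 # Queue the named file as input unless it is already in ARGV
                 # or cannot be read.
                 for (i=0; i<=ARGC && $1 != ARGV[i]; i++);
                 if  ((i > ARGC) && (1 == (getline X < $1))) ARGV[ARGC++] = $1
                 close ($1)
                 next
                }

# Task 1: on the requested line, print the requested column.
FNR == BEG[FN,1]        {printf "\t%s", $STP[FN,1] > OUT[1]
                        }

# Task 2: L flags that the current line lies inside the requested range.
FNR == BEG[FN,2]        {L = 1
                        }
L                       {printf "\t%s", $2 > OUT[2]
                        }
FNR == STP[FN,2]        {L = 0
                        }

# Terminate the last row of each output file.
END                     {printf "\n" > OUT[1]
                         printf "\n" > OUT[2]
                        }
'  task[12]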
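
The part that finds the data files relies on appending to ARGV while awk is still reading an earlier input. Stripped of the task logic, that idea might look like the sketch below. The helper name read_listed is hypothetical, the list file is assumed to hold one file name per line without embedded whitespace, and, unlike the script above, it does not check whether a name is already in ARGV.

```shell
# read_listed LISTFILE: hypothetical helper. LISTFILE holds one file name per
# line (no whitespace in names). Readable, non-empty files are appended to ARGV
# while the list is being read, so awk processes them afterwards as inputs.
read_listed() {
    awk '
        # While reading the list: probe each name with getline, queue it on success.
        NR == FNR { if ((getline probe < $1) > 0)
                        ARGV[ARGC++] = $1
                    close($1)
                    next
                  }
        # Lines of the queued files: show where each one came from.
                  { print FILENAME ": " $0 }
    ' "$1"
}

# Demo: two data files and a list naming them plus one missing file.
dir=$(mktemp -d)
printf 'a\n' > "$dir/x"
printf 'b\n' > "$dir/y"
printf '%s\n' "$dir/x" "$dir/missing" "$dir/y" > "$dir/list"
read_listed "$dir/list"
rm -rf "$dir"
```

The missing file is skipped silently because the getline probe fails, so awk never tries to open it as an input.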
