Want to extract certain lines from big file


 
# 8  
Old 01-24-2016
Extracting the first transaction only.
Code:
perl -ne 'print if /^##transaction\b/ .. /EOT$/; last if /EOT$/' mad_man.example

Code:
##transaction, , , ,blah, blah
%%blah~trannum~blah~blah~blah
0000content01
0001content02
.
.
0010contentnn
0000EOT

---------- Post updated at 09:21 PM ---------- Previous update was at 08:54 PM ----------

Extract the first two transactions:
Code:
perl -ne 'print if /^##transaction\b/ .. /EOT$/; /EOT$/ and ++$n; last if $n==2' mad_man.example

Code:
##transaction, , , ,blah, blah
%%blah~trannum~blah~blah~blah
0000content01
0001content02
.
.
0010contentnn
0000EOT
##transaction, , , ,blah, blah
%%blah~trannum~blah~blah~blah
0000content03
0001content04
.
.
0010contentnn
0000EOT

Extract only the second transaction:

Code:
perl -ne 'if(/^##transaction\b/ .. /EOT$/){ print if $n==1; /EOT$/ and ++$n }; last if $n==2' mad_man.example

Code:
##transaction, , , ,blah, blah
%%blah~trannum~blah~blah~blah
0000content03
0001content04
.
.
0010contentnn
0000EOT


Extracting any transaction by using a variable (the count starts at 0, so t=2 selects the third transaction):

Code:
export t=2; perl -ne 'if(/^##transaction\b/ .. /EOT$/){ print if $n==$ENV{t}; /EOT$/ and ++$n }; last if $n==$ENV{t}+1' mad_man.example

Code:
##transaction, , , ,blah, blah
%%blah~transnum~blah~blah~blah
0000content05
0001content06
.
.
0010contentnn
0000EOT

Extract last transaction:

Code:
perl -ne '/^##transaction\b/ and @t=(); push @t, $_ if /^##transaction\b/ .. /EOT$/; END{print @t}' mad_man.example

Code:
##transaction, , , ,blah, blah
%%blah~transnum~blah~blah~blah
0000content05
0001content06
.
.
0010contentnn
0000EOT


Last edited by Aia; 01-24-2016 at 12:40 AM..
# 9  
Old 01-24-2016
Hi Don,

The transnum is alphanumeric, with no special characters, and it will always be 19 characters long.

The maximum number of characters in a line is limited to 1600, and a transaction set can run to about 4000 to 5000 characters.

---------- Post updated at 10:18 AM ---------- Previous update was at 10:11 AM ----------

Hi Aia,

Thanks, I am going to try all your solutions too and will update here about it.

Thanks.
# 10  
Old 01-24-2016
Hello mad_man,

In your first post you mentioned:
Quote:
I want to copy the ##transaction to till the next EOT for the particular transnum.
However, all the transnum values are the same in your example. How would you choose a particular transnum? In what way are they different in your file?
# 11  
Old 01-24-2016
Hi Aia,

The transnum values are alphanumeric and will be unique for each set of transactions.

Thanks.

---------- Post updated at 11:58 AM ---------- Previous update was at 11:36 AM ----------

Hi Aia,

The required transaction set will be decided by the transaction reference number 'transnum', which I will be extracting from another file. As I already explained to Don, this is a large script and the transaction extraction is only one part of it. When the script reaches this section, a variable will be holding the transnum value, and I will use it to take out the particular transaction set. Let me know if you have any other queries.

Thanks

---------- Post updated at 01:44 PM ---------- Previous update was at 11:58 AM ----------

Hi Don,

Thanks for your command

Code:
awk '{p=p $0 RS} /EOT/{if(p~s){printf "%s",p;exit}else p=x}' s="$transnum" $file > $file_new

This worked the way I required. But when I discussed it with my superior, he said awk commands are not allowed by our onsite counterparts, since they cause issues when we upgrade AIX and we then have to fix them again. So any sed or perl equivalent to the above awk would be helpful for me. Kindly help me out.

Thanks.

---------- Post updated at 01:52 PM ---------- Previous update was at 01:44 PM ----------

Hi Aia,

Thanks for your command

Code:
export t=2; perl -ne 'if(/^##transaction\b/ .. /EOT$/){ print if $n==$ENV{t}; /EOT$/ and ++$n }; last if $n==$ENV{t}+1' mad_man.example

This is not working. I exported the value of transnum to the variable t. The output file doesn't have the required output.

Please find below one of the existing inline perl commands we use. If you give me your command in the same format, it will be helpful.

Code:
/usr/local/perl/bin/perl -e '$record = $ENV{"record"};' -e '@fields=split(/~/,$record);' -e '$req_flag=uc $fields[29];' -e 'print "$req_flag\n";' > /tmp/$file_name

The above code takes an exported value that is tilde-separated, picks out $fields[29] (the 30th field, since Perl arrays start at index 0), converts it to upper case, and writes it to a temp file. This is just sample code; I pasted it here to show you the style of our existing code. I now depend purely on perl or sed, so kindly help me.

Thanks
# 12  
Old 01-24-2016
How about
Code:
sed -n '/~transnum~/ {H;g}; /~transnum~/,/EOT/p;h' file

# 13  
Old 01-24-2016
Hi RudiC

I am getting a "cannot be parsed" error for this sed command.
Please find below how I used it.
transnum="ABC160120XYZ0983921"

Code:
sed -n '/"$transnum"/ {H;g}; /"$transnum"/,/EOT/p;h' $file > $file_new

please suggest
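
One thing that stands out in the command as used: inside single quotes the shell does not expand $transnum, so sed searches for the literal text "$transnum" (quotes included). A sketch of a variant with the sed script in double quotes follows. The sample data is invented, the surrounding tildes are kept from the original suggestion, and the extra ; before the closing } is only a guess at what a stricter sed might want for parsing, not something confirmed in this thread.

```shell
#!/bin/sh
# Hypothetical sample data standing in for the real big file.
workdir=$(mktemp -d)
file="$workdir/bigfile.txt"
file_new="$workdir/extracted.txt"
cat > "$file" <<'EOF'
##transaction, , , ,blah, blah
%%blah~ABC160120XYZ0983920~blah~blah~blah
0000content01
0000EOT
##transaction, , , ,blah, blah
%%blah~ABC160120XYZ0983921~blah~blah~blah
0000content03
0000EOT
EOF

transnum="ABC160120XYZ0983921"

# Double quotes let the shell substitute $transnum before sed sees the script.
# The hold space always carries the previous line, so when the transnum line
# is reached, H;g glues the preceding ##transaction line in front of it.
sed -n "/~$transnum~/{H;g;}; /~$transnum~/,/EOT/p; h" "$file" > "$file_new"

cat "$file_new"
```

This relies on the transnum containing no characters that are special to sed, which matches the statement earlier in the thread that it is purely alphanumeric.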
# 14  
Old 01-24-2016
It sounds like I have wasted the last hour of my life trying to help you, but maybe this will help someone else. The following awk script only uses POSIX specified awk features and should work on any system (although you would need to change awk to /usr/xpg4/bin/awk or nawk if and only if you want to run this on a Solaris/SunOS system). It takes two files as inputs (which is what you said you had earlier). The first file (named trannums in this script) contains one or more lines with each line containing a transaction number to be extracted from your big file. The second file (named bigfile in this script) contains your big file containing transactions. It extracts each transaction listed in trannums into a separate output file with a name that is the string TX: followed by the transaction number:
Code:
#!/bin/ksh
awk -F '~' '
FNR == NR {
	# Gather transaction numbers...
	t[$1]
	tc = FNR
	next
}
{	# Gather transaction lines.
	l[++lc] = $0
}
$1 == "%%YEDTRN" && $2 in t {
	# We have found a transaction number for a transaction that is to be
	# extracted.  Save the transaction number and remove this transaction
	# from the remaining transaction list.
	delete t[transnum = $2]
	tc--
}
$1 == "0000EOT" {
	# If we have a transaction that is to be printed, print it.
	if(transnum) {
		# Print the transaction.
		for(i = 1; i <= lc; i++)
			print l[i] > ("TX:" transnum)
		close("TX:" transnum)
		printf("Transaction #%s extracted to file TX:%s\n", transnum,
		    transnum)
	}
	# Was this the last remaining transaction to be extracted?
	if(tc) {# No.  Reset for next transaction.
		lc = 0
		transnum = ""
	} else {# Yes.  Exit.
		exit
	}
}' trannums bigfile
