Copying part of a data file into another

# 1  
Old 07-07-2013
Copying part of a data file into another

Hi,
I have a large number of data files, each containing integers from 1 to around 25000 in ascending order. The sequences are not complete, however; some numbers are missing from each file.
For example, datfile1 may have the numbers 1 2 4 6 7 8 12 ... 24996 24999
while datfile2 may have 1 3 5 6 8 9 ... 24999 25000

For each data file, I want to create a new one which has only the integers greater than 16000.
For example, newdatfile1 may have the numbers 16001 16003 16004 ... 24996 24999

The problem is that I know very little about UNIX, so any help would be highly appreciated.
# 2  
Old 07-07-2013
Is this a homework assignment? If not, what is creating your input files?

What shell are you using?

Are all of the integers on a single line in your input files?
# 3  
Old 07-07-2013
This is not a homework assignment. The integers are the number of days after a certain date, and data is not available for some days, so those days are missing. It's actually for a project I'm working on: I am creating a regression model from the data using ampl. Now I would like to analyse just the data after a certain date, so I thought there must be an easy way to separate out the data I want using UNIX.

I am using bash, and there is one integer per line.
# 4  
Old 07-07-2013
Here are a couple of ways to do it:
Code:
#!/bin/bash
echo 'processing datfile1 using while loop:'
while read x
do      if [ "$x" -gt 16000 ]
        then    printf "%d\n" "$x"
        fi
done < datfile1
echo 'processing datfile2 using awk:'
awk '$1 > 16000' datfile2

If datfile1 contains:
Code:
1
2
4
6
7       
8       
12
24996
24999

and datfile2 contains:
Code:
1
3
5
6
8       
9       
12
24999
25000

the output produced is:
Code:
processing datfile1 using while loop:
24996
24999
processing datfile2 using awk:
24999
25000

If you are running this on a Solaris system, use /usr/xpg4/bin/awk, /usr/xpg6/bin/awk, or nawk instead of /bin/awk or /usr/bin/awk.
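
As a rough sketch of the Solaris note above, a script could pick which awk to run when it starts. The function name pick_awk and the variable AWK are made up for illustration; the only path assumed is /usr/xpg4/bin/awk, as named above.

```shell
#!/bin/sh
# Print the awk to use: prefer the XPG4 version when it exists
# (as on Solaris), otherwise fall back to the first awk found in PATH.
pick_awk() {
    if [ -x /usr/xpg4/bin/awk ]
    then    printf '%s\n' /usr/xpg4/bin/awk
    else    printf '%s\n' awk
    fi
}

AWK=$(pick_awk)

# Same filter as in the script above, fed two sample values.
printf '%s\n' 15999 16001 | "$AWK" '$1 > 16000'
```

On a system without /usr/xpg4/bin/awk this simply runs whatever awk is first in PATH, so the rest of the script does not need to change.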

Although this was written using bash (as you requested), this script will work with any shell that recognizes basic Bourne shell syntax.
This User Gave Thanks to Don Cragun For This Post:
latsyrc (07-07-2013)
# 5  
Old 07-07-2013
Thank you so much. But does that also create a new file and write all the "good" data into it without overwriting?
# 6  
Old 07-07-2013
Somebody here could probably do the same thing more easily in ruby, perl, or awk, but this should at least get you started. Pass the cutoff value (for example 16000) as the first argument:
Code:
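#! /bin/bash
# Usage: pass the cutoff as the first argument, e.g. 16000
int=$1

PATH=/usr/bin:/bin:/usr/sbin:/sbin

cd /dir || exit 1
for datfile in datfile*
do
   # Sort numerically in case the input is not already in order.
   sort -n "$datfile" > "/tmp/${datfile}.tmp"
   # Line number of the first value greater than the cutoff.
   # (The cutoff itself may be missing from the file, so do not grep for it.)
   start=$(awk -v min="$int" '$1 > min {print NR; exit}' "/tmp/${datfile}.tmp")
   # Copy from that line to the end of the file; skip files with no such value.
   [ -n "$start" ] && sed -n "${start},\$p" "/tmp/${datfile}.tmp" > "/dir/${datfile}.${int}"
   rm -f "/tmp/${datfile}.tmp"
done

exit 0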

This User Gave Thanks to Just Ice For This Post:
latsyrc (07-07-2013)
# 7  
Old 07-07-2013
Quote:
Originally Posted by latsyrc
Thank you so much. But does that also create a new file and write all the "good" data into it without overwriting?
The scripts shown just write the extracted data to standard output. The original data (datfile1 and datfile2) are not modified. If you want to save the output in files, try something like:
Code:
#!/bin/bash
echo 'processing datfile1 using while loop:'
while read x
do      if [ "$x" -gt 16000 ]
        then    printf "%d\n" "$x"
        fi
done < datfile1 > datfile1.mod
echo 'processing datfile2 using awk:'
awk '$1 > 16000' datfile2 > datfile2.mod

And then look at the contents of the files datfile1.mod and datfile2.mod after it finishes. You could also pipe the output into another stream of processes to do whatever you want with the data without creating temporary files, replace the data in the original files, or millions of other choices depending on what you want to do. If you tell us what you are trying to do, we might be able to suggest saving the output in a manner that would be appropriate for your situation.
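
Since the original question mentions a large number of data files and asks about not overwriting anything, here is a minimal sketch of one way to loop over all of them. The function name filter_files is made up for illustration, the new<name> output naming follows the newdatfile1 example from the first post, and it assumes the script is run in the directory that holds the data files (plain file names, no paths).

```shell
#!/bin/sh
# filter_files MIN FILE...
# For each FILE, write the lines whose first field is greater than MIN
# to new<FILE> in the current directory. The input files are only read,
# and an output file that already exists is left alone.
filter_files() {
    min=$1
    shift
    for f in "$@"
    do
        out="new$f"
        if [ -e "$out" ]
        then    echo "skipping $f: $out already exists" >&2
                continue
        fi
        awk -v min="$min" '$1 > min' "$f" > "$out"
    done
}

# Example call, run from the directory holding the data files:
# filter_files 16000 datfile*
```

Because the output names start with new, the pattern datfile* does not pick them up again if the script is run a second time. On Solaris, the earlier note about which awk to use applies here as well.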
This User Gave Thanks to Don Cragun For This Post:
latsyrc (07-07-2013)