Copying part of a data file into another


 
# 1  

Hi,
I have a large number of data files, each containing integers from 1 to around 25000 in ascending order. The numbers do not follow a fixed progression, though; some are missing from each file.
For example, datfile1 may contain 1 2 4 6 7 8 12 ... 24996 24999
while datfile2 may contain 1 3 5 6 8 9 ... 24999 25000

For each data file, I want to create a new file that contains only the integers greater than 16000.
For example, newdatfile1 may contain 16001 16003 16004 ... 24996 24999

The problem is that I know very little about UNIX, so any help will be highly appreciated.
# 2  
Is this a homework assignment? If not, what is creating your input files?

What shell are you using?

Are all of the integers on a single line in your input files?
# 3  
This is not a homework assignment. The integers are the number of days after a certain date, and data is not available for some days, so some days are missing. It's actually for a project I'm working on: I am building a regression model from the data using AMPL. Now I would like to analyse just the data after a certain date, so I thought there must be an easy way to separate out the data I want using UNIX.

I am using bash, and each file has one integer per line.
# 4  
Here are a couple of ways to do it:
Code:
#!/bin/bash
echo 'processing datfile1 using while loop:'
while read x
do      if [ "$x" -gt 16000 ]
        then    printf "%d\n" "$x"
        fi
done < datfile1
echo 'processing datfile2 using awk:'
awk '$1 > 16000' datfile2

If datfile1 contains:
Code:
1
2
4
6
7       
8       
12
24996
24999

and datfile2 contains:
Code:
1
3
5
6
8       
9       
12
24999
25000

the output produced is:
Code:
processing datfile1 using while loop:
24996
24999
processing datfile2 using awk:
24999
25000

If you are running this on a Solaris system, use /usr/xpg4/bin/awk, /usr/xpg6/bin/awk, or nawk instead of /bin/awk or /usr/bin/awk.

Although this was written using bash (as you requested), this script will work with any shell that recognizes basic Bourne shell syntax.
This User Gave Thanks to Don Cragun For This Post:
# 5  
Thank you so much. But does that also create a new file and write all the "good" data into it without overwriting?
# 6  
Somebody here could probably do the same thing more easily in Ruby, Perl, or awk, but this should at least get you started:
Code:
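#!/bin/bash
# Usage: pass the threshold as the first argument, e.g. 16000
int=$1

PATH=/usr/bin:/bin:/usr/sbin:/sbin

cd /dir || exit 1
for datfile in datfile*
do
   # Sort numerically in case the input is not already in order.
   sort -n "$datfile" > "/tmp/${datfile}.tmp"
   # Line number of the first value greater than $int (empty if there is none).
   # Searching for $int itself would fail whenever that number is missing.
   start=$(awk -v n="$int" '$1 > n + 0 {print NR; exit}' "/tmp/${datfile}.tmp")
   # Copy everything from that line to the end of the file.
   if [ -n "$start" ]
   then
      sed -n "${start},\$p" "/tmp/${datfile}.tmp" > "/dir/${datfile}.${int}"
   fi
   rm -f "/tmp/${datfile}.tmp"
done

exit 0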

This User Gave Thanks to Just Ice For This Post:
# 7  
Quote:
Originally Posted by latsyrc
Thank you so much. But does that also create a new file and write all the "good" data into it without overwriting?
The scripts shown just write the extracted data to standard output. The original data (datfile1 and datfile2) are not modified. If you want to save the output in files, try something like:
Code:
#!/bin/bash
echo 'processing datfile1 using while loop:'
while read x
do      if [ "$x" -gt 16000 ]
        then    printf "%d\n" "$x"
        fi
done < datfile1 > datfile1.mod
echo 'processing datfile2 using awk:'
awk '$1 > 16000' datfile2 > datfile2.mod

And then look at the contents of datfile1.mod and datfile2.mod after it finishes. You could also pipe the output into other commands to process the data without creating temporary files, replace the data in the original files, or choose any of many other options, depending on what you want to do. If you tell us what you are trying to do, we might be able to suggest a way of saving the output that suits your situation.
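
Since the original question was about a large number of files, the awk command can also be wrapped in a loop. What follows is only a minimal sketch, not a tested solution for your data: the function name extract_above is made up for illustration, the .mod suffix is simply reused from the example above, and it assumes one integer per line as described earlier in the thread. It skips any input whose output file already exists, so nothing gets overwritten.

```shell
#!/bin/bash
# extract_above THRESHOLD FILE...   (hypothetical helper name)
# For each FILE, write the lines whose first field is greater than
# THRESHOLD to FILE.mod. Existing .mod files are never overwritten.
extract_above() {
    threshold=$1
    shift
    for f in "$@"
    do
        # Ignore outputs of an earlier run that a pattern like datfile* picks up.
        case $f in
        *.mod)  continue;;
        esac
        out="$f.mod"
        if [ -e "$out" ]
        then
            echo "skipping $f: $out already exists" >&2
            continue
        fi
        # "+ 0" forces a numeric comparison of the first field and the threshold.
        awk -v min="$threshold" '$1 + 0 > min + 0' "$f" > "$out"
    done
}
```

With the function defined, something like `extract_above 16000 datfile*` run in the directory that holds the data would create datfile1.mod, datfile2.mod, and so on, leaving the original files untouched.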
This User Gave Thanks to Don Cragun For This Post: