Copying part of a data file into another


 
# 1  
Old 07-07-2013

Hi,
I have a large number of data files each containing simple integers from 1 to around 25000 in ascending order. However, they are not in a specific progression; some numbers are missing in each file.
For ex. datfile1 may have the numbers in order 1 2 4 6 7 8 12 ... 24996 24999
while datfile2 may have them in order 1 3 5 6 8 9 ... 24999 25000

For each data file, I want to create a new one which has only the integers greater than 16000.
For ex. newdatfile1 may have numbers 16001 16003 16004 ... 24996 24999

The problem is I know very little about UNIX. So, any help will be highly appreciated.
# 2  
Old 07-07-2013
Quote:
Originally Posted by latsyrc
Hi,
I have a large number of data files each containing simple integers from 1 to around 25000 in ascending order. However, they are not in a specific progression; some numbers are missing in each file.
For ex. datfile1 may have the numbers in order 1 2 4 6 7 8 12 ... 24996 24999
while datfile2 may have them in order 1 3 5 6 8 9 ... 24999 25000

For each data file, I want to create a new one which has only the integers greater than 16000.
For ex. newdatfile1 may have numbers 16001 16003 16004 ... 24996 24999

The problem is I know very little about UNIX. So, any help will be highly appreciated.
Is this a homework assignment? If not, what is creating your input files?

What shell are you using?

Are all of the integers on a single line in your input files?
# 3  
Old 07-07-2013
Quote:
Is this a homework assignment? If not, what is creating your input files?

What shell are you using?

Are all of the integers on a single line in your input files?

This is not a homework assignment. The integers are the number of days after a certain date, and data is not available for some days, so some days are missing. It's actually for a project I'm working on. I am creating a regression model from the data using AMPL. Now I would like to analyse just the data after a certain date, so I thought there must be an easy way to separate out the data I want using UNIX.

I am using bash, and each line contains a single integer.
# 4  
Old 07-07-2013
Here are a couple of ways to do it:
Code:
#!/bin/bash
echo 'processing datfile1 using while loop:'
while read x
do      if [ "$x" -gt 16000 ]
        then    printf "%d\n" "$x"
        fi
done < datfile1
echo 'processing datfile2 using awk:'
awk '$1 > 16000' datfile2

If datfile1 contains:
Code:
1
2
4
6
7       
8       
12
24996
24999

and datfile2 contains:
Code:
1
3
5
6
8       
9       
12
24999
25000

the output produced is:
Code:
processing datfile1 using while loop:
24996
24999
processing datfile2 using awk:
24999
25000

If you are running this on a Solaris system, use /usr/xpg4/bin/awk, /usr/xpg6/bin/awk, or nawk instead of /bin/awk or /usr/bin/awk.

Although this was written using bash (as you requested), this script will work with any shell that recognizes basic Bourne shell syntax.
# 5  
Old 07-07-2013
Thank you so much. But does that also create a new file and write all the "good" data into it without overwriting?
# 6  
Old 07-07-2013
Somebody here could probably do the same thing more easily in ruby, perl, or awk, but this should at least get you started ...
Code:
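#!/bin/bash
# Usage: pass the threshold (for example 16000) as the first argument.
int=$1

PATH=/usr/bin:/bin:/usr/sbin:/sbin

cd /dir || exit 1
for datfile in datfile*
do
   sort -n "$datfile" > "/tmp/${datfile}.tmp"
   # Line number of the first value >= $int. An exact match with grep
   # finds nothing when that number is missing from the file.
   start=$(awk -v n="$int" '$1 >= n {print NR; exit}' "/tmp/${datfile}.tmp")
   if [ -n "$start" ]
   then
      sed -n "${start},\$p" "/tmp/${datfile}.tmp" > "/dir/${datfile}.${int}"
   fi
   rm -f "/tmp/${datfile}.tmp"
done

exit 0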

# 7  
Old 07-07-2013
Quote:
Originally Posted by latsyrc
Thank you so much. But does that also create a new file and write all the "good" data into it without overwriting?
The scripts shown just write the extracted data to standard output. The original data (datfile1 and datfile2) are not modified. If you want to save the output in files, try something like:
Code:
#!/bin/bash
echo 'processing datfile1 using while loop:'
while read x
do      if [ "$x" -gt 16000 ]
        then    printf "%d\n" "$x"
        fi
done < datfile1 > datfile1.mod
echo 'processing datfile2 using awk:'
awk '$1 > 16000' datfile2 > datfile2.mod

Then look at the contents of datfile1.mod and datfile2.mod after it finishes. You could also pipe the output into another stream of processes to work with the data without creating temporary files, replace the data in the original files, or pick any of many other options depending on what you want to do. If you tell us what you are trying to do, we might be able to suggest a way of saving the output that suits your situation.
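Since you mentioned having a large number of data files, here is a minimal sketch of how the awk filter might be wrapped in a loop. The function name filter_above is made up for illustration, and the new<name> output naming is an assumption based on your newdatfile1 example. It skips any input whose output file already exists instead of overwriting it.

```shell
#!/bin/bash
# Hypothetical helper: filter_above THRESHOLD FILE...
# For each FILE, write the lines whose first field is numerically greater
# than THRESHOLD to a new file named new<FILE> in the same directory.
filter_above() {
    threshold=$1
    shift
    for f in "$@"
    do
        dir=$(dirname -- "$f")
        base=$(basename -- "$f")
        out="$dir/new$base"
        # Never overwrite an existing output file; report it and move on.
        if [ -e "$out" ]
        then    printf 'skipping %s: %s already exists\n' "$f" "$out" >&2
                continue
        fi
        awk -v min="$threshold" '$1 > min' "$f" > "$out"
    done
}
```

With the function defined, running filter_above 16000 datfile* in the directory holding the data should leave the originals untouched and create newdatfile1, newdatfile2, and so on. Because the pattern datfile* does not match names starting with "new", a second run will not pick up the output files as input.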