Reversing large data set with awk?


 
# 1  
Old 12-28-2017

Hello!

I have quite a bit of map data that I have to edit. I originally had a DOS script that would reverse x1, y1 coordinates in order to change the direction of a particular segment in a map file. It worked wonderfully and all was well, but my bossman told me that there is a boatload of nodes that make up the segments I have flipped. I was not made aware of this when working with DOS.

I have flipped the segment x1, y1 - x2, y2 >> x2, y2 - x1, y1 in order to change the direction of the segments contained in the .asc file I am working with. But!! Now I have a file that contains all of the node data for each segment that exists on the map file in question. The x and y values are now correct and have been flipped so that the streets are going in the correct direction, but now the node values are backward with regard to how the streets are numerically valued (x / y). (This is Edulog data, if anyone is curious.)

What I need to be able to do is essentially flip the node data (like I did with the segment data).
  • A segment represents a street
  • A street is made up of nodes
  • Each street has a direction either going left to right or right to left.

My task is to reverse the direction of the segments (which I did in my DOS script) and also reverse the incrementation of the nodes contained within each segment. Each node has its own X1, Y1 data that increments in the direction that the segment is going in. (Like addresses going up a street, a way to represent that you are going up / down the street.)

I would like to run a script that would reverse the direction (#) of the nodes:

Example of what the data looks like - The data is read from left to right.
(a more thorough example is pasted below)

Code:
23732 N 23732 N 3 Y (1) 1035678 406785 (2) 1035676 406814 (3) 1035668 406858

After the script is run the data would (ideally) look like:
Code:
23732 N 23732 N 3 Y (3) 1035668 406858 (2) 1035676 406814 (1) 1035678 406785

I think the easiest way to accomplish this is to leave the node x,y values alone and just reverse the order in which they are numbered (the # values wrapped in ()) like I have shown above.

So far, the most node points that a segment has had is 105 nodes: 105 X1, Y1 pairs that I would like to reverse. In the future, some files may have more than 105 X, Y coordinate values (that is, more than 105 nodes).

The approach I have in mind is to have a script that will create 105 empty variable objects at execution:
  • It will process each line of data and assign each node and its two values to corresponding objects containing the N#, X1, Y1
  • It will then go through and reverse the order of the variable objects
  • n(3) X1 Y1 n(2) X1 Y1 n(1) X1 Y1...
  • Spit it out into a new file
  • Clear the variables
  • Move on to the next line and repeat the process until there are no more lines to process
I'm having trouble accounting for the possibility of more than 105 potential node values to process.
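
A fixed set of 105 variables should not be needed: awk splits every line into fields and reports the count in NF, so the number of nodes can be worked out per line. The sketch below follows the steps listed above (collect each node triple, emit the triples in reverse, move on to the next line). It assumes a POSIX shell and awk, that the first six fields are the segment header, and that everything after them comes in "(n) x y" triples. The name reverse_nodes is made up for illustration.

```shell
# reverse_nodes: hypothetical helper; reads the named files (or stdin)
# and writes each line with its node triples in reverse order.
reverse_nodes() {
  awk '{
    # Collect the triples that follow the six header fields.
    n = 0
    for (i = 7; i + 2 <= NF; i += 3) {
      n++
      node[n] = $i " " $(i + 1) " " $(i + 2)
    }
    # Rebuild the line: header first, then the triples from last to first.
    line = $1
    for (i = 2; i <= 6; i++) line = line " " $i
    for (k = n; k >= 1; k--) line = line " " node[k]
    print line
  }' "$@"
}

printf '%s\n' '23732 N 23732 N 3 Y (1) 1035678 406785 (2) 1035676 406814 (3) 1035668 406858' | reverse_nodes
# prints: 23732 N 23732 N 3 Y (3) 1035668 406858 (2) 1035676 406814 (1) 1035678 406785
```

Because n is reset on every line and only node[1] through node[n] are read back, nothing has to be cleared between lines, and a line with 200 nodes is handled the same way as a line with one.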

Example of the data I am working with:
Code:
40 Y 40 N 1 N (1) 961884 641632
41 Y 41 N 1 N (1) 967487 627129
42 Y 42 N 1 N (1) 967424 627104
44 Y 44 N 1 N (1) 977911 620540
46 Y 46 N 2 N (1) 979073 620398 (2) 978884 620434
47 Y 47 N 1 N (1) 977997 620602
48 Y 48 N 4 N (1) 979093 620314 (2) 979004 620332 (3) 978913 620350 (4) 978640 620401
56 Y 56 N 1 N (1) 979284 568834
57 Y 57 N 5 N (1) 979494 568276 (2) 979231 568622 (3) 979210 568652 (4) 978921 569034
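
In the sample above, field 5 looks like the node count for the line (this is inferred from the data, not stated anywhere), and the last line says 5 but lists only four triples. A small check along these lines could flag such lines before anything is reversed. It is only a sketch under that assumption; check_counts is a made-up name.

```shell
# check_counts: hypothetical helper; reports lines where field 5 does not
# match the number of "(n) x y" triples after the six header fields.
check_counts() {
  awk '{
    triples = (NF - 6) / 3
    if (triples != $5)
      printf "line %d: field 5 says %s, found %s node triples\n", NR, $5, triples
  }' "$@"
}

printf '%s\n' \
  '46 Y 46 N 2 N (1) 979073 620398 (2) 978884 620434' \
  '57 Y 57 N 5 N (1) 979494 568276 (2) 979231 568622 (3) 979210 568652 (4) 978921 569034' |
  check_counts
# prints: line 2: field 5 says 5, found 4 node triples
```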

I have attached an example file of the ShapeData.

Last edited by vbe; 12-28-2017 at 12:46 PM.. Reason: code tags please
# 2  
Old 12-28-2017
Like so, for example? Give this a try:
Code:
awk '{for(i=1; i<=6; i++) print $i; for(i=NF-2; i>=7; i-=3) print $i, $(i+1), $(i+2); printf RS}' ORS=" " file


Last edited by Scrutinizer; 01-02-2018 at 02:51 PM..
# 3  
Old 12-29-2017
You can use sed too.
If the data don't contain _
Code:
sed 's/(/_/g;:A;s/\([^_]*\)\(.*\)_\([^_]*\)/\1(\3 \2/;tA;s/  / /g;s/  / /g;s/ $//' infile

# 4  
Old 01-02-2018
Thank you so much for your reply! Also happy new year!
I am attempting to get this particular line of code to work, but every time I run it from Command Prompt (after installing awk on this machine) it spits out a blank file. I have named the script "nodesTwo", and in Command Prompt (in order to invoke the script) I am typing the following, where ShapeData.txt is the file I am running the script on and ShapeDataFixed.txt is the intended output file:

Code:
awk nodesTwo ShapeData.txt>ShapeDataFixed.txt

Also!
In the line of code that you so kindly provided, there is an 'f' following the word 'print' just before RS}'
Was this a typo or is this required? Is this telling the script to print a file?
Thanks again! Your knowledge and experience are very much appreciated.

I also tried to place ShapeData.txt>ShapeDataFixed.txt into the awk script itself and run the script but the command prompt just hangs and seems to be doing nothing...
Your beautiful example in practice (EX):

Code:
awk '{for(i=1; i<=6; i++) print $i; for(i=NF-2; i>=7; i-=3) print $i, $(i+1), $(i+2); printf RS}' ORS=" " ShapeData.txt>ShapeDataFixed.txt

When I attempt to run this entire line (instead of just invoking the script inside the command prompt), I am given "The system cannot find the file specified", despite the file being inside the same directory as the currently executing script?
So sorry for the basic vibes I'm giving off! I've never used awk before this!
So any and all help is so so appreciated! Thank you for your time and patience!

---------- Post updated at 08:56 AM ---------- Previous update was at 08:53 AM ----------

Quote:
Originally Posted by ctac_
You can use sed too.
If the data don't contain _
Code:
sed 's/(/_/g;:A;s/\([^_]*\)\(.*\)_\([^_]*\)/\1(\3 \2/;tA;s/  / /g;s/  / /g;s/ $//' infile

This is excellent! So appreciative to have another approach to this issue. Thank you for your suggestion and example! Both are gorgeous!
When I try to run the script on a specific file in the directory of the executing script, it gives me a "line 2: unterminated 's' command". I'm having trouble locating where the particular s command is! So many /'s and \'s!

Also!
Maybe I am not executing the script correctly? I hope I am!
I am running it with

sed -f nodes.sed

where 'nodes.sed' is the name of the file I would like the script to execute on (this file is in the same directory as the script)
Any and all help is appreciated!

Last edited by Scrutinizer; 01-02-2018 at 01:00 PM.. Reason: A different approach!; [mod] code tags
# 5  
Old 01-02-2018
What is your OS and version?
# 6  
Old 01-02-2018
Hello!
OS: Windows 7
Version: 6.1
# 7  
Old 01-02-2018
In that case save this as "script.awk" or a name of your choice:
Code:
BEGIN {
  ORS=FS
}
{
  for(i=1; i<=6; i++)
    print $i
  for(i=NF-2; i>=7; i-=3)
    print $i, $(i+1), $(i+2)
  printf RS
}

And execute like this:
Code:
awk -f script.awk file > outfile
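
Keeping the program in a file also sidesteps command-line quoting, which differs between shells; as far as I know, cmd.exe does not treat single quotes as quoting characters, so the < and > inside a one-liner can end up being read as redirections. The run below is a self-contained sketch in a POSIX shell, used here only so that it can create its own scratch files; the awk -f line itself is the same one shown above. The scratch directory and the sample line (taken from the first post) are there purely for the demonstration.

```shell
# Work in a scratch directory so nothing existing is overwritten.
workdir=$(mktemp -d)

# The program from the post above, saved to a file.
cat > "$workdir/script.awk" <<'EOF'
BEGIN {
  ORS=FS
}
{
  for(i=1; i<=6; i++)
    print $i
  for(i=NF-2; i>=7; i-=3)
    print $i, $(i+1), $(i+2)
  printf RS
}
EOF

# One line of sample data from the first post.
printf '%s\n' '46 Y 46 N 2 N (1) 979073 620398 (2) 978884 620434' > "$workdir/file"

awk -f "$workdir/script.awk" "$workdir/file" > "$workdir/outfile"
cat "$workdir/outfile"
# prints: 46 Y 46 N 2 N (2) 978884 620434 (1) 979073 620398
```

Since ORS is set to a space, every output line ends with one extra space before the newline. If that matters to whatever reads the file, it can be trimmed afterwards.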
