Reformat MLS Data - Use AWK?


 
# 1  
Old 11-01-2011

I am helping my wife set up a real estate site and I am starting to integrate MLS listings. We are using a HostGator level 5 VPS running CentOS and have full root and SSH access to the VPS.

Thus far I have automated the daily FTP download of listings from our MLS server using a little sh script. The download is a 90MB text file, pipe-delimited, with about 150 fields.

I have also settled on the IProperty component from TheThinkery, a Joomla extension, as the basis for the property search engine. IP uses a comma-delimited text file for input/output and expects the fields in a different order.

Basically I just need to
1. convert pipe-delimited to comma-delimited and add double-quotes for all text fields.
2. re-order the fields.

Is AWK or Perl the way to go? I have confirmed AWK is installed. I had an O'Reilly AWK book fifteen years ago and actually read the whole thing, believe it or not, but I haven't written a line of AWK in probably 10 years. :)

Would someone be willing to help me with this for a fee? I assume if you know what you're doing it would take less than an hour. I will do the grunt work re-ordering fields.

By the way, I came across this forum when searching and landed on this thread --
"Convert CSV file (with double quoted strings) to pipe delimited file"

That's sort of like what I want to do except the opposite. (Also I need to re-order fields).

Thanks!
Eric
# 2  
Old 11-01-2011
You can start with this skeleton.

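Treat "mx" as a stand-in for any text field, and repeat the quoting line for each one.
Code:
#!/usr/bin/ksh
# Skeleton only: replace "..." with the remaining field variables.
while IFS='|' read -r m1 m2 m3 ... mn; do
  mx='"'${mx}'"'    # wrap one text field in double quotes
  echo "${m1},${m2},${m3},...,${mn}"
done < Inp_File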
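
To make the skeleton above concrete, here is a minimal sketch for a made-up three-field record where only the second field is text. The function name quote_three_fields and the field layout are assumptions for illustration, not the real MLS layout. A shell loop like this is also likely to be much slower than awk on a 90MB file.

```shell
# Hypothetical three-field instance of the skeleton.
# Reads pipe-delimited lines on stdin, writes comma-delimited lines on stdout.
quote_three_fields() {
    while IFS='|' read -r m1 m2 m3; do
        m2='"'${m2}'"'                          # field 2 is assumed to be text
        printf '%s,%s,%s\n' "$m1" "$m2" "$m3"   # no trailing comma
    done
}
```

For example, `printf '7|Oak Ave|3\n' | quote_three_fields` prints `7,"Oak Ave",3`.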

This User Gave Thanks to Shell_Life For This Post:
# 3  
Old 11-01-2011
Quote:
Originally Posted by Chicago_Realtor
Basically I just need to
1. convert pipe-delimited to comma-delimited and add double-quotes for all text fields.
2. re-order the fields.

Is AWK or Perl the way to go?
Almost anything will work. awk would probably be simpler for such a straightforward translation:
Code:
awk -v FS="|" -v OFS="," '{ for(N=1; N<=NF; N++) $N="\"" $N "\"" } 1' < input > output

As for reordering, that depends entirely on what you want to put where.
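
One caveat with the quoting one-liner above: if a field already contains a double quote, the output line will not parse cleanly as CSV. A possible sketch that doubles embedded quotes first (the usual CSV convention) is below. The function name pipe_to_csv is made up here, and it assumes no field contains an embedded newline or a literal pipe.

```shell
# Hypothetical helper: pipe-delimited text on stdin, quoted CSV on stdout.
pipe_to_csv() {
    awk -F'|' -v OFS=',' '{
        for (N = 1; N <= NF; N++) {
            gsub(/"/, "\"\"", $N)   # double any embedded quote
            $N = "\"" $N "\""       # then wrap the field in quotes
        }
        print
    }'
}
```

Usage would be along the lines of `pipe_to_csv < input > output`.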

---------- Post updated at 02:17 PM ---------- Previous update was at 02:11 PM ----------

Here's an improved version which you can just feed the order of fields you want into:

Code:
awk 'BEGIN {
    FS="|"
    split("5,4,3,2,1", O, ","); # The order of fields you want, 1 being first
}
{ 
        P=""
        for(N=1; O[N]; N++)
        {
                M=O[N];
                printf("%s\"%s\"", P, $M);
                P=","
        }
        printf("\n");
}' < input > output

That will print input fields 5, 4, 3, 2 and 1 in that order, each wrapped in double quotes: "field 5","field 4","field 3","field 2","field 1". Change "5,4,3,2,1" to whatever order you want.
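
The original request was to quote only the text fields, so here is one possible variation on the script above that leaves numeric-looking values bare and doubles any embedded quotes. The function name reorder_csv and the "looks numeric" test are assumptions; the thread does not say what IProperty expects, and things like ZIP codes with leading zeros would be treated as numbers by this test, so adjust to taste.

```shell
# Hypothetical helper: reorder_csv ORDER < pipe_file > csv_file
# ORDER is a comma-separated list of 1-based input field numbers.
reorder_csv() {
    awk -v ORDER="$1" 'BEGIN {
        FS = "|"
        CNT = split(ORDER, O, ",")   # CNT = number of output fields
    }
    {
        P = ""
        for (N = 1; N <= CNT; N++) {
            V = $(O[N])
            if (V ~ /^-?[0-9]+(\.[0-9]+)?$/) {
                printf("%s%s", P, V)        # looks numeric: leave bare
            } else {
                gsub(/"/, "\"\"", V)        # double embedded quotes
                printf("%s\"%s\"", P, V)    # text: wrap in quotes
            }
            P = ","
        }
        printf("\n")
    }'
}
```

For example, `printf '101|Main St|250000\n' | reorder_csv 3,1,2` prints `250000,101,"Main St"`.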
This User Gave Thanks to Corona688 For This Post:
# 4  
Old 11-01-2011
Fantastic! How much do I owe you both? Or I can donate to your favorite charities.

Eric
# 5  
Old 11-02-2011
I'm not picky. Wasn't technically charging for the work.