Data extraction and converting into .csv file.

 
# 1  
Old 03-01-2018

Hi All,

I have a data file that I need to extract records from and convert to CSV format:
1) Read sample_linebyline.txt, extract the lines that end with the string "----", and write them to a .csv file.

2) Read the flat file flatfile_sample.txt, which holds similar data but in one single line, extract the records that end with "----", and write them to another .csv file.

Can you please help me out with this?

Thanks in advance.
# 2  
Old 03-02-2018
No attempts / ideas / thoughts from your side?


Try
Code:
awk '/----\r$/ && $1=$1' OFS=, /tmp/sample_linebyline.txt 
840-1,1,ABCD,0010211-00,0012345678/012345678912,123456789012,2745.25-,----
840-1,1,ABCD,0010211-00,0012345678/012345678912,123456789012,2745.25,----

This takes into account that your file has DOS (CRLF) rather than *nix line terminators.

EDIT: And for your other file, which has several trailing spaces in the lines, try
Code:
awk '/---- *\r$/ && $1=$1' OFS=, /tmp/flatfile_sample.txt 
840-1,1,ABCD,0010211-00,0012345678/012345678912,123456789012,2745.25-,----,
840-1,1,ABCD,0010211-00,0012345678/012345678912,123456789012,2745.25,----,
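
The trailing comma in the second output most likely comes from the carriage return: with the default field separator, awk splits on runs of spaces, tabs and newlines, and a CR is not one of those, so after the trailing spaces it becomes a field of its own. A rough Python illustration of that splitting (the function name and sample strings are made up for the demo; this only approximates awk's behaviour):

```python
import re

def awk_default_fields(record):
    """Approximate awk's default field splitting (an illustration, not awk itself).

    With the default FS, awk separates fields on runs of spaces, tabs and
    newlines and ignores leading/trailing runs. A carriage return is not
    one of those separators, so it stays in the data.
    """
    return [field for field in re.split(r"[ \t\n]+", record) if field]

# Shortened, made-up records with a DOS carriage return at the end.
no_trailing_spaces = "2745.25- ----\r"
trailing_spaces = "2745.25- ----   \r"

# CR glued to the last field: no extra comma.
print(repr(",".join(awk_default_fields(no_trailing_spaces))))
# CR separated from "----" by spaces: it becomes its own field.
print(repr(",".join(awk_default_fields(trailing_spaces))))
```

In the first case the CR rides along at the end of the "----" field; in the second it is a separate last field, which shows up as the extra comma.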


Last edited by RudiC; 03-02-2018 at 02:10 AM..
# 3  
Old 03-02-2018
Save as convert.py
Run as python3 convert.py
Code:
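# Line-by-line file: keep only the lines that end with "----".
# Text mode translates CRLF to "\n", so DOS line endings still match.
with open('sample_linebyline.txt') as rf, open('sample_linebyline.csv', 'w') as wf:
    for line in rf:
        if line.endswith('----\n'):
            fields = line.split()
            print(",".join(fields), file=wf)

# Flat file: lines carry trailing spaces, so test for "----" anywhere in the line.
# split() with no arguments also discards the trailing whitespace.
with open('flatfile_sample.txt') as rf, open('flatfile_sample.csv', 'w') as wf:
    for line in rf:
        if '----' in line:
            fields = line.split()
            print(",".join(fields), file=wf)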

Output:
Code:
$ cat sample_linebyline.csv
840-1,1,ABCD,0010211-00,0012345678/012345678912,123456789012,2745.25-,----
840-1,1,ABCD,0010211-00,0012345678/012345678912,123456789012,2745.25,----

Code:
$ cat flatfile_sample.csv
840-1,1,ABCD,0010211-00,0012345678/012345678912,123456789012,2745.25-,----
840-1,1,ABCD,0010211-00,0012345678/012345678912,123456789012,2745.25,----

# 4  
Old 03-02-2018
Thank you RudiC and Aia for your help (Python is not installed on our system, otherwise I would have tried your option).
I tried the awk commands below on my end, and they work for the data file that has one record per line.
Code:
awk 'BEGIN {OFS=","} {print $1,$2,$3,$4,$5,$6,$7,$8}' D:/tmp/file > tmp.txt
awk '/----/' D:/tmp/tmp.txt >tmp1.txt

But I am having an issue reading the flat file (a single-line file), which throws this error:
awk: line 0 (NR=0): line too long: limit 20000

Could you please help or suggest something?
Thanks
# 5  
Old 03-02-2018
For the sample attached in post#1, the file command returns
Code:
file /tmp/flatfile_sample.txt 
/tmp/flatfile_sample.txt: ASCII text, with very long lines, with CRLF line terminators

and the line length is between 1155 and 1159.

How does this differ from your real data file? What are your OS and tool versions? Why the tedious awk conversion from D:/tmp/file to an intermediate file?
# 6  
Old 03-02-2018
Hi Rudi,
All the data in flatfile_sample.txt is on one single line, i.e. it has tens of thousands of records on one line instead of one record per line as in a normal file.
The issue here is reading this one line from the file. We are using the MKS tools on Windows for shell scripting.
I will use the suggestion from your earlier post; I only mentioned what I had tried.

Thanks
# 7  
Old 03-02-2018
How then do you tell one record from the other? Are they fixed length? Or is there another, non-LF line terminator?
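
If the records do turn out to be fixed length, the file can be read in record-sized pieces rather than as one huge line, so nothing has to hold the whole line at once. A minimal Python sketch under that assumption; the record length of 32, the function names and the sample data are hypothetical and only there for the demo:

```python
import io

def fixed_length_records(stream, record_len):
    """Yield successive record_len-character pieces of a text stream."""
    while True:
        chunk = stream.read(record_len)
        if not chunk:
            break
        yield chunk

def csv_rows(stream, record_len, marker="----"):
    """Yield comma-joined rows for records whose last field is the marker."""
    for record in fixed_length_records(stream, record_len):
        fields = record.split()  # also drops padding and any CR/LF
        if fields and fields[-1] == marker:
            yield ",".join(fields)

# Made-up demo data: two space-padded records of 32 characters on one line.
RECORD_LEN = 32  # assumption; the real record length is not known from the thread
one_line = (
    "840-1 1 ABCD 2745.25- ----".ljust(RECORD_LEN)
    + "840-1 2 EFGH 2745.25 1234".ljust(RECORD_LEN)
)

for row in csv_rows(io.StringIO(one_line), RECORD_LEN):
    print(row)
```

With real data, the opened file object and the actual record length would take the place of the StringIO demo. If the records are delimited some other way, the reading loop would have to split on that delimiter instead.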