How to target certain delimiter to split text file?
Post 302947972 by bakunin on Wednesday 24th of June 2015 09:37:55 AM
Quote:
Originally Posted by huiyee1
This is not an assignment. I am learning linux by myself. I thought that I might face the similar situation in the future. I have come out with a few solutions. But they are rather complicated.
This is OK. We want to help people help themselves. This is why we ask what they have tried - even if it didn't work - so we can show them where they went wrong.

Further, we have a special forum for "Homework and Coursework" because we help students as well. The difference is that special rules apply there and we (try to) help in a different way, so that the student gets the most education out of our help. This was the background of Don Cragun's and my questions.

Quote:
Originally Posted by huiyee1
To generate the first output file:
Code:
cat input | rev | cut -d"_" -f1 | rev > last_field  #this generates file containing the last field

Note that you usually do not need "cat" to feed a file into a pipeline. If you look at the man page of "rev" you will see (this is taken from an AIX man page, yours might look slightly different):

Code:
rev Command

Purpose

       Reverses characters in each line of a file.

Syntax

       rev [ File ... ]

This means the following two lines do the same, but the second one uses one command ("cat") less, which is why it is preferable:

Code:
cat /path/to/file | rev
rev /path/to/file

When you look up "useless use of cat" on the internet you will find many more examples of the same error. It is so common that it has become part of "UNIX culture".
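To make the equivalence concrete, here is a small sketch. The temporary file and its one-line content are made up for illustration, and "rev" is assumed to be installed (on Linux it usually comes with util-linux). It also shows the quoted pipeline with the leading "cat" dropped:

```shell
# Hypothetical sample file; name and content are invented for this demo.
sample=$(mktemp)
printf 'abc_def_ghi\n' > "$sample"

with_cat=$(cat "$sample" | rev)   # extra "cat" process just to feed the pipe
as_arg=$(rev "$sample")           # rev opens the file itself
via_stdin=$(rev < "$sample")      # the shell connects the file to rev's stdin

# The pipeline from the quoted post, minus the "cat":
# reverse the line, take the first "_"-separated field, reverse it back.
last_field=$(rev "$sample" | cut -d"_" -f1 | rev)

printf '%s\n' "$with_cat" "$as_arg" "$via_stdin" "$last_field"
rm -f "$sample"
```

All three "rev" variants should print the same reversed line, and the last line printed is the final field of the sample record.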


Quote:
Originally Posted by huiyee1
Is there any improved one-liner commands to generate the above output files?
As a matter of fact there are: you might want to learn a bit of sed (see "man sed" for help) and look around here in the forum. Here is a link to an introductory article:

Regular expression introduction

sed ("stream editor") is a non-interactive text editor or, looking at it differently, a programmable text manipulation program. Its most basic use is to look for some pattern in a text and then manipulate the matching part (delete it, add to it, etc.).

Here is a simple sed program:

Code:
sed 's/abc/def/' /path/to/input > /path/to/output

It takes a file "/path/to/input", executes the program "s/abc/def/" on it and writes the result to file "/path/to/output". The program itself does a "substitution" ("s") of a fixed string "abc" by a fixed string "def". This replacement is done in every line once - for the first occurrence of "abc". It is possible to replace every occurrence instead by adding a "g" (global) to the end of the command:

Code:
sed 's/abc/def/g' /path/to/input > /path/to/output
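The difference between the two commands is easiest to see on a line that contains the search string more than once. This is only a sketch: the input line is made up, and it is fed in through a pipe instead of being read from a file:

```shell
# Made-up input line containing "abc" twice.
line='abc 123 abc'

# Without "g": only the first "abc" on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/abc/def/')

# With "g": every "abc" on the line is replaced.
every_one=$(printf '%s\n' "$line" | sed 's/abc/def/g')

printf '%s\n' "$first_only"   # def 123 abc
printf '%s\n' "$every_one"    # def 123 def
```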

It should be easy to see how you could do the text manipulation you have in mind with such a substitution, provided you craft the search and substitution patterns correctly. Since your intention is to learn UNIX I won't tell you outright what the solution is. You might want to try it yourself first. If you have further questions feel free to ask.

I hope this helps.

bakunin
 
