03-06-2005
split string with multibyte delimiter
Hi,
I need to split a string using awk, cut, or other basic Unix commands (no programming), with a multibyte (multi-character) string as the delimiter.
Example:
abcd-efgh-ijkl
Split by -efgh- to get two segments: abcd and ijkl.
Is it possible?
Thanks
A.H.S
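One possible approach, as a minimal sketch rather than a definitive answer: awk accepts a field separator longer than one character and treats it as a regular expression. The delimiter here contains no regex metacharacters, so it matches literally. The function name split_on_delim is made up for illustration.

```shell
# Split a string on the literal multi-character delimiter "-efgh-".
# awk interprets a multi-character -F value as a regular expression;
# "-efgh-" has no special regex characters, so it is matched as-is.
split_on_delim() {
    printf '%s\n' "$1" | awk -F'-efgh-' '{ print $1; print $2 }'
}

split_on_delim 'abcd-efgh-ijkl'
# prints:
# abcd
# ijkl
```

If the delimiter contained characters such as . or *, they would need escaping before being used as the awk field separator.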
LEARN ABOUT OPENDARWIN
cut
CUT(1) BSD General Commands Manual CUT(1)
NAME
cut -- select portions of each line of a file
SYNOPSIS
cut -b list [-n] [file ...]
cut -c list [file ...]
cut -f list [-d delim] [-s] [file ...]
DESCRIPTION
The cut utility selects portions of each line (as specified by list) from each file and writes them to the standard output. If no file arguments are specified, or a file argument is a single dash ('-'), cut reads from the standard input. The items specified by list can be in terms of column position or in terms of fields delimited by a special character. Column numbering starts from 1.
The list option argument is a comma or whitespace separated set of increasing numbers and/or number ranges. Number ranges consist of a number, a dash ('-'), and a second number and select the fields or columns from the first number to the second, inclusive. Numbers or number ranges may be preceded by a dash, which selects all fields or columns from 1 to the first number. Numbers or number ranges may be followed by a dash, which selects all fields or columns from the last number to the end of the line. Numbers and number ranges may be repeated, overlapping, and in any order. It is not an error to select fields or columns not present in the input line.
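The list syntax described above can be illustrated with a short sketch. The sample string and the helper function names are made up for this example; only the cut invocations themselves come from the manual.

```shell
# Sample line; positions are 1-based, as the description states.
line='abcdefghij'

first_three() { printf '%s\n' "$1" | cut -c -3; }      # leading dash: positions 1 through 3
middle()      { printf '%s\n' "$1" | cut -c 4-6; }     # range: positions 4 through 6
from_eighth() { printf '%s\n' "$1" | cut -c 8-; }      # trailing dash: position 8 to end of line
mixed()       { printf '%s\n' "$1" | cut -c 1,3,5-6; } # comma-separated numbers and a range

first_three "$line"   # abc
middle "$line"        # def
from_eighth "$line"   # hij
mixed "$line"         # acef
```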
The options are as follows:
-b list
The list specifies byte positions.
-c list
The list specifies character positions.
-d delim
Use the first character of delim as the field delimiter character instead of the tab character.
-f list
The list specifies fields, delimited in the input by a single tab character. Output fields are separated by a single tab character.
-n Do not split multi-byte characters.
-s Suppress lines with no field delimiter characters. Unless specified, lines with no delimiters are passed through unmodified.
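As a small sketch of how -d, -f and -s interact, the following runs cut on two made-up input lines, one with the delimiter and one without. The function names are hypothetical. With -d :, the common implementations join the selected fields with the same ':' character.

```shell
# Select fields 1 and 3 using ':' as the delimiter.
# A line without any ':' is passed through unmodified.
fields_1_3() { cut -d : -f 1,3; }

# Same delimiter, but -s drops lines that contain no ':' at all.
only_delimited() { cut -d : -s -f 2; }

printf 'a:b:c\nno-delimiter\n' | fields_1_3
# prints:
# a:c
# no-delimiter

printf 'a:b:c\nno-delimiter\n' | only_delimited
# prints:
# b
```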
ENVIRONMENT
The LANG, LC_ALL and LC_CTYPE environment variables affect the execution of cut if the -n option is specified. Their effect is described in
environ(7).
EXAMPLES
Extract users' login names and shells from the system passwd(5) file as ``name:shell'' pairs:
cut -d : -f 1,7 /etc/passwd
Show the names and login times of the currently logged in users:
who | cut -c 1-16,26-38
DIAGNOSTICS
The cut utility exits 0 on success, and >0 if an error occurs.
SEE ALSO
paste(1)
STANDARDS
The cut utility conforms to IEEE Std 1003.2-1992 (``POSIX.2'').
HISTORY
A cut command appeared in AT&T System III UNIX.
BUGS
The -c option is a synonym for the -b option, which causes incorrect behaviour in locales that support multibyte characters.
When operating on fields (-f option is specified), cut does not recognise multibyte characters, and the delim character is recognised in the
middle of multibyte sequences.
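To see why byte-oriented selection matters for multibyte text, here is a hedged sketch that selects the first byte of a two-byte UTF-8 character. The function name is made up, and the example only demonstrates -b; how -c and -n behave depends on the implementation and locale.

```shell
# "é" in UTF-8 is the two bytes 0xC3 0xA9 (octal 303 251).
first_byte_count() {
    # Keep only byte 1, then count the bytes that cut wrote:
    # half of the character plus the trailing newline.
    printf '\303\251\n' | cut -b 1 | wc -c | tr -d ' '
}

first_byte_count   # 2
```

The output is no longer valid UTF-8, which is the kind of breakage the BUGS section warns about.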
BSD                                June 6, 1993                                BSD