Top Forums Shell Programming and Scripting Awk and duplicate lines - little complicated Post 302606405 by shadowww on Sunday 11th of March 2012 09:04:19 AM
03-11-2012
Awk and duplicate lines - little complicated

So I've got a problem that continues my previous one from a few months ago:
unix.com/shell-programming-scripting/171764-delete-duplicate-lines-twist.html

The good, proven, working solutions for that old problem are these:
Code:
awk '{cur=$0; gsub(/[^[:alnum:]]/, "", cur); if (!a[tolower(cur)]++) print}'

and
Code:
awk '{s=tolower($0);gsub("[^[:alnum:]]","",s);x[s]=$0} END {for(i in x) print x[i]}'
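
To see how the two one-liners above behave on the same input, here is a small sketch. The sample text and the variable names (sample, first, second) are made up for illustration and are not from the original thread. Both commands treat "Hello, World!" and "hello world" as duplicates, but the first command keeps the first such line and the second keeps the last one.

```shell
# Hypothetical sample input: the first two lines normalize to the same key.
sample='Hello, World!
hello world
Other line'

# First approach: prints the first line seen for each key, in input order.
first=$(printf '%s\n' "$sample" |
  awk '{cur=$0; gsub(/[^[:alnum:]]/, "", cur); if (!a[tolower(cur)]++) print}')

# Second approach: remembers the last line seen for each key and prints in END.
# The "for (i in x)" order is unspecified, so the output is sorted here.
second=$(printf '%s\n' "$sample" |
  awk '{s=tolower($0);gsub("[^[:alnum:]]","",s);x[s]=$0} END {for(i in x) print x[i]}' |
  LC_ALL=C sort)

printf 'first approach:\n%s\n' "$first"
printf 'second approach:\n%s\n' "$second"
```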

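These two approaches find the same duplicates. The first keeps the first line of each duplicate group, in input order; the second keeps the last line of each group and prints in unspecified order (the final order of lines is really unimportant for me).
Either of these one-liners is what I now need modified to work a little differently, and that is the purpose of this new topic: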

I no longer need awk to compare whole lines when searching for duplicates in the file, only the first part of each line, up to the first '*' (asterisk). The asterisk is the separator in my file, and awk should ignore everything that comes after it, as if it had reached the end of the line. An asterisk occurs in every line of the file, but sometimes there is more than one per line; this should not confuse awk, which should still consider only the part of the line before the first asterisk.

If someone can come up with a good solution for this, it would save me a week of work... and earn my eternal gratitude :)
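
One possible way to adapt the first one-liner to the requirement described above, as a sketch only and not a reply from the thread: build the comparison key from the text before the first asterisk, normalize it the same way as before, and print the whole original line the first time each key appears. The function name dedupe_before_star and the file names in the usage note are hypothetical.

```shell
# Hypothetical helper: keep the first line for each distinct key, where the key
# is the text before the first asterisk, lowercased, with every
# non-alphanumeric character removed.
dedupe_before_star() {
  awk '{
    key = $0
    sub(/\*.*/, "", key)              # drop the first asterisk and everything after it
    key = tolower(key)
    gsub(/[^[:alnum:]]/, "", key)     # same normalization as the earlier one-liner
    if (!seen[key]++) print           # print the full, unmodified line once per key
  }' "$@"
}
```

It could be run as dedupe_before_star input.txt > output.txt, or at the end of a pipeline. Lines whose text before the first asterisk normalizes to nothing (for example, a line that starts with an asterisk) all share one empty key, so only the first of them is kept.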
 
