Passing multiple files to awk for processing in bash script


 
# 8  
06-16-2014
I imagine the hadoop ls output isn't just a list of filenames - you'd need to pre-process it to get just the names.
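
As a rough sketch of that pre-processing step: assuming the listing looks like `ls -l` output, with a "Found N items" header and the path in the last field (check this against your actual output), awk can pull out just the names. The sample listing below is made up for illustration, and this approach would break on paths containing spaces.

```shell
# Hypothetical sample of the listing output; the real format may differ.
listing='Found 2 items
-rw-r--r--   3 user group  1234 2014-06-16 11:00 /user/user/data/file1
-rw-r--r--   3 user group  5678 2014-06-16 11:05 /user/user/data/file2'

# Keep only regular-file entries (lines starting with "-")
# and print the last whitespace-separated field, i.e. the path.
names=$(printf '%s\n' "$listing" | awk '/^-/ {print $NF}')
printf '%s\n' "$names"
```

In a real script the `printf '%s\n' "$listing"` part would be replaced by the actual listing command.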

However, a quick look at a hadoop man page seems to imply that awk can't open the files directly, so to process them file by file you would need a local copy of each anyway.

If you don't need the filename (or any other per-file processing), you could cat the files into awk. Something like:
Code:
hadoop fs -cat /user/user/data/file* | awk '{stuff}'

(assuming hadoop cat doesn't add anything to the output, else you'd need to pre-process it)
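
To show what you give up by piping, here is a small local demonstration using plain `cat` and two throwaway temp files (the file names and contents are invented for the example). When awk reads a pipe it sees one continuous stream, so FNR never resets; when it opens the files itself, FNR restarts at 1 for each file.

```shell
# Create two small sample files in a scratch directory.
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/file1"
printf 'c\nd\ne\n' > "$tmpdir/file2"

# Piped: awk reads a single stream, so FNR == 1 matches only once.
piped=$(cat "$tmpdir"/file* | awk 'FNR == 1 {n++} END {print n, NR}')

# Direct: awk opens each file, so FNR == 1 matches once per file.
direct=$(awk 'FNR == 1 {n++} END {print n, NR}' "$tmpdir"/file*)

echo "piped:  $piped"
echo "direct: $direct"

rm -rf "$tmpdir"
```

Both runs see the same five lines in total; only the piped run loses the file boundaries.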

You could also adapt Scrutinizer's suggestion if you need per-file processing.
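
One possible shape for per-file processing (not necessarily what Scrutinizer had in mind) is to stream each file separately and hand its name to awk as a variable. In this sketch `fetch` is a hypothetical stand-in for `hadoop fs -cat` so that it runs locally, and the sample files are invented.

```shell
# Stand-in for `hadoop fs -cat` so the sketch runs without a cluster (assumption).
fetch() { cat "$1"; }

# Create two small sample files in a scratch directory.
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/file1"
printf 'c\nd\ne\n' > "$tmpdir/file2"

# One awk invocation per file; the base name is passed in with -v
# because FILENAME is not useful when awk reads from a pipe.
report=$(for f in "$tmpdir"/file*; do
    fetch "$f" | awk -v fname="${f##*/}" 'END {print fname ": " NR " lines"}'
done)
printf '%s\n' "$report"

rm -rf "$tmpdir"
```

This starts one awk process per file, which is fine for a handful of files but could be slow for thousands.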

Last edited by CarloM; 06-16-2014 at 11:21 AM..