Passing multiple files to awk for processing in bash script


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Passing multiple files to awk for processing in bash script
# 8  
Old 06-16-2014
I imagine the hadoop ls output isn't just a list of filenames - you'd need to pre-process it to get just the names.

However, a quick look at a hadoop man page seems to imply that you can't access the files directly, so you would need a local copy of each to do it by file anyway.

If you're not using the filename (or other per-file processing) then you could cat the files into awk. Something like:
Code:
hadoop fs -cat /user/user/data/file* | awk '{stuff}'

(assuming hadoop cat doesn't add anything to the output, else you'd need to pre-process it)

You could also adapt Scrutinizer's suggestion if you need per-file processing.

Last edited by CarloM; 06-16-2014 at 11:21 AM..
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Loop through multiple files in bash script

Hi Everybody, I'm a newbie to shell scripting, and I'd appreciate some help. I have a bunch of .txt files that have some unwanted content. I want to remove lines 1-3 and 1028-1098. #!/bin/bash for '*.txt' in <path to folder> do sed '1,3 d' "$f"; sed '1028,1098 d' "$f"; done I... (2 Replies)
Discussion started by: BabyNuke
2 Replies

2. Shell Programming and Scripting

Plink (processing multiple commands) using Bash

I'm completely brand new to bash scripting (migrating from Windows batch file scripting). I'm currently trying to write a bash script that will automatically reset "error-disabled" Cisco switch ports. Please forgive the very crude and inefficient script I have so far (shown below). It is... (10 Replies)
Discussion started by: MKANET
10 Replies

3. Programming

awk processing / Shell Script Processing to remove columns text file

Hello, I extracted a list of files in a directory with the command ls . However this is not my computer, so the ls functionality has been revamped so that it gives the filesizes in front like this : This is the output of ls command : I stored the output in a file filelist 1.1M... (5 Replies)
Discussion started by: ajayram
5 Replies

4. Shell Programming and Scripting

Processing multiple files awk

hai i need my single awk script to act on 4 trace files of ns2 and to calculate througput and it should print result from each trace file in a single trace file. i tried with the following code but it doesnt work awk -f awkscript inputfile1 inputfile2 inputfile3 inputfile4>outputfile ... (4 Replies)
Discussion started by: sarathyy
4 Replies

5. Shell Programming and Scripting

Passing multiple files to awk

Hi all, I have a load of files in the format e.g. a_1.out a_300.out a_20.out etc I would like to numeric sort them in ascending order by the number in the file name, then pass them into awk for manipulation. How do I do this? (8 Replies)
Discussion started by: jimjam
8 Replies

6. Shell Programming and Scripting

Bash script to copy timestamps of multiple files

Hi, I have a bunch of media files in a directory that have been converted (from MTS to MOV format), so my directory contains something like this: clip1.mts clip1.mov clip2.mts clip2.mov The problem is that the .mov files that have been created have the timestamps of the conversion task,... (2 Replies)
Discussion started by: Krakus
2 Replies

7. Shell Programming and Scripting

bash script to compile multiple .c files with some options

I'm trying to write a bash script and call it "compile" such that running it allows me to compile multiple files with the options "-help," "-backup," and "-clean". I've got the code for the options written, i just can't figure out how to read the input string and then translate that into option... (5 Replies)
Discussion started by: travis.batzer
5 Replies

8. Shell Programming and Scripting

awk script processing data from 2 files

Hi! I have 2 files containing data that I need to process at the same time, I have problems in reading a different number of lines from the different files. Here is an explanation of what I need to do (possibly with an awk script). File "samples.txt" contains data in the format: time_instant... (6 Replies)
Discussion started by: Alice236
6 Replies

9. UNIX for Dummies Questions & Answers

single output of awk script processing multiple files

Helllo UNIX Forum :) Since I am posting on this board, yes, I am new to UNIX! I read a copy of "UNIX made easy" from 1990, which felt like a making a "computer-science time jump" backwards ;) So, basically I have some sort of understanding what the basic concept is. Problem Description:... (6 Replies)
Discussion started by: Kasimir
6 Replies

10. Shell Programming and Scripting

How to write bash script to explode multiple zip files

I have a directory full of zip files. How would I write a bash script to enumerate all the zip files, remove the ".zip" from the file name, create a directory by that name and unzip each zip file into its corresponding directory? Thanks! Siegfried (3 Replies)
Discussion started by: siegfried
3 Replies
Login or Register to Ask a Question
RBASH(1)						      General Commands Manual							  RBASH(1)

NAME
rbash - restricted bash, see bash(1) RESTRICTED SHELL
If bash is started with the name rbash, or the -r option is supplied at invocation, the shell becomes restricted. A restricted shell is used to set up an environment more controlled than the standard shell. It behaves identically to bash with the exception that the follow- ing are disallowed or not performed: o changing directories with cd o setting or unsetting the values of SHELL, PATH, ENV, or BASH_ENV o specifying command names containing / o specifying a file name containing a / as an argument to the . builtin command o specifying a filename containing a slash as an argument to the -p option to the hash builtin command o importing function definitions from the shell environment at startup o parsing the value of SHELLOPTS from the shell environment at startup o redirecting output using the >, >|, <>, >&, &>, and >> redirection operators o using the exec builtin command to replace the shell with another command o adding or deleting builtin commands with the -f and -d options to the enable builtin command o using the enable builtin command to enable disabled shell builtins o specifying the -p option to the command builtin command o turning off restricted mode with set +r or set +o restricted. These restrictions are enforced after any startup files are read. When a command that is found to be a shell script is executed, rbash turns off any restrictions in the shell spawned to execute the script. SEE ALSO
bash(1) GNU Bash-4.0 2004 Apr 20 RBASH(1)