Post 302821475 by happypoker on Friday 14th of June 2013 02:48:16 PM
Extract certain columns from big data

The dataset I'm working on is about 450 GB, with about 7,000 columns and 30,000,000 rows.
I want to extract about 2,000 columns from the original file to form a new file.
I have a list of the column numbers I need, but I don't know how to extract them.
Thanks!
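
One possible approach, sketched under assumptions the post does not confirm: the fields are tab-separated, and the wanted column numbers are stored one per line in a list file. The names `extract_columns`, `cols.txt`, `sample.tsv`, and `subset.tsv` are made up for illustration. The idea is to join the numbers with commas and hand them to `cut -f`, which streams the input line by line, so memory use should not depend on the size of the file.

```shell
#!/bin/sh
# Sketch: pull a chosen set of columns out of a large delimited file.
# Assumptions (not stated in the post): fields are tab-separated, and
# the wanted column numbers sit one per line in a plain-text list file.

# extract_columns LIST_FILE INPUT_FILE
# Joins the column numbers with commas and passes them to cut -f.
# Note: cut emits fields in their original order, whatever the list order.
extract_columns() {
    cut -f "$(paste -s -d, "$1")" "$2"
}

# Tiny demonstration with a 3-row, 5-column stand-in for the real data.
printf 'a1\ta2\ta3\ta4\ta5\nb1\tb2\tb3\tb4\tb5\nc1\tc2\tc3\tc4\tc5\n' > sample.tsv
printf '2\n4\n5\n' > cols.txt
extract_columns cols.txt sample.tsv > subset.tsv
```

For the real data the call would look like `extract_columns cols.txt bigfile > newfile`. If the file uses a delimiter other than a tab, `cut` needs a matching `-d` option. If the columns must come out in the order given in the list rather than in file order, `cut` is not enough and something like an `awk` loop over the listed field numbers would be needed instead.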
 
