Shell Programming and Scripting: Extract certain columns from big data
Post 302821487 by rveri, Friday 14th of June 2013, 02:56:01 PM
happypoker,
You can use GNU awk: store the desired column numbers in an array, then print those fields for every line of the file.
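
A minimal sketch of that idea, assuming whitespace-delimited input; the column list "1,3,7" and the file names bigfile.txt and subset.txt are placeholders:

    awk -v cols="1,3,7" '
        BEGIN { n = split(cols, want, ",") }          # store the wanted column numbers in an array
        {
            for (i = 1; i <= n; i++)                  # print those fields, in the requested order
                printf "%s%s", $want[i], (i < n ? OFS : ORS)
        }
    ' bigfile.txt > subset.txt

For tab-delimited data, add -F'\t' on the command line and set OFS = "\t" in the BEGIN block.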
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

How to cut some data from big file

How to cut data from a big file? My file is around 30 GB. I tried "head -50022172 filename > newfile.txt" and "tail -5454283 newfile.txt"; it's slow. After that I tried "sed -n '46467831,50022172p' filename > newfile.txt", also slow. Please recommend a faster command to cut some data from... (4 Replies)
Discussion started by: almanto
4 Replies

2. Shell Programming and Scripting

Extract data based on match against one column data from a long list data

My input file: data_5 Ali 422 2.00E-45 102/253 140/253 24 data_3 Abu 202 60.00E-45 12/23 140/23 28 data_1 Ahmad 256 7.00E-45 120/235 140/235 22 data_4 Aman 365 8.00E-45 15/65 140/65 20 data_10 Jones 869 9.00E-45 65/253 140/253 18... (12 Replies)
Discussion started by: patrick87
12 Replies

3. Shell Programming and Scripting

Transpose columns to Rows : Big data

Hi, I read a few posts on the subject and tried out a few solutions, but they did not solve my problem. https://www.unix.com/302121568-post11.html https://www.unix.com/shell-programming-scripting/137953-large-file-columns-into-rows-etc-4.html Please help. The problem is very similar to the second link... (15 Replies)
Discussion started by: genehunter
15 Replies

4. Shell Programming and Scripting

Sort a big data file

Hello, I have a big data file (160 MB) full of records with pipe (|) delimited fields. I'm sorting the file on the first field. Sorting with the "sort" command takes me 6 minutes. I have tried some transformation methods in Perl, but they result in "Out of memory". I was... (2 Replies)
Discussion started by: rubber08
2 Replies

5. Red Hat

Linux in Big Data projects

Hey guys, we would be interested in learning from your experience using Linux in Big Data projects. Has anyone used Hadoop, MapR, or Hortonworks on Linux, and what experiences have you had with them? I am more interested in knowing if a certain distribution of Linux is better supported for... (1 Reply)
Discussion started by: johnsmith111
1 Replies

6. Shell Programming and Scripting

Extract certain entries from big file: Request to check

Hi all, I have a big file which I have attached here, and I have to fetch certain entries and arrange them in 5 columns: Name, Drug, DAP ID, disease, approved or not. In the attached file the data is arranged in tab-separated columns in this way: and other data is... (2 Replies)
Discussion started by: manigrover
2 Replies

7. What is on Your Mind?

Big Data for System Admins

Hello, I have been working as a Solaris/Linux admin for the past 8 years. I am looking at options for a profile change, but there are some limitations. I have worked in 24x7 admin support, server support, high availability, etc., but have also done some work on the development and scripting side. When I search for Big... (2 Replies)
Discussion started by: nightup2222
2 Replies

8. Shell Programming and Scripting

Compare 2 csv files by columns, then extract certain columns of matching rows

Hi all, I'm pretty much a newbie to UNIX. I would appreciate any help with UNIX coding for comparing two large csv files (greater than 10 GB in size) and outputting a file with matching columns. I want to compare file1 and file2 by the 'id' and 'chain' columns, then extract exact matching rows'... (5 Replies)
Discussion started by: bkane3
5 Replies

9. Shell Programming and Scripting

Want to extract certain lines from big file

Hi All, I am trying to get some lines from a file. I did it with a while-do loop, but since the files are huge it is taking a lot of time, and now I want to make it faster. The requirement is that the file will have 1 million lines. The format is like below: ##transaction, , , ,blah, blah... (38 Replies)
Discussion started by: mad man
38 Replies

10. Shell Programming and Scripting

Extract Big and continuous regions

Hi all, I have a file like this and I want to extract only those regions which are big and continuous: chr1 3280000 3440000 chr1 3440000 3920000 chr1 3600000 3920000 # region falling within 3440000 3920000, so I don't want it printed in the output chr1 3920000 4800000 chr1 ... (2 Replies)
Discussion started by: amrutha_sastry
2 Replies
Data::FormValidator::Filters(3pm)			User Contributed Perl Documentation			 Data::FormValidator::Filters(3pm)

NAME
       Data::FormValidator::Filters - Basic set of filters available in a
       Data::FormValidator profile.

SYNOPSIS
           use Data::FormValidator;

           %profile = (
               filters => 'trim',
               ...
           );

           my $results = Data::FormValidator->check( \%data, \%profile );

DESCRIPTION
       These are the builtin filters which may be specified as a name in the
       filters, field_filters, and field_filter_regexp_map parameters of the
       input profile.

       Filters are applied as the first step of validation, possibly
       modifying a copy of the validation data before any constraints are
       checked.

RECOMMENDED USE
       As a long-time maintainer and user of Data::FormValidator, I recommend
       that filters be used with caution. They immediately modify the input
       provided, so the original data is lost. The few I recommend include
       "trim", which removes leading and trailing whitespace. I have this
       turned on by default by using CGI::Application::Plugin::ValidateRM.
       It's also generally safe to use the "lc" and "uc" filters if you need
       that kind of data transformation.

       Beyond simple filters, I recommend transforming the "valid" hash
       returned from validation if further changes are needed.

PROCEDURAL INTERFACE
       You may also call these functions directly through the procedural
       interface, by either importing them directly or importing the whole
       :filters group. For example, if you want to access the trim function
       directly, you could either do:

           use Data::FormValidator::Filters (qw/filter_trim/);
           # or
           use Data::FormValidator::Filters (qw/:filters/);

           $string = filter_trim($string);

       Notice that when you call filters directly, you'll need to prefix the
       filter name with "filter_".

THE FILTERS
   FV_split
           use Data::FormValidator::Filters qw(FV_split);

           # Validate every e-mail in a comma-separated list
           field_filters => {
               several_emails => FV_split(qr/\s*,\s*/),

               # Any pattern that can be used by the 'split' builtin works.
               tab_sep_field  => FV_split(' '),
           },
           constraint_methods => {
               several_emails => email(),
           },

       With this filter, you can split a field into multiple values. The
       constraint for the field will then be applied to every value.

       This filter has a different naming convention because it is a
       higher-order function. Rather than returning a value directly, it
       returns a code reference to a standard Data::FormValidator filter.

       After successfully being validated, the values will appear as an
       arrayref.

   FV_replace
           use Data::FormValidator::Filters qw(FV_replace);

           field_filters => {
               first_name => FV_replace(qr/Mark/,'Don'),
           },

       FV_replace is a shorthand for writing simple find-and-replace filters.
       The above filter would be translated to this:

           sub { my $v = shift; $v =~ s/Mark/Don/; $v }

       For more complex filters, just write your own.

   trim
       Remove white space at the front and end of the fields.

   strip
       Runs of white space are replaced by a single space.

   digit
       Remove non-digit characters from the input.

   alphanum
       Remove non-alphanumeric characters from the input.

   integer
       Extract from its input a valid integer number.

   pos_integer
       Extract from its input a valid positive integer number.

       Bugs: This filter won't extract "9" from "a9+"; it will instead
       extract "9+".

   neg_integer
       Extract from its input a valid negative integer number.

       Bugs: This filter will currently filter the case of "a9-" to become
       "9-", which it should leave alone.

   decimal
       Extract from its input a valid decimal number.

       Bugs: Given "1,000.23", it will currently return "1.000.23".

   pos_decimal
       Extract from its input a valid positive decimal number.

       Bugs: Given "1,000.23", it will currently return "1.000.23".

   neg_decimal
       Extract from its input a valid negative decimal number.

       Bugs: Given "1,000.23", it will currently return "1.000.23".

   dollars
       Extract from its input a valid number to express dollars-like
       currency.

       Bugs: This filter won't currently remove trailing numbers like
       "1.234".

   phone
       Filter out characters which aren't valid for a phone number. (Only
       accepts digits [0-9], space, comma, minus, parentheses, period and
       pound [#].)

   sql_wildcard
       Transforms the shell glob wildcard (*) to the SQL LIKE wildcard (%).

   quotemeta
       Calls the quotemeta (quote non-alphanumeric characters) builtin on
       its input.

   lc
       Calls the lc (convert to lowercase) builtin on its input.

   uc
       Calls the uc (convert to uppercase) builtin on its input.

   ucfirst
       Calls the ucfirst (uppercase first letter) builtin on its input.

SEE ALSO
       o   Data::FormValidator

       o   Data::FormValidator::Constraints

       o   Data::FormValidator::Filters::Image - shrink incoming image uploads

AUTHOR
       Author:     Francis J. Lacoste <francis.lacoste@iNsu.COM>
       Maintainer: Mark Stosberg <mark@summersault.com>

COPYRIGHT
       Copyright (c) 1999,2000 iNsu Innovations Inc. All rights reserved.

       This program is free software; you can redistribute it and/or modify
       it under the same terms as perl itself.

perl v5.14.2                       2011-11-25      Data::FormValidator::Filters(3pm)
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.