Need to prepare a CSV table with inputs from multiple files


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Need to prepare a CSV table with inputs from multiple files
# 1  
Old 11-02-2019
Need to prepare a CSV table with inputs from multiple files

Hi,

We have thousands of Video files and we need to prepare a table with few of it's parameters/values from their Mediainfo output. Here I have 3 sample mediainfo output text files named Vid1, Vid2 & Vid3 so on... (such we have thousands) and each file has exactly 3 lines.

Code:
$ ls
Required_para.txt  Vid1  Vid2  Vid3

$ wc -l V*
   3 Vid1
   3 Vid2
   3 Vid3
   9 total

Code:
$ cat Vid1
Video : HEVC / Main 10@L4@Main / 5 750 kb/s / 1 920 pixels / 1 080 pixels 
Writing library : x265
Encoding settings                        : cpuid=1111039 / frame-threads=5 / numa-pools=32,0 / wpp / no-pmode / no-pme / no-psnr / no-ssim / log-level=2 / input-csp=1 / input-res=1920x816 / interlace=0 / total-frames=193561 / level-idc=0 / high-tier=1 / uhd-bd=0 / ref=5 / no-allow-non-conformance / no-repeat-headers / annexb / no-aud / no-hrd / info / hash=0 / no-temporal-layers / open-gop / min-keyint=23 / keyint=250 / gop-lookahead=0 / bframes=8 / b-adapt=2 / b-pyramid / bframe-bias=0 / rc-lookahead=100 / lookahead-slices=4 / scenecut=40 / radl=0 / no-splice / no-intra-refresh / ctu=32 / min-cu-size=8 / no-rect / no-amp / max-tu-size=16 / tu-inter-depth=1 / tu-intra-depth=1 / limit-tu=0 / rdoq-level=2 / dynamic-rd=0.00 / no-ssim-rd / signhide / no-tskip / nr-intra=0 / nr-inter=0 / no-constrained-intra / no-strong-intra-smoothing / max-merge=4 / limit-refs=3 / limit-modes / me=3 / subme=7 / merange=92 / temporal-mvp / weightp / weightb / no-analyze-src-pics / deblock=0:0 / no-sao / no-sao-non-deblock / rd=4 / no-early-skip / rskip / no-fast-intra / no-tskip-fast / no-cu-lossless / b-intra / no-splitrd-skip / rdpenalty=0 / psy-rd=2.00 / psy-rdoq=1.00 / no-rd-refine / no-lossless / cbqpoffs=0 / crqpoffs=0 / rc=crf / crf=23.3 / qcomp=0.60 / qpstep=4 / stats-write=0 / stats-read=0 / ipratio=1.40 / pbratio=1.30 / aq-mode=3 / aq-strength=1.00 / cutree / zone-count=0 / no-strict-cbr / qg-size=32 / no-rc-grain / qpmax=69 / qpmin=0 / no-const-vbv / sar=0 / overscan=0 / videoformat=5 / range=0 / colorprim=1 / transfer=1 / colormatrix=1 / chromaloc=0 / display-window=0cll=0,0 / min-luma=0 / max-luma=1023 / log2-max-poc-lsb=8 / vui-timing-info / vui-hrd-info / slices=1 / no-opt-qp-pps / no-opt-ref-list-length-pps / no-multi-pass-opt-rps / scenecut-bias=0.05 / no-opt-cu-delta-qp / no-aq-motion / no-hdr / no-hdr-opt / no-dhdr10-opt / no-idr-recovery-sei / analysis-reuse-level=8 / scale-factor=0 / refine-intra=0 / refine-inter=0 / refine-mv=0 / refine-ctu-distortion=0 / no-limit-sao / ctu-info=0 / no-lowpass-dct / refine-analysis-type=0 / copy-pic=1 / max-ausize-factor=1.0 / no-dynamic-refine / no-single-sei / no-hevc-aq / no-svt / no-field / qp-adaptation-range=1.00

$ cat Vid2
Video : HEVC / Main 10@L4@Main / 3 711 kb/s / 1 920 pixels / 804 pixels 
Writing library : x265
Encoding settings : cpuid=1111039 / frame-threads=3 / numa-pools=8 / wpp / no-pmode / no-pme / no-psnr / no-ssim / log-level=2 / input-csp=1 / input-res=1920x804 / interlace=0 / total-frames=155305 / level-idc=0 / high-tier=1 / uhd-bd=0 / ref=4 / no-allow-non-conformance / no-repeat-headers / annexb / no-aud / no-hrd / info / hash=0 / no-temporal-layers / open-gop / min-keyint=23 / keyint=250 / gop-lookahead=0 / bframes=8 / b-adapt=2 / b-pyramid / bframe-bias=0 / rc-lookahead=25 / lookahead-slices=4 / scenecut=40 / radl=0 / no-splice / no-intra-refresh / ctu=64 / min-cu-size=8 / rect / no-amp / max-tu-size=32 / tu-inter-depth=1 / tu-intra-depth=1 / limit-tu=0 / rdoq-level=2 / dynamic-rd=0.00 / no-ssim-rd / signhide / no-tskip / nr-intra=0 / nr-inter=0 / no-constrained-intra / strong-intra-smoothing / max-merge=3 / limit-refs=3 / limit-modes / me=3 / subme=3 / merange=57 / temporal-mvp / no-frame-dup / no-hme / weightp / no-weightb / no-analyze-src-pics / deblock=0:0 / no-sao / no-sao-non-deblock / rd=4 / selective-sao=0 / no-early-skip / rskip / no-fast-intra / no-tskip-fast / no-cu-lossless / no-b-intra / no-splitrd-skip / rdpenalty=0 / psy-rd=2.00 / psy-rdoq=1.00 / no-rd-refine / no-lossless / cbqpoffs=0 / crqpoffs=0 / rc=crf / crf=18.0 / qcomp=0.60 / qpstep=4 / stats-write=0 / stats-read=0 / ipratio=1.40 / pbratio=1.30 / aq-mode=3 / aq-strength=1.00 / cutree / zone-count=0 / no-strict-cbr / qg-size=32 / no-rc-grain / qpmax=69 / qpmin=0 / no-const-vbv / sar=0 / overscan=0 / videoformat=5 / range=0 / colorprim=2 / transfer=2 / colormatrix=2 / chromaloc=0 / display-window=0 / cll=0,0 / min-luma=0 / max-luma=1023 / log2-max-poc-lsb=8 / vui-timing-info / vui-hrd-info / slices=1 / no-opt-qp-pps / no-opt-ref-list-length-pps / no-multi-pass-opt-rps / scenecut-bias=0.05 / no-opt-cu-delta-qp / aq-motion / no-hdr / no-hdr-opt / no-dhdr10-opt / no-idr-recovery-sei / analysis-reuse-level=5 / scale-factor=0 / refine-intra=0 / refine-inter=0 / refine-mv=1 / refine-ctu-distortion=0 / no-limit-sao / ctu-info=0 / no-lowpass-dct / refine-analysis-type=0 / copy-pic=1 / max-ausize-factor=1.0 / no-dynamic-refine / no-single-sei / no-hevc-aq / no-svt / no-field / qp-adaptation-range=1.00

$ cat Vid3
Video : HEVC / Main 10@L4@Main / 7 903 kb/s / 1 920 pixels / 800 pixels 
Writing library : x265
Encoding settings : cpuid=1111039 / frame-threads=3 / numa-pools=8 / wpp / no-pmode / no-pme / no-psnr / no-ssim / log-level=2 / input-csp=1 / input-res=1920x800 / interlace=0 / total-frames=118448 / level-idc=0 / high-tier=1 / uhd-bd=0 / ref=4 / no-allow-non-conformance / no-repeat-headers / annexb / no-aud / no-hrd / info / hash=0 / no-temporal-layers / open-gop / min-keyint=23 / keyint=250 / gop-lookahead=0 / bframes=8 / b-adapt=2 / b-pyramid / bframe-bias=0 / rc-lookahead=25 / lookahead-slices=4 / scenecut=40 / radl=0 / no-splice / no-intra-refresh / ctu=64 / min-cu-size=8 / rect / no-amp / max-tu-size=32 / tu-inter-depth=1 / tu-intra-depth=1 / limit-tu=0 / rdoq-level=2 / dynamic-rd=0.00 / no-ssim-rd / signhide / no-tskip / nr-intra=0 / nr-inter=0 / no-constrained-intra / strong-intra-smoothing / max-merge=3 / limit-refs=3 / limit-modes / me=3 / subme=3 / merange=57 / temporal-mvp / no-frame-dup / no-hme / weightp / no-weightb / no-analyze-src-pics / deblock=0:0 / no-sao / no-sao-non-deblock / rd=4 / selective-sao=0 / no-early-skip / rskip / no-fast-intra / no-tskip-fast / no-cu-lossless / no-b-intra / no-splitrd-skip / rdpenalty=0 / psy-rd=2.00 / psy-rdoq=1.00 / no-rd-refine / no-lossless / cbqpoffs=0 / crqpoffs=0 / rc=crf / crf=18.0 / qcomp=0.60 / qpstep=4 / stats-write=0 / stats-read=0 / ipratio=1.40 / pbratio=1.30 / aq-mode=3 / aq-strength=1.00 / cutree / zone-count=0 / no-strict-cbr / qg-size=32 / no-rc-grain / qpmax=69 / qpmin=0 / no-const-vbv / sar=0 / overscan=0 / videoformat=5 / range=0 / colorprim=2 / transfer=2 / colormatrix=2 / chromaloc=0 / display-window=0 / cll=0,0 / min-luma=0 / max-luma=1023 / log2-max-poc-lsb=8 / vui-timing-info / vui-hrd-info / slices=1 / no-opt-qp-pps / no-opt-ref-list-length-pps / no-multi-pass-opt-rps / scenecut-bias=0.05 / no-opt-cu-delta-qp / aq-motion / no-hdr / no-hdr-opt / no-dhdr10-opt / no-idr-recovery-sei / analysis-reuse-level=5 / scale-factor=0 / refine-intra=0 / refine-inter=0 / refine-mv=1 / refine-ctu-distortion=0 / no-limit-sao / ctu-info=0 / no-lowpass-dct / refine-analysis-type=0 / copy-pic=1 / max-ausize-factor=1.0 / no-dynamic-refine / no-single-sei / no-hevc-aq / no-svt / no-field / qp-adaptation-range=1.00

The file Required_para.txt has list of parameters (and their values) to be extracted into the output file. We might change this file to add/remove parameters.

Code:
$ cat Required_para.txt
input-res
crf
aq-mode
strong-intra-smoothing
bframes
rc-lookahead
me
subme
merange
deblock
selective-sao
psy-rd
psy-rdoq
ctu
rdoq-level
max-merge
ref
max-tu-size
transfer
colormatrix
colorprim

The output's first line is the list of parameters/values required (comma separated). Each Vid's medianfo will fit in each row with their parameter's values. The 3rd column BitRate is the value of first line's "kb/s" entry of VidX file. 7th column strong-intra-smoothing value should be either 0 (if input has no-strong-intra-smoothing ) OR 1 (if input has strong-intra-smoothing ).

Output file should be like below. I filled-in only few columns.
Code:
$ cat Output_Sample
Sl#,VideoName,BitRate,input-res,crf,aq-mode,strong-intra-smoothing,bframes,rc-lookahead,me,subme,merange,deblock,selective-sao,psy-rd,psy-rdoq,ctu,rdoq-level,max-merge,ref,max-tu-size,transfer,colormatrix,colorprim
1,Vid1,5750,1920x816,23.3,3,0,8
2,Vid2,3711,1920x804,18.0,3,1,8
3,Vid3,7903,1920x804,18.0,3,1,8

$

Thank you!
# 2  
Old 11-03-2019
Hi prvnrk,

With 252 posts here you should know that UNIX.com is not a free script writing service.

You need to post your code what you wrote to accomplish your objective(s), the sample input and the required output, and any error messages.

You seemed to have done all of the above, EXCEPT you did not provide any code you wrote yourself to solve this problem.

Please, show your work and do your own work and ask for help. Do not ask others to do your work for you.
# 3  
Old 11-03-2019
Your desired output doesn't seem to match your requirements / parameters / heading?
# 4  
Old 11-03-2019
@Neo - Thanks, Always I tried and shared my attempts here (if you look at my previous posts) but this time also I tried and gave it up after 80% completion. Now I completed my script which I don't like (as always) as it's messy. This does my job but not perfect.

@RudiC - Thanks I missed BitRate field. now I fixed.

Please see my Script and other files, Appreciate a Better version. Thannks!!

Code:
$ cat Final_Format.sh
#!/usr/bin/bash
cd /home/admin/HEVC1
echo "Sl#,VideoName,BitRate,`cat Required_para.txt | paste -sd "," -`" >/home/admin/HEVC1/MedInfo_Output.csv
countt=0
for VidX in Vid*
do
        countt=`expr $countt + 1`
        echo "${countt}" >/tmp/Req_Values_Vertical.txt
        echo "${VidX}" >>/tmp/Req_Values_Vertical.txt
        VidBitRate=`grep "HEVC " ${VidX} |grep Video |awk -F"/" '{print $3}' |sed 's/[^0-9]*//g'`
        echo "${VidBitRate}" >>/tmp/Req_Values_Vertical.txt

        grep "Encoding settings" ${VidX} |sed 's/\//\n/g' >/tmp/All_Values.txt
        grep "Encoding settings" ${VidX} |sed 's/\//\n/g' |grep = >/tmp/All_Values2.txt

                for param in `cat Required_para.txt` ; do
                grep "${param}"= /tmp/All_Values2.txt|head -1 >>/tmp/Req_Values_Vertical.txt
                done

        sed -i.bak 's/ //g' /tmp/Req_Values_Vertical.txt

        if [ `grep no-strong-intra-smoothing ${VidX} |wc -l` -gt 0 ]; then
        sed -i.bakk '/^aq-mode=.*/a strong-intra-smoothing=0' /tmp/Req_Values_Vertical.txt
        else
        sed -i.bakk '/^aq-mode=.*/a strong-intra-smoothing=1' /tmp/Req_Values_Vertical.txt
        fi
		
		if [ `grep selective-sao ${VidX} |wc -l` -eq 0 ]; then
        sed -i.bakkk '/^colorprim=.*/a selective-sao=NA' /tmp/Req_Values_Vertical.txt
        fi

        awk -F"=" '{print $NF}' /tmp/Req_Values_Vertical.txt >/tmp/Req_ValuesOnly_Vertical.txt
        cat /tmp/Req_ValuesOnly_Vertical.txt | paste -sd "," - >/tmp/Req_ValuesOnly_Horizontal.txt
        cat /tmp/Req_ValuesOnly_Horizontal.txt >>/home/admin/HEVC1/MedInfo_Output.csv
done

Code:
$ pwd
/home/admin/HEVC1
$ ./Final_Format.sh
$ cat MedInfo_Output.csv
Sl#,VideoName,BitRate,input-res,crf,aq-mode,strong-intra-smoothing,bframes,rc-lookahead,me,subme,merange,deblock,psy-rd,psy-rdoq,ctu,rdoq-level,max-merge,ref,max-tu-size,transfer,colormatrix,colorprim,selective-sao
1,Vid1,5750,1920x816,23.3,3,0,8,100,3,7,92,0:0,2.00,1.00,32,2,4,5,16,1,1,1,NA
2,Vid2,3711,1920x804,18.0,3,1,8,25,3,3,57,0:0,2.00,1.00,64,2,3,4,32,2,2,2,0
3,Vid3,7903,1920x800,18.0,3,1,8,25,3,3,57,0:0,2.00,1.00,64,2,3,4,32,2,2,2,0
$

$
Code:
$ cat Required_para.txt
input-res
crf
aq-mode
strong-intra-smoothing
bframes
rc-lookahead
me
subme
merange
deblock
psy-rd
psy-rdoq
ctu
rdoq-level
max-merge
ref
max-tu-size
transfer
colormatrix
colorprim
selective-sao


Last edited by prvnrk; 11-03-2019 at 04:14 PM..
# 5  
Old 11-03-2019
Try
Code:
awk -F/ '
NR == 1         {printf "Sl#,VideoName,BitRate"
                }
FNR == NR       {PRM[NR] =  "( |no-)" $0 "=*[^ /]*"
                 MX = NR
                 printf ",%s", $0
                 next
                }
!HDFIN          {printf RS
                 HDFIN = 1
                }
FNR == 1        {gsub (" ", "", $3)
                 printf "%d,%s,%d", ++FCNT, FILENAME, $3+0
                }
FNR == 3        {for (i=1; i<=MX; i++)  {match ($0, PRM[i])
                                         if (2 == split (substr ($0, RSTART, RLENGTH), TMP, "=")) printf ",%s", TMP[2]
                                           else printf ",%d", (! (TMP[1] ~ /^no/))
                                        }
                 printf RS
                }

' Required_para.txt OFS=, vid?
Sl#,VideoName,BitRate,input-res,crf,aq-mode,strong-intra-smoothing,bframes,rc-lookahead,me,subme,merange,deblock,selective-sao,psy-rd,psy-rdoq,ctu,rdoq-level,max-merge,ref,max-tu-size,transfer,colormatrix,colorprim
1,vid1,5750,1920x816,23.3,3,0,8,100,3,7,92,0:0,1,2.00,1.00,32,2,4,5,16,1,1,1
2,vid2,3711,1920x804,18.0,3,1,8,25,3,3,57,0:0,0,2.00,1.00,64,2,3,4,32,2,2,2
3,vid3,7903,1920x800,18.0,3,1,8,25,3,3,57,0:0,0,2.00,1.00,64,2,3,4,32,2,2,2


Last edited by RudiC; 11-03-2019 at 05:20 PM..
This User Gave Thanks to RudiC For This Post:
# 6  
Old 11-03-2019
@RudiC - Many thanks!! you are a god of awk Smilie

The only issue with your script is that if MediaInfo of a video doesn't have entry of selective-sao, it displays 1 value for it. Could we set it to NA please?
Code:
$ grep selective-sao Vid1

$

Needless to say that your script runs million times faster than mine because of the efficient use of awk. I tried several times to learn awk in depth but never found it easy. Please suggest some material to learn it easy way.


Thanks again!
# 7  
Old 11-03-2019
NA was not specified. Try
Code:
awk -F/ '
NR == 1         {printf "Sl#,VideoName,BitRate"
                }
FNR == NR       {PRM[NR] =  "( |no-)" $0 "=*[^ /]*"
                 MX = NR
                 printf ",%s", $0
                 next
                }
!HDFIN          {printf RS
                 HDFIN = 1
                }
FNR == 1        {gsub (" ", "", $3)
                 printf "%d,%s,%d", ++FCNT, FILENAME, $3+0
                }
FNR == 3        {for (i=1; i<=MX; i++)  {match ($0, PRM[i])
                                         n = split (substr ($0, RSTART, RLENGTH), TMP, "=")
                                         if (n == 2)            printf ",%s", TMP[2]
                                           else if (n == 1)     printf ",%d", (! (TMP[1] ~ /^no/))
                                                  else          printf ",NA"
                                        }
                 printf RS
                }

' Required_para.txt OFS=, vid?
Sl#,VideoName,BitRate,input-res,crf,aq-mode,strong-intra-smoothing,bframes,rc-lookahead,me,subme,merange,deblock,selective-sao,psy-rd,psy-rdoq,ctu,rdoq-level,max-merge,ref,max-tu-size,transfer,colormatrix,colorprim,selective-sao
1,vid1,5750,1920x816,23.3,3,0,8,100,3,7,92,0:0,NA,2.00,1.00,32,2,4,5,16,1,1,1,NA
2,vid2,3711,1920x804,18.0,3,1,8,25,3,3,57,0:0,0,2.00,1.00,64,2,3,4,32,2,2,2,0
3,vid3,7903,1920x800,18.0,3,1,8,25,3,3,57,0:0,0,2.00,1.00,64,2,3,4,32,2,2,2,0


No easy way. Read and exercise. What I did to learn awk is solve problems e.g. found in these fora and analyse / understand solutions given, and use the man page. I have to admit I have a history in other programming languages so wasn't a true novice.
This User Gave Thanks to RudiC For This Post:
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. UNIX for Beginners Questions & Answers

Export Oracle multiple tables to multiple csv files using UNIX shell scripting

Hello All, just wanted to export multiple tables from oracle sql using unix shell script to csv file and the below code is exporting only the first table. Can you please suggest why? or any better idea? export FILE="/abc/autom/file/geo_JOB.csv" Export= `sqlplus -s dev01/password@dEV3... (16 Replies)
Discussion started by: Hope
16 Replies

2. Shell Programming and Scripting

CSV joining and checking multiple files

Hello, For our work we use several scripts to gather/combine data for use in our webshop. Untill now we did not had any problems but since a couple days we noticed some mismatches between imports. It happened that several barcodes where matched even though it was a complete other product. Of... (19 Replies)
Discussion started by: SDohmen
19 Replies

3. UNIX for Dummies Questions & Answers

Load multiple files into a table

Hi, I need to load data from two files to a single table. My requirement is that I get two files in which a few column data are manadatory. These files are identified based on the file name. For example, I have two files ABCFile and BCDFile. ABCFile has mandatory data in column 3 and 4... (0 Replies)
Discussion started by: reshma15193
0 Replies

4. Shell Programming and Scripting

Insterting column in csv from multiple files

Hello, I have a spec file that contains a lot of strings that looks like this: PC DELL OptiPlex 3010MT i3 3220/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD Intel i3 3220 (Dual Core, 3.30GHz, 3MB, w/ HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x),... (9 Replies)
Discussion started by: g9100
9 Replies

5. Shell Programming and Scripting

Split a .csv File into Multiple Files

Hi guys, I have a requirement where i need to split a .csv file into multiple files. Say for example i have data.csv file and i have splitted that into multiple files based on some conditions i.e first file should have 100, last file 50 and other files 1000 each. Am passing the values in... (2 Replies)
Discussion started by: azherkn3
2 Replies

6. Shell Programming and Scripting

Find values in multiple csv files

Hi, I'd like to find the values of certain fields in multiple csv files stored in 1 directory based upon an input search string. An fgrep returns the complete record, I only want certain fields. Thanks in advance for your help. Perry (6 Replies)
Discussion started by: biscayne
6 Replies

7. Shell Programming and Scripting

reading information from a table and apply a command on multiple files

Hey gyuz, I wanna calculate the number of mapped reads of a bam file in a region of interest. I used this code to do so : samtools view input.bam chrname:region1 > region1.txt This will store all the reads from given bam file within the region of interest in region1.txt Now I have... (5 Replies)
Discussion started by: @man
5 Replies

8. Shell Programming and Scripting

Field validations in multiple files CSV

Hi, I am regular reader of this forum. My advanced thanks to everyone. Below given are the sample files INDATA (Main data) Fild1Çfld2Çfld3….. Fild1Çfld2Çfld3….. Fild1Çfld2Çfld3….. Fild1Çfld2Çfld3….. Fild1Çfld2Çfld3….. . . N records (140000) eg GRPDATA (Reference file) (2 Replies)
Discussion started by: hyperion.krish
2 Replies

9. Shell Programming and Scripting

Copying multiple csv files

Hi, I have mutiple csv files at server1 at /apps/test/data. I needed a script that would copy these csv files from server1 at /usr/data, put them in server2,archive the earlier files that were present in server2 before removing those already present. Kindly help. (2 Replies)
Discussion started by: Alok Ranjan
2 Replies

10. Shell Programming and Scripting

How to Pull out multiple files from DB table and redirect all those files to a differetn directory?

Hi everyone!! I have a database table, which has file_name as one of its fields. Example: File_ID File_Name Directory Size 0001 UNO_1232 /apps/opt 234 0002 UNO_1234 /apps/opt 788 0003 UNO_1235 /apps/opt 897 0004 UNO_1236 /apps/opt 568 I have to... (3 Replies)
Discussion started by: ss3944
3 Replies
Login or Register to Ask a Question