...
So what would be the appropriate way of calling E4:G4:F4 -> V4
...
You form 'E4' first and get its cell value.
Then you form 'G4' and get its cell value.
And then do the same for 'F4'.
Then you form "Scores" namedtuple using the cell values of 'E4', 'G4' and 'F4'.
Then you check if that "Scores" namedtuple is a key in dictionary "dpos".
If it is, then get its value from "dpos" and paste in cell 'V4'; otherwise paste the "Unknown" string in 'V4'.
So, given:
...
...
What do you mean by point 2 ? Will.it then be
dpos = get_text_data(txt_filename)
That is closely related to point 1.
Were you able to implement point 1 first?
The "function call" mentioned in point 1 looks the same as "function signature" mentioned in point 2.
Or looking at it another way; in the following statement:
Code:
dpos = get_text_data(txt_filename)
(1) what is the name of the function? and
(2) what is the name of the parameter?
Now, if you remove the parameter, then what will the above statement look like? (Have a look at the tutorial; it tells you what a function parameter is.)
This User Gave Thanks to durden_tyler For This Post:
Thank you. Its just that this language is quite new and I am happy to learn it (just a bit slow!) So based on all your inputs, I have modified the code a bit more, now it prints the correct score based on 3 columns on the command line but
a) does not saves it to the excel sheet and
b) it does not read beyond row 4
So the code now is
Code:
#!/usr/bin/python
import sys
import os
from openpyxl import load_workbook
from datetime import datetime
from pandas import read_table
import csv
from collections import namedtuple
import csv
# Variables
sheet_directory = r'/home/test'
def process_xl_sheets():
dict_pos = {}
Scores = namedtuple("Scores", ["POS", "ALT", "REF"])
first_line = True
with open('/home/test/scores.txt') as txt_filename:
for line in txt_filename:
if first_line:
first_line = False
continue
line = line.rstrip('\n')
x = line.split('\t')
cpos = Scores(POS=x[0], ALT=x[2], REF=x[1])
dict_pos[cpos] = x[3]
#print dict_pos ##prints
for sheet_root, sheet_dirs, sheet_files in os.walk(sheet_directory):
for sheet_file in sheet_files:
if sheet_file.endswith('.xlsx'):
#print(sheet_file) ##prints
sheet_xl_file = os.path.join(sheet_root, sheet_file)
wb = load_workbook(sheet_xl_file, data_only=True)
ws = wb.get_sheet_by_name('raw_data')
pos_col_no = 'E'
alt_col_no = 'G'
ref_col_no = 'F'
score_col_no = 'V'
NG = score_col_no
row_no = 4
# cell = ws[pos_col_no + str(row_no)]
cell_pos = ws[pos_col_no + str(row_no)]
cell_ref = ws[ref_col_no + str(row_no)]
cell_alt = ws[alt_col_no + str(row_no)]
NG = ws[score_col_no + str(row_no)] = 'Unknown_' + datetime.now().strftime("%B")
cpos = Scores(POS=cell_pos.value, ALT=cell_alt.value, REF=cell_ref.value)
print cpos ##prints only row E4:G4:F4 and not all
#print dict_pos ##prints all the scores
if (dict_pos.has_key(cpos)):
NG = dict_pos[cpos]
print dict_pos[cpos] ##prints the actual score
else:
NG = 'Unknown_' + datetime.now().strftime("%B") + datetime.now().strftime("%Y")
row_no += 1
cpos = Scores(POS=cell_pos.value, ALT=cell_alt.value, REF=cell_ref.value)
wb.save(sheet_xl_file)
process_xl_sheets()
and the command line results are
Code:
/usr/bin/python2.7 /home/test/new_dict.py
##reads only row 4
Scores(POS=u'73', ALT=u'G', REF=u'A')
##prints all values from 'score.txt'{Scores(POS='2171', ALT='C', REF='T'): '5', Scores(POS='73', ALT='G', REF='A'): '11', Scores(POS='114', ALT='T', REF='C'): '1', Scores(POS='2080', ALT='C', REF='T'): '4', Scores(POS='1189', ALT='C', REF='T'): '1'}
##returns the value from score.txt against row 4 of xlsx but does not save it to excel
11 Process finished with exit code 0
...
now it prints the correct score based on 3 columns on the command line but
a) does not saves it to the excel sheet and
b) it does not read beyond row 4
...
...
b) You did not tell Python to go beyond row 4! That's why it did not read beyond row 4.
The earlier iterations of this program had the "while" statement in them, which you removed. That "while" statement was for going beyond row 4.
What you want to do is something like the following pseudo-code:
Code:
Go to row 4.
Get the values of E4, G4, F4 and cpos.
If cpos exists as a key in dict_pos, then print the value otherwise print the "unknown" string in cell V4.
Go to row 5.
Get the values of E5, G5, F5 and cpos.
If cpos exists as a key in dict_pos, then print the value otherwise print the "unknown" string in cell V5.
Go to row 6.
Get the values of E6, G6, F6 and cpos.
If cpos exists as a key in dict_pos, then print the value otherwise print the "unknown" string in cell V6.
Go to row 7.
Get the values of E7, G7, F7 and cpos.
If cpos exists as a key in dict_pos, then print the value otherwise print the "unknown" string in cell V7.
...
Keep doing this until you reach a row whose E, G and F columns are empty.
This User Gave Thanks to durden_tyler For This Post:
I did use the while statement before but that just printed the value of the score continously without stopping. This is what I did
Code:
NG = ws[score_col_no + str(row_no)] = 'Unknown_' + datetime.now().strftime("%B")
cpos = Scores(POS=cell_pos.value, ALT=cell_alt.value, REF=cell_ref.value)
print cpos ##prints only row E4:G4:F4 and not all
while cpos:
if (dict_pos.has_key(cpos)):
NG = dict_pos[cpos]
print dict_pos[cpos] ##prints the actual score
else:
NG = 'Unknown_' + datetime.now().strftime("%B") + datetime.now().strftime("%Y")
row_no += 1
cpos = Scores(POS=cell_pos.value, ALT=cell_alt.value, REF=cell_ref.value)
wb.save(sheet_xl_file)
process_xl_sheets()
and there was no value on the excel sheet saved and also the code did not stop
Code:
/usr/bin/python2.7 /home/test/new_dict.py
##reads only row 4
Scores(POS=u'73', ALT=u'G', REF=u'A')
##prints all values from 'score.txt'
{Scores(POS='2171', ALT='C', REF='T'): '5', Scores(POS='73', ALT='G', REF='A'): '11', Scores(POS='114', ALT='T', REF='C'): '1', Scores(POS='2080', ALT='C', REF='T'): '4', Scores(POS='1189', ALT='C', REF='T'): '1'}
##returns the value from score.txt against row 4 of xlsx but does not save it to excel 1111
11
11
11
11
1111
11
....
...
...
I did use the while statement before but that just printed the value of the score continously without stopping. ...
That's because the "while" statement went into what is called an "infinite loop".
That's because "cpos" is always True everytime it is checked.
[ It means that "cpos" always has a value everytime it is checked and that value is Scores(POS='73', ALT='G', REF='A') ]
You want "cpos" to change in every iteration of the "while" loop.
And you want the row_no to increment in every iteration of the "while" loop.
You are incrementinng row_no and changing cpos within the else branch.
You need to increment row_no and change cpos within the while statement.
In Python, the indentation of the statement decides what is done "within" what.
---------- Post updated at 02:12 PM ---------- Previous update was at 10:33 AM ----------
Here's how your code is being executed; I have added line numbers at the left to help track the path of execution:
Code:
1 NG = ws[score_col_no + str(row_no)] = 'Unknown_' + datetime.now().strftime("%B")
2 cpos = Scores(POS=cell_pos.value, ALT=cell_alt.value, REF=cell_ref.value)
3 print cpos ##prints only row E4:G4:F4 and not all
4 while cpos:
5 if (dict_pos.has_key(cpos)):
6 NG = dict_pos[cpos]
7 print dict_pos[cpos] ##prints the actual score
8 else:
9 NG = 'Unknown_' + datetime.now().strftime("%B") + datetime.now().strftime("%Y")
10 row_no += 1
11 cpos = Scores(POS=cell_pos.value, ALT=cell_alt.value, REF=cell_ref.value)
12 wb.save(sheet_xl_file)
Line 1 => Set the value of NG to 'Unknown_June'.
Line 2 => Set the value of cpos to Scores(POS='73', ALT='G', REF='A')
Line 3 => Print the value of cpos
Line 4 => Is cpos True? Yes it is.
Line 5 => Does dict_pos have the key cpos = Scores(POS='73', ALT='G', REF='A')? Yes it does.
Line 6 => Set the value of NG to dict_pos[cpos] = 11
Line 7 => Print the value of dict_pos[cpos] = 11
Line 8 => Disregard the "else" branch and all 3 statements within it (lines 9, 10, 11) because "if" branch was processed. The control goes back to Line 4.
Line 4 => Is cpos True? Yes it is.
Line 5 => Does dict_pos have the key cpos = Scores(POS='73', ALT='G', REF='A')? Yes it does.
Line 6 => Set the value of NG to dict_pos[cpos] = 11
Line 7 => Print the value of dict_pos[cpos] = 11
Line 8 => Disregard the "else" branch and all 3 statements within it (lines 9, 10, 11) because "if" branch was processed. The control goes back to Line 4.
Line 4 => Is cpos True? Yes it is.
Line 5 => Does dict_pos have the key cpos = Scores(POS='73', ALT='G', REF='A')? Yes it does.
Line 6 => Set the value of NG to dict_pos[cpos] = 11
Line 7 => Print the value of dict_pos[cpos] = 11
Line 8 => Disregard the "else" branch and all 3 statements within it (lines 9, 10, 11) because "if" branch was processed. The control goes back to Line 4.
Line 4 => Is cpos True? Yes it is.
Line 5 => Does dict_pos have the key cpos = Scores(POS='73', ALT='G', REF='A')? Yes it does.
Line 6 => Set the value of NG to dict_pos[cpos] = 11
Line 7 => Print the value of dict_pos[cpos] = 11
Line 8 => Disregard the "else" branch and all 3 statements within it (lines 9, 10, 11) because "if" branch was processed. The control goes back to Line 4.
Line 4 => Is cpos True? Yes it is.
Line 5 => Does dict_pos have the key cpos = Scores(POS='73', ALT='G', REF='A')? Yes it does.
Line 6 => Set the value of NG to dict_pos[cpos] = 11
Line 7 => Print the value of dict_pos[cpos] = 11
Line 8 => Disregard the "else" branch and all 3 statements within it (lines 9, 10, 11) because "if" branch was processed. The control goes back to Line 4.
...
...
...
And so on, ad infinitum...
...
...
You have to give a chance for Line 10 and 11 to execute.
This User Gave Thanks to durden_tyler For This Post:
I'm not quite sure what is going wrong, I have tried this as well
Code:
while cpos:
if (dict_pos.has_key(cpos)):
NG = dict_pos[cpos]
print dict_pos[cpos]
else:
NG = 'Unknown_' + datetime.now().strftime("%B") + datetime.now().strftime("%Y")
row_no += 1
cpos = Scores(POS=cell_pos.value, ALT=cell_alt.value, REF=cell_ref.value)
wb.save(sheet_xl_file)
it still goes into an infinite loop.
Also, why does it not save '11' in the excel workbook for the first row and instead prints 'unknown' even though it shows the correct score on the terminal?
I'm not quite sure what is going wrong, I have tried this as well
Code:
while cpos:
if (dict_pos.has_key(cpos)):
NG = dict_pos[cpos]
print dict_pos[cpos]
else:
NG = 'Unknown_' + datetime.now().strftime("%B") + datetime.now().strftime("%Y")
row_no += 1
cpos = Scores(POS=cell_pos.value, ALT=cell_alt.value, REF=cell_ref.value)
wb.save(sheet_xl_file)
it still goes into an infinite loop.
Also, why does it not save '11' in the excel workbook for the first row and instead prints 'unknown' even though it shows the correct score on the terminal?
Print the value of "cpos" inside the "while" loop.
It is the same every time.
That's the reason for the infinite loop.
You need the value of "cpos" to change in every iteration of the loop.
This User Gave Thanks to durden_tyler For This Post:
Source Code of the original script is down below please run the script and try to solve this problem
this is my data and I want it column wise
2019-03-20 13:00:00:000
2019-03-20 15:00:00:000
1
Operating System
LAB
0
1
1
1
1
1
1
1
1
1
0
1 (5 Replies)
Hi experts -
I'm relatively new to python, but I have an requirement to automate getting a file from a WebLib server using an API.
The file I'm requesting from this sever is an excel spreadsheet (.xlsx).
I get a valid response back via an xml doc from the server.
In this xml file I get... (8 Replies)
Hi,
I have a requirement to append = in particular row in csv file. Data in csv is as follow:
row1,a,a,a
row2,b,b,b
row3,c,c,c
row4,d,d,d
csv should be modified at row3 and no. of columns are not fixed but rows are. output should be as:
row1,a,a,a
row2,b,b,b
row3,=c,=c,=c... (2 Replies)
Hi All,
Input.txt
KGO
Id "003"
..........
..........
Par "CPara"
BIN RECGET
Name "DIR_PATH"
Prompt "DIR_PATH"
END RECGET
KGO
............
..........
...............
KGO
Id "077"
..........
.......... (7 Replies)
I have a text file where I want to append a column of numbers in ascending orders.
Input:
57 abc
25 def
32 ghi
54 jkl
Output:57 abc
57 abc 1
25 def 2
32 ghi 3
54 jkl 4
How do I go about doing that? Thanks! (11 Replies)
Hi, i want to add another column to existing files containing strings and need to have the final output as a csv file. i have quite a number of files, each with varying number of rows and i need to append the string "test" for all the valid rows for each file. my sample raw files looks like this... (8 Replies)
Hi ,
I have a file with a running sequence number. I need to append a date value mmdd format on to the first column.
for e.g.: The file contains records as
001 abc
002 cde
003 edf
004 fgh
005 hik
The output should be
1111001 abc
1111002 cde
1111003 edf
1111004 ... (1 Reply)
Hi all,
I have two files with the same number of lines
the first file is a.dat and looks like
0.000 1.000
1.000 2.000
...
the fields are tab separated
the second file is b.dat and looks like
1.2347 0.546
2.3564 0.321
...
the fields are tab separated
I would like to have a file c.dat... (4 Replies)
Hi , I have the below file with 6 columns.I want to append 'File1' as the 1 column to the file. i have the sample code .It is not working . can u please correct this or make new one .....
awk 'print {'File1',$1,$2,$3,$4,$5,$6}' Source_File> Result_File
Source_File:... (6 Replies)
It appears that this has been asked and answered in similar fashions previously, but I am still unsure how to approach this.
I have two files containing user information:
fileA
ttim:/home/ttim:Tiny Tim:632
ppinto:/home/ppinto:Pam Pinto:633
fileB
ttim:xkfgjkd*&#^jhdfh... (3 Replies)