Read .xlsx and text files in a loop using openpyxl


 
Thread Tools Search this Thread
Top Forums Programming Read .xlsx and text files in a loop using openpyxl
# 1  
Old 06-02-2017
Read .xlsx and text files in a loop using openpyxl

I have a list of .xlsx files (names 1.xlsx, 2.xlsx etc) in a directory, on which I need to insert data from its corresponding text file (named 1.txt, 2.txt etc) on the second worksheet named 'Filtered' and save it. The code I am trying is

Code:
#!/usr/bin/python

import os
from openpyxl.reader.excel import load_workbook
from openpyxl.drawing.image import Image
import csv

directoryPath = r'/REPORTS/'
os.chdir(directoryPath)
folder_list = os.listdir(directoryPath)
for folders, sub_folders, file in os.walk(directoryPath):  
    for name in file:
        if name.endswith(".xlsx"):
            filename = os.path.join(folders, name)
            wb = load_workbook(filename, data_only=True)
            ws = wb.get_sheet_by_name('Filtered')

##here is where the text files are stored
            directory = os.path.join("/REPORTS/temp", "path")
            for root,dirs,files in os.walk(directory):
               for file in files:
                    if file.endswith(".txt"):
                        f = open(textfile, 'r')
                        reader = csv.reader(f, delimiter='\t')
                        for row in reader:
                           ws.append(row)
                           wb.save(filename)

Although the code goes through all excel files, it is not going through the text files to write its corresponding result onto the 2nd worksheet of each excel file. Any suggestions to edit the code ?
# 2  
Old 06-02-2017
Quote:
Originally Posted by nans
...
Code:
#!/usr/bin/python

import os
from openpyxl.reader.excel import load_workbook
from openpyxl.drawing.image import Image
import csv

directoryPath = r'/REPORTS/'
os.chdir(directoryPath)
folder_list = os.listdir(directoryPath)
for folders, sub_folders, file in os.walk(directoryPath):  
    for name in file:
        if name.endswith(".xlsx"):
            filename = os.path.join(folders, name)
            wb = load_workbook(filename, data_only=True)
            ws = wb.get_sheet_by_name('Filtered')

##here is where the text files are stored
            directory = os.path.join("/REPORTS/temp", "path")
            for root,dirs,files in os.walk(directory):
               for file in files:
                    if file.endswith(".txt"):
                        f = open(textfile, 'r')
                        reader = csv.reader(f, delimiter='\t')
                        for row in reader:
                           ws.append(row)
                           wb.save(filename)

Although the code goes through all excel files, it is not going through the text files to write its corresponding result onto the 2nd worksheet of each excel file. Any suggestions to edit the code ?
You are not opening the correct text file. That's the reason your script doesn't work.
Keep in mind that given the Excel file "1.xlsx", you want to open "1.txt".
Here, "1" is the keyword.
As long as the files "N.xlsx" and "N.txt" (where N = 1, 2, 3, ... etc.) are linked via "N", you can exploit this feature in your code as follows:

Code:
import os
from openpyxl.reader.excel import load_workbook
from openpyxl.drawing.image import Image
import csv

xl_directory = r'<path_to_Excel_directory>'
txt_directory = r'<path_to_text_directory>'

for xl_root, xl_dirs, xl_files in os.walk(xl_directory):
    for xl_file in xl_files:
        if xl_file.endswith('.xlsx'):
            xl_abs_file = os.path.join(xl_root, xl_file)
            wb = load_workbook(xl_abs_file, data_only = True)
            ws = wb.get_sheet_by_name('Filtered')
  
            # Now search for the corresponding text file.
            # Note: if xl_file = '1.xlsx', then xl_file.replace('xlsx', 'txt')
            # returns '1.txt' - the corresponding text file name. If this doesn't
            # hold true, then a different algorithm will be required.
            for txt_root, txt_dirs, txt_files in os.walk(txt_directory):
                for txt_file in txt_files:
                    if txt_file == xl_file.replace('xlsx', 'txt'):
                        fh = open(txt_file, 'r')
                        reader = csv.reader(fh, delimiter = '\t')
                        for row in reader:
                            ws.append(row)
                            wb.save(xl_abs_file)

These 2 Users Gave Thanks to durden_tyler For This Post:
# 3  
Old 06-02-2017
Thank you so much. I did try the code edits but it still does not write the text output in the .xlsx worksheet desired.

Also, I get a warning msg (which I think can be ignored?) My excel and text files are named similarly with its respective extensions

Code:
/usr/local/lib/python2.7/site-packages/openpyxl/reader/worksheet.py:310: UserWarning: Unknown extension is not supported and will be removed
  warn(msg)


Last edited by nans; 06-02-2017 at 06:15 PM..
# 4  
Old 06-03-2017
If you do a Google search for the string: "openpyxl UserWarning: Unknown extension is not supported and will be removed", you will see this issue being reported by a lot of users.
In the first search result from StackOverflow, a very good explanation is given about this warning. The extension that they talk about is not a file extension, but "features" added to the Excel specification that openpyxl does not handle.

Keep in mind that openpyxl is an open-source module and is still under development.
And it needs to keep pace with newer extensions in newer versions of Excel specification.

The more concerning fact is that Python did not write anything to your Excel file.
(a) Could you post all versions that you are using?
(b) Also, could you add print statements to track the execution and determine where exactly it fails?
Does it find the Excel files and text files and csv-read the text files and then fail at wb.append() or wb.save()? Does it fail before that?

I tried the posted code in two different laptops with following software versions:

Code:
Laptop 1:
Pythonxy installation on Windows 7 64-bit (http://python-xy.github.io/)

1)  Python 2.7.10
2)  openpyxl - don't know the version
3)  Excel - Microsoft Excel 2013

Laptop 2:
Anaconda installation on Debian 8 64-bit (https://docs.continuum.io/anaconda/install-linux.html)

1)  Python 3.6.0 on Anaconda 4.3.1 (64-bit)
2)  openpyxl version 2.4.1
3)  Libreoffice Calc 4.3.3.2

In both cases, I created fresh new text and Excel/Libreoffice files with dummy data in text files and empty worksheets named "Filtered".
The Libreoffice file was saved as "Microsoft Excel 2007/2010/2013 XML (.xlsx)"

I did not see any warning in either case.
The data was written successfully to the "Filtered" worksheet in both cases.

---------- Post updated at 09:58 AM ---------- Previous update was at 09:42 AM ----------

Just remembered one more thing.
Ensure that all text and Excel files are closed when you run your Python script.
In Laptop 1, I remember seeing an error message since Microsoft Excel locks all open files and hence Python cannot write to it.
(This file-locking is a very well-known fact which you probably know, but I thought of mentioning it in any case.)
# 5  
Old 06-05-2017
Hello,
Sorry for the delayed response. yes, all files are closed before running the commands.

These are the versions I am using

1) Python 2.7.6
[GCC 4.8.2] on linux/Ubuntu 3.13.0-24-generic 64 bit
2) openpyxl 2.5.0a1
3) Libre office version 4.2.4.2

*Update* Now when I run the code, it works! I replaced the following

Code:
 
fh = open(txt_file, 'r')

to
Code:
 
with open (os.path.join(txt_root, txt_file)) as fh:

Just another quick question. The original excel files have some images/fonts/styling but using openpyxl and saving the *new* excel file make all the images and styling go away. Is there a way around to keep the original excel file intact ?

Many thanks for your support.

Last edited by nans; 06-05-2017 at 07:18 AM.. Reason: suggested code works
# 6  
Old 06-05-2017
I am unable to replicate your issue.
Can you post a minimal Excel file, minus all personal/sensitive information, on which you face this issue?
# 7  
Old 06-06-2017
I have uploaded the 1)xlsx file 2) text file 3)script. So the images are only the first two worksheets. After the script has run, the images disappear.

Thank you

Last edited by nans; 06-07-2017 at 04:35 AM..
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Read loop from two files using descriptors

What I would like to do is read each line in the atdinfile: A sample atdinfile would look like this: 651 652 653 654 655 656 657 658 659 660 661 664 665 666 667 668 (5 Replies)
Discussion started by: woodson2
5 Replies

2. Shell Programming and Scripting

Read text between regexps and write into files based on a field in the text

Hi, I have a huge file that has data something like shown below: huge_file.txt start regexp Name=Name1 Title=Analyst Address=Address1 Department=Finance end regexp some text some text start regexp Name=Name2 Title=Controller Address=Address2 Department=Finance end regexp (7 Replies)
Discussion started by: r3d3
7 Replies

3. Shell Programming and Scripting

Read n lines from a text files getting n from within the text file

I dont even have a sample script cause I dont know where to start from. My data lookes like this > sat#16 #data: 15 site:UNZA baseline: 205.9151 0.008 -165.2465 35.8109 40.6685 21.9148 121.1446 26.4629 -18.4976 33.8722 0.017 -165.2243 48.2201 40.6908 ... (8 Replies)
Discussion started by: malandisa
8 Replies

4. Shell Programming and Scripting

For loop inside awk to read and print contents of files

Hello, I have a set of files Xfile0001 - Xfile0021, and the content of this files (one at a time) needs to be printed between some line (lines start with word "Generated") that I am extracting from another file called file7.txt and all the output goes into output.txt. First I tried creating a for... (5 Replies)
Discussion started by: jaldo0805
5 Replies

5. Shell Programming and Scripting

Perl :Is it possible to read the excel 2007 sheet on unix machine using spredsheet::xlsx module

I have an Excel 2007 excel sheet on windows machine and using Spreadsheet::XLSX I had written a script to read the excel sheet and was successful. My requirement is I need to generate another excel sheet from the old excel 2007 sheet on unix machine. Now is it possible to read the excel... (2 Replies)
Discussion started by: giridhar276
2 Replies

6. UNIX for Advanced & Expert Users

How to read a text file and assign the values in the same to a variable in loop

Hi, I have a text file with multiple lines, each having data in the below format <DOB>,<ADDRESS> I have to write a script which reads each line in the text file in loop, assign the values to these variables and do some further processing in it. Using the following code prints the values... (12 Replies)
Discussion started by: manishab00
12 Replies

7. Fedora

How to read a text file and assign the values in the same to a variable in loop

Hi, I have a text file with multiple lines, each having data in the below format <DOB>,<ADDRESS> I have to write a script which reads each line in the text file in loop, assign the values to these variables and do some further processing in it. Using the following code prints the... (1 Reply)
Discussion started by: manishab00
1 Replies

8. Shell Programming and Scripting

Text to xlsx conversion

Hi Guys, The code snippet below is working fine with xls 2003 but the same is not working in xlsx 2007. Can someone please help. #Convert .txt file format to .xls format awk 'BEGIN{ FS="|" cols=8 print "<HTML><style type=\"text/css\">td{mso-number-format:\"\@\"}</style><BODY><TABLE... (1 Reply)
Discussion started by: icefish
1 Replies

9. Shell Programming and Scripting

Read and edit multiple files using a while loop

Hi all, I would like to simply read a file which lists a number of pathnames and files, then search and replace key strings using a few vi commands: :1,$s/search_str/replace_str/g<return> but I am not sure how to automate the <return> of these vis commands when I am putting this in a... (8 Replies)
Discussion started by: cyberfrog
8 Replies

10. Shell Programming and Scripting

help with a 'while read' loop to change the names of files

Hi, I am new to scripting, so any help on this would be much appreciated. I am trying to rename a bunch of files, taking the names sequentially from a list read in another file... # ls oldnames file_1 file_2 file_3 # cat names red yellow green I want the files to take on the... (6 Replies)
Discussion started by: starsky
6 Replies
Login or Register to Ask a Question